Online Bitrate ladder prediction for Adaptive VVC Streaming
Vignesh V Menon
Postdoctoral Researcher
Fraunhofer HHI
#SEGMENTS2024
CONTENT
Organized by
FEBRUARY 14, 2024 | DENVER MARRIOTT TECH CENTER
Introduction to per-title encoding
[1] T. Stockhammer, “Dynamic Adaptive Streaming over HTTP: Standards and Design Principles,” in Proceedings of the Second Annual ACM Conference on Multimedia Systems, 2011, pp. 133–144.
Context: HTTP Adaptive Streaming (HAS)
HAS divides the video content into segments and encodes each segment at several representations, which are
stored on plain HTTP servers. Video delivery is continuously adapted to the network conditions and device
capabilities of the client [1].
Figure: HAS [1] concept.
[2] B. Bross, Y. Wang, Y. Ye, S. Liu, J. Chen, G. Sullivan, and J. Ohm, “Overview of the Versatile Video Coding (VVC) Standard and its Applications,” in IEEE Transactions on
Circuits and Systems for Video Technology, vol. 31, 2021, pp. 3736–3764, doi: 10.1109/TCSVT.2021.3101953.
[3] R. Kaafarani et al., “Evaluation Of Bitrate Ladders For Versatile Video Coder,” in 2021 International Conference on Visual Communications and Image Processing (VCIP), 2021,
pp. 1–5.
[4] A. Aaron, Z. Li, M. Manohara, J. De Cock, and D. Ronca, “Per-title encode optimization,” The Netflix Techblog, 2015.
Motivation for online convex-hull estimation
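● The convex hull is the set of encoding points (bitrate-resolution pairs) that achieve
“Pareto efficiency”: no other point offers higher quality at the same or a lower bitrate.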
● Online convex-hull estimation methods provide a dynamic
and adaptive means to optimize bitrate and resolution
selections.
● By adjusting the bitrate-resolution pairs in response to
the video content complexity and the coding algorithm,
these methods aim to balance computational cost against
visual fidelity, which matters more as advanced codecs
like VVC increase encoding complexity [2,3].
Figure: Conceptual plot to depict the bitrate-quality relationship for
any video source encoded at different resolutions. Source: [4]
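As an illustration of the convex-hull idea, the following minimal sketch filters a set of (bitrate, quality, resolution) encoding points down to the Pareto-efficient ones and then to their upper convex hull. The sample values and the function names are hypothetical and are not taken from the talk or from [4].

```python
from typing import List, Tuple

# (bitrate_kbps, quality_db, resolution_height); all values below are made up
Point = Tuple[float, float, int]

def pareto_front(points: List[Point]) -> List[Point]:
    """Keep points that no other point beats with equal-or-lower bitrate and higher quality."""
    front: List[Point] = []
    best_quality = float("-inf")
    # Sort by bitrate ascending; for equal bitrates, best quality first
    for p in sorted(points, key=lambda p: (p[0], -p[1])):
        if p[1] > best_quality:
            front.append(p)
            best_quality = p[1]
    return front

def upper_convex_hull(points: List[Point]) -> List[Point]:
    """Upper convex hull of the Pareto front in the bitrate-quality plane."""
    hull: List[Point] = []
    for p in pareto_front(points):
        # Drop the last hull point while it lies on or below the chord hull[-2] -> p
        while len(hull) >= 2:
            (x1, y1, _), (x2, y2, _) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

points: List[Point] = [
    (500, 30.0, 540), (1000, 33.0, 540), (2000, 34.5, 540), (4000, 35.0, 540),
    (1000, 31.0, 1080), (2000, 35.5, 1080), (4000, 38.0, 1080), (8000, 39.0, 1080),
    (2000, 33.0, 2160), (4000, 37.0, 2160), (8000, 41.0, 2160), (16000, 43.0, 2160),
]
hull = upper_convex_hull(points)
```

Reading the resolution of each hull point from low to high bitrate gives the resolution switch points of a per-title ladder for this hypothetical content.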
Figure: Rate-distortion (RD) and rate-encoding time curves of representative sequences (segments) of Inter-4K
dataset encoded at 540p, 1080p and 2160p resolutions using VVenC at faster preset. Here, XPSNR is used as
the quality metric.
Quality versus encoding time trade-off
● Dynamic resolution encoding is the most
extensively studied per-title encoding
scheme in adaptive streaming
applications, focusing on adjusting
encoding resolutions dynamically to
optimize video quality [5, 6, 7, 8].
● Encoding time depends on the encoding
resolution chosen for the video content.
The number of pixels in each frame
significantly impacts the computational
workload.
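A quick calculation shows how strongly the pixel count differs between the three resolutions in the figure. The frame widths assume a 16:9 aspect ratio, which the slides do not state, and pixel count is only a first-order proxy: actual encoding time also depends on the content and the encoder preset.

```python
# Assumed 16:9 frame sizes (width, height); the widths are an assumption
RESOLUTIONS = {540: (960, 540), 1080: (1920, 1080), 2160: (3840, 2160)}

def pixels_per_frame(height: int) -> int:
    """Number of luma samples in one frame at the given resolution height."""
    width, frame_height = RESOLUTIONS[height]
    return width * frame_height

def relative_pixel_load(height: int, reference: int = 2160) -> float:
    """Pixel count relative to the reference resolution (2160p by default)."""
    return pixels_per_frame(height) / pixels_per_frame(reference)

for h in sorted(RESOLUTIONS):
    print(h, pixels_per_frame(h), relative_pixel_load(h))
```

Under these assumptions a 540p frame carries 1/16 and a 1080p frame 1/4 of the pixels of a 2160p frame.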
[5] J. De Cock, Z. Li, M. Manohara, and A. Aaron, “Complexity-based consistent-quality encoding in the cloud,” in 2016 IEEE International Conference on Image Processing (ICIP), 2016,
pp. 1484–1488.
[6] A. Katsenou, J. Sole, and D. R. Bull, “Content-gnostic Bitrate Ladder Prediction for Adaptive Video Streaming,” in 2019 Picture Coding Symposium (PCS), 2019, pp. 1–5.
[7] M. Bhat, J. Thiesse, and P. Le Callet, “Combining Video Quality Metrics To Select Perceptually Accurate Resolution In A Wide Quality Range: A Case Study,” in 2021 IEEE
International Conference on Image Processing (ICIP), 2021, pp. 2164–2168.
[8] A. Zabrovskiy, P. Agrawal, C. Timmerer, and R. Prodan, “FAUST: Fast Per-Scene Encoding Using Entropy-Based Scene Detection and Machine Learning,” in 2021 30th Conference
of Open Innovations Association FRUCT, 2021, pp. 292–302.
Reducing encoding time “greenifies” streaming
● Reducing encoding energy consumption (in data centers)
is critical in streaming applications since it contributes to
environmental sustainability [12].
● The streaming industry can reduce its carbon footprint and
energy consumption by minimizing encoding time.
● Our prior experiments suggest a pseudo-linear relationship
between encoding time and encoding energy
consumption.
Figure: Average encoding metrics for 7.5 fps, 15 fps, 24 fps, and 30 fps HLS CBR
encoding using the veryslow preset of the x264 [9] AVC [10] encoder. Source: [11]
Figure: Average encoding metrics for HLS CBR encoding at 30 fps using selected presets
of x264 [9] AVC [10] encoder. Source: [11]
[9] VideoLAN, “x264.” [Online]. Available: https://www.videolan.org/developers/x264.html
[10] T. Wiegand, G. J. Sullivan, G. Bjontegaard, and A. Luthra, “Overview of the H.264/AVC video coding standard,” in IEEE Transactions on Circuits and Systems for Video
Technology, vol. 13, no. 7, 2003, pp. 560–576.
[11] V. V. Menon, S. Afzal, P. T. Rajendran, K. Schoeffmann, R. Prodan, and C. Timmerer, “Content-Adaptive Variable Framerate Encoding
Scheme for Green Live Streaming.” [Online]. Available: http://arxiv.org/abs/2311.08074
[12] A. Stephens, C. Tremlett-Williams, L. Fitzpatrick, L. Acerini, M. Anderson, and N. Crabbendam, “The Carbon Impacts of Video Streaming,” Jun. 2021.
XPSNR or VMAF in VVC?
[13] C. R. Helmrich et al., “Information on and analysis of the VVC encoders in the SDR UHD verification test,” in WG 05 MPEG Joint Video Coding Team(s) with ITU-T SG 16,
document JVET-T0103, Oct. 2020.
[14] M. Wien and V. Baroncini, “Report on VVC compression performance verification testing in the SDR UHD Random Access Category,” in WG 05 MPEG Joint Video Coding
Team(s) with ITU-T SG 16, document JVET-T0097, Oct. 2020.
[15] C. R. Helmrich et al., “A study of the extended perceptually weighted peak signal-to-noise ratio (XPSNR) for video compression with different resolutions and bit depths,” in ITU
Journal: ICT Discoveries, vol. 3, May 2020. [Online] http://handle.itu.int/11.1002/pub/8153d78b-en
● Traditional convex-hull estimation methods use Video
Multimethod Assessment Fusion (VMAF) as the perceptual
quality metric.
● However, as observed in [13], Peak Signal-to-Noise Ratio
(PSNR) and VMAF fail to model the subjective
quality of VVC-coded bitstreams accurately.
● In contrast, the Structural Similarity Index (SSIM), the
Multi-Scale Structural Similarity Index (MS-SSIM), and the
eXtended Peak Signal-to-Noise Ratio (XPSNR) can predict
the subjective codec ranking reported in [14] with
acceptable accuracy [15].
Table: Evaluation results for Spearman rank order correlation
with MOS values. Source: [15]
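For readers unfamiliar with the measure in the table, the sketch below computes a Spearman rank order correlation between objective scores and MOS values as the Pearson correlation of their ranks. The scores are invented for illustration and are not the values reported in [15].

```python
from typing import List, Sequence

def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical metric scores (dB) and MOS values for five encodes
metric_scores = [38.1, 40.3, 41.0, 43.5, 44.2]
mos_values = [2.1, 3.0, 2.9, 4.2, 4.6]
print(spearman(metric_scores, mos_values))
```

A value close to 1 means the metric orders the encodes almost exactly as the viewers did, regardless of the numeric scale of either score.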
Online Convex-hull using XPSNR
Online Convex-hull using XPSNR (VEXUS)
Spatiotemporal complexity feature extraction
We use seven DCT-energy-based features extracted using the Video Complexity Analyzer (VCA) [16]:
● average luma texture energy (EY),
● average gradient of the luma texture energy (h),
● average luma brightness (LY),
● average chroma texture energy of the U and V channels (EU and EV),
● average chroma brightness of the U and V channels (LU and LV) [17].
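To give a feel for what a DCT-energy-based feature measures, the sketch below computes a plain sum of absolute AC coefficients for one block as a texture proxy and the DC coefficient as a brightness proxy. This is a simplified stand-in and an assumption: it does not reproduce the coefficient weighting or block size that VCA actually uses, and the function names are hypothetical.

```python
import math
from typing import List

Block = List[List[float]]

def dct2(block: Block) -> Block:
    """Orthonormal 2D DCT-II of a square block (direct form, fine for small blocks)."""
    n = len(block)

    def alpha(k: int) -> float:
        return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out

def ac_energy(block: Block) -> float:
    """Sum of absolute AC coefficients: a simplified texture-energy proxy."""
    coeffs = dct2(block)
    n = len(coeffs)
    return sum(abs(coeffs[u][v]) for u in range(n) for v in range(n) if (u, v) != (0, 0))

def dc_brightness(block: Block) -> float:
    """DC coefficient: a simplified brightness proxy."""
    return dct2(block)[0][0]

flat = [[100.0] * 4 for _ in range(4)]
checker = [[255.0 if (x + y) % 2 else 0.0 for y in range(4)] for x in range(4)]
print(ac_energy(flat), ac_energy(checker))
```

A flat block has essentially no AC energy, while the checkerboard block has a large one, which is the intuition behind using such features as a complexity measure.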
[16] V. V. Menon, C. Feldmann, K. Schoeffmann, M. Ghanbari, and C. Timmerer, “Green Video Complexity Analysis for Efficient Encoding in Adaptive Video Streaming,” in First
International ACM Green Multimedia Systems Workshop (GMSys ’23), 2023.
[17] V. V. Menon, P. T. Rajendran, C. Feldmann, K. Schoeffmann, M. Ghanbari and C. Timmerer, "JND-Aware Two-Pass Per-Title Encoding Scheme for Adaptive Live Streaming," in IEEE
Transactions on Circuits and Systems for Video Technology, vol. 34, no. 2, pp. 1281-1294, Feb. 2024, doi: 10.1109/TCSVT.2023.3290725.
XPSNR-optimized resolution prediction
The objective of selecting the optimized resolution based on bitrate and video complexity features is
decomposed into two parts:
Modeling:
The XPSNR of a video scene encoded at resolution r and bitrate b, i.e., x(r,b), is modeled as a function of the video
complexity features, b, and the normalized resolution height r' = r/2160.
Resolution optimization:
Select the resolution that maximizes the predicted XPSNR.
We trained a hyperparameter-tuned XGBoost [18] model to predict XPSNR.
● Selected hyperparameters: max_depth=10 and n_estimators=400
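The resolution optimization step can be sketched as an argmax over the candidate resolutions. In the sketch below, toy_predictor is a hypothetical stand-in for the trained model (for example an xgboost.XGBRegressor configured with max_depth=10 and n_estimators=400); its formula, the feature value, and the candidate resolution set are assumptions made only so that the example runs on its own.

```python
from typing import Callable, Dict, Sequence

Features = Dict[str, float]
# predictor(features, normalized_height, bitrate_kbps) -> predicted XPSNR in dB
XpsnrPredictor = Callable[[Features, float, float], float]

def select_resolution(features: Features,
                      bitrate_kbps: float,
                      resolutions: Sequence[int],
                      predict_xpsnr: XpsnrPredictor) -> int:
    """Return the resolution height with the highest predicted XPSNR at this bitrate."""
    return max(resolutions,
               key=lambda r: predict_xpsnr(features, r / 2160, bitrate_kbps))

def toy_predictor(features: Features, r_norm: float, bitrate_kbps: float) -> float:
    """Hypothetical model: compression loss grows with resolution, upscaling loss shrinks."""
    compression_loss = features["E_Y"] * 100.0 * r_norm / bitrate_kbps
    upscaling_loss = 4.0 * (1.0 - r_norm) ** 2
    return 45.0 - compression_loss - upscaling_loss

features = {"E_Y": 50.0}          # made-up texture energy
candidates = [540, 1080, 2160]    # assumed candidate resolution heights
for b in (500, 1500, 8000):
    print(b, select_resolution(features, b, candidates, toy_predictor))
```

With a real model, predict_xpsnr would wrap the regressor's predict call on a feature vector built from the seven complexity features, r', and b.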
[18] T. Chen and C. Guestrin, “XGBoost: A Scalable Tree Boosting System,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,
Aug. 2016, pp. 785–794.
Optimized QP prediction
Modeling:
The optimized QP is modeled as a function of the spatiotemporal features, the target bitrate b, and the normalized
resolution height r' = r/2160.
● For applications such as streaming, avoiding exceeding the maximum bitrates specified in the
HLS/DASH manifests [19, 20] during the encoding process is essential.
● Failure to adhere to these limits can lead to buffer overflows or underflows in video players.
[19] I. Sodagar, “The MPEG-DASH Standard for Multimedia Streaming Over the Internet,” IEEE MultiMedia, vol. 18, no. 4, pp. 62–67, 2011.
[20] A. Bentaleb, B. Taani, A. C. Begen, C. Timmerer, and R. Zimmermann, “A Survey on Bitrate Adaptation Schemes for Streaming Media Over HTTP,” IEEE
Communications Surveys & Tutorials, vol. 21, no. 1, pp. 562–585, 2019.
QP optimization:
The optimization function aims to predict the QP, minimizing the discrepancy between the predicted and
target bitrate for a given resolution.
Optimized QP estimation
Cascaded approach
● This method involves training distinct XGBoost regression models for the minimum and maximum QP
values (qmin and qmax, respectively).
● The optimized QP for a target bitrate b is determined using linear regression, as follows:
● The equation captures the non-linear relationship between bitrate
and QP by employing a logarithmic mapping of the bitrate values.
● VVenC introduced capped VBR rate control in its January 2024
release [21]. The QP is specified using the qp option, while the
maxrate (easy mode) or MaxBitrate (expert mode) option specifies
the upper bound of the bitrate.
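The exact regression equation is not reproduced in the slide text. One plausible reading, offered here as an assumption rather than as the authors' formula, is a linear interpolation of the QP in log-bitrate between two anchor points: the bitrate reached at qmin and the bitrate reached at qmax. The anchor values in the example are hypothetical.

```python
import math

def interpolate_qp(target_bitrate: float,
                   bitrate_at_qmin: float,
                   bitrate_at_qmax: float,
                   q_min: int,
                   q_max: int) -> int:
    """Linearly interpolate the QP in log-bitrate between two anchor points.

    Assumed form, not taken from the slides: q_min pairs with the highest
    bitrate and q_max with the lowest bitrate.
    """
    if target_bitrate <= 0:
        raise ValueError("target_bitrate must be positive")
    if not 0 < bitrate_at_qmax < bitrate_at_qmin:
        raise ValueError("expected 0 < bitrate_at_qmax < bitrate_at_qmin")
    # Position of the target between the high-bitrate and low-bitrate anchors
    t = ((math.log(bitrate_at_qmin) - math.log(target_bitrate))
         / (math.log(bitrate_at_qmin) - math.log(bitrate_at_qmax)))
    t = min(1.0, max(0.0, t))  # clamp targets outside the anchor range
    return round(q_min + t * (q_max - q_min))

# Hypothetical anchors: 16000 kbps at QP 20 and 250 kbps at QP 50
for b in (16000, 2000, 250):
    print(b, interpolate_qp(b, 16000, 250, 20, 50))
```

Clamping keeps the result inside [q_min, q_max] when the target bitrate falls outside the range spanned by the two anchors; that policy is also an assumption.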
Figure: QP versus normalized bitrate (in log
scale) for a representative video segment.
[21] C. Helmrich, V. George, V. V. Menon, A. Wieckowski, B. Bross, and D. Marpe, “Fast constant-quality video encoding using VVenC with rate capping based on pre-analysis
statistics”, 2024.
Experimental design
Dataset generation
Figure: Calculation of the ground-truth PSNR, XPSNR, and bitrate to train the prediction
models. This example shows the ground-truth calculation for a video encoded at 1080p with QP 30.
● We used 1000 videos of the Inter-4K dataset [22] to
validate the performance of the encoding methods.
● We encoded the sequences at UHD (2160p) and 60 fps
using VVenC v1.10 [23] with preset 0 (faster).
● We extracted the spatiotemporal features using
VCA v2.0.
● We ran constant quality encoding by varying qp
values from qmin to qmax for each resolution.
● We computed full-reference PSNR and XPSNR
quality metrics after the compressed video was
upscaled to the original resolution (2160p).
[22] A. Stergiou and R. Poppe, “AdaPool: Exponential Adaptive Pooling for Information-Retaining Downsampling,” in IEEE Transactions on Image Processing, vol. 32, 2023, pp.
251–266.
[23] A. Więckowski, J. Brandenburg, T. Hinz, C. Bartnik, V. George, G. Hege, C. Helmrich, A. Henkel, C. Lehmann, C. Stoffers, I. Zupancic, B. Bross, and D. Marpe, “VVenC: An
Open And Optimized VVC Encoder Implementation,” in Proc. IEEE International Conference on Multimedia Expo Workshops (ICMEW), pp. 1–2.
Benchmarks
1. Default: This method employs fixed-resolution encoding, i.e., all
bitstreams are encoded at the same resolution as the input video.
2. FixedLadder: This method employs a fixed set of bitrate-resolution
pairs. We use the HLS bitrate ladder specified in the Apple authoring
specifications [24] as the fixed set of bitrate-resolution pairs.
3. Bruteforce: This method determines the optimized resolution, i.e., the
one that yields the highest XPSNR for a given target bitrate, after an
exhaustive encoding process at all supported resolutions and QPs.
Table: An example fixed bitrate ladder, i.e., a set of
bitrate-resolution pairs. Source: [24].
[24] Apple Inc., “HLS Authoring Specification for Apple Devices.” [Online]. Available:
https://developer.apple.com/documentation/http-live-streaming/hls-authoring-specification-for-apple-devices
Table: Experimental parameters used for evaluation.
Evaluation results
Speed and accuracy
● Speed of feature extraction: 176 fps
● XPSNR prediction
○ MAE: 0.17 dB, R²: 0.99
○ Std. dev.: 0.22 dB
● QP prediction
○ MAE: 1.32, R²: 0.97
○ Std. dev.: 1.96
Figure: Prediction results of XPSNR and QP prediction models.
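For reference, the two accuracy measures reported above (MAE and R²) can be computed as in the following sketch. The ground-truth and predicted values are invented for illustration and are unrelated to the reported results.

```python
from typing import Sequence

def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between ground truth and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical XPSNR values in dB
xpsnr_true = [30.0, 32.0, 34.0, 36.0]
xpsnr_pred = [30.5, 31.5, 34.5, 35.5]
print(mean_absolute_error(xpsnr_true, xpsnr_pred), r2_score(xpsnr_true, xpsnr_pred))
```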
Rate-distortion results
The RD curve of VEXUS closely mirrors that of the Bruteforce method, indicating the effectiveness of its predictive
modeling in approximating optimized resolutions and QPs.
Figure: RD curves of representative video sequences using Default (green line), FixedLadder (blue line), Bruteforce (black line), and VEXUS (red line) encodings.
Encoding and decoding times
● Encoding and decoding times are reduced for lower bitrates, as encoding and decoding operations
become less complex due to lower resolutions.
Figure: Encoding and decoding times of representative video sequences using Default (green line), FixedLadder (blue line), Bruteforce (black line), and VEXUS (red line) encodings.
Result summary
● Coding efficiency (in terms of Bjøntegaard Delta [25] rates), encoding time, and decoding time all decrease
as the maximum supported resolution rmax decreases.
● The trade-off between quality and coding efficiency is based on the target audience, delivery platform,
and available resources.
[25] HSTP-VID-WPOM, “Working practices using objective metrics for evaluation of video coding efficiency experiments,” International Telecommunication Union, 2020. [Online].
Available: http://handle.itu.int/11.1002/pub/8160e8da-en
Table: Average results of the encoding schemes compared to Default encoding.
Encoding time constrained convex-hull estimation
Resolution optimization:
● Select the resolution that maximizes the predicted XPSNR, subject to a maximum encoding time [26].
● Encoding time is predicted using the same approach as the QP prediction.
● The encoding time constraint is effectively a constraint on the encoding energy.
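The constrained selection can be sketched as a filter followed by an argmax. The candidate predictions below are hypothetical, and falling back to the fastest candidate when nothing meets the limit is an assumption of this sketch, not a rule stated in the talk.

```python
from typing import List, NamedTuple

class Candidate(NamedTuple):
    resolution: int          # resolution height in pixels
    predicted_xpsnr: float   # dB
    predicted_time_s: float  # predicted encoding time in seconds

def select_time_constrained(candidates: List[Candidate], max_time_s: float) -> int:
    """Pick the highest predicted XPSNR among candidates within the time limit."""
    feasible = [c for c in candidates if c.predicted_time_s <= max_time_s]
    if not feasible:
        # Assumed fallback: take the fastest candidate if none meets the limit
        return min(candidates, key=lambda c: c.predicted_time_s).resolution
    return max(feasible, key=lambda c: c.predicted_xpsnr).resolution

# Hypothetical predictions for one segment at one target bitrate
segment = [
    Candidate(540, 38.2, 4.0),
    Candidate(1080, 40.1, 15.0),
    Candidate(2160, 41.3, 60.0),
]
for limit in (100.0, 20.0, 1.0):
    print(limit, select_time_constrained(segment, limit))
```

Tightening the limit pushes the selection toward lower resolutions, which trades some quality for a shorter encode.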
[26] V. V. Menon, A. Premkumar, P. T. Rajendran, A. Więckowski, B. Bross, C. Timmerer, and D. Marpe, "Energy-efficient Adaptive Video Streaming with Latency-Aware Dynamic
Resolution Encoding." 2024 Mile High Video (MHV), doi: 10.1145/3638036.3640801.
Figure: RD curves, and encoding times of representative video sequences (segments) using VEXUS (𝑟max = 2160).
Source: [27]
Results
RD performance decreases as the encoding time constraint is tightened.
Figure: Selected encoding resolutions of representative
video sequences (segments) using VEXUS (𝑟max = 2160).
Source: [27]
[27] A. Premkumar, P. T. Rajendran, V. V. Menon, A. Więckowski, B. Bross, and D. Marpe, "Quality-Aware Dynamic Resolution Adaptation Framework for Adaptive Video
Streaming,” 2024 [Online]. Available: https://github.com/PhoenixVideo/QADRA
Summary
● XPSNR demonstrates a better correlation with subjective quality scores than PSNR and VMAF for VVC-coded UHD content.
● Leveraging this insight, we introduced an approach where XPSNR is predicted for VVC-coded
bitstreams using spatiotemporal complexity features of the video and the target encoding
configuration.
● We proposed VEXUS, where the convex-hull is estimated online using the predicted XPSNR.
● On average, VEXUS yields an improvement of 5.84 dB in PSNR and 0.62 dB in XPSNR at
the same bitrates compared to conventional UHD encoding with the VVenC encoder, along with a
44.43% reduction in overall encoding time and a 65.46% reduction in overall decoding time using
the VTM decoder.
● We also discussed introducing an encoding time constraint in the convex-hull estimation process, and
analyzed its impact on RD performance.
Reproducibility
Open-source tools
1. VVC encoder: Fraunhofer Versatile Video Encoder (VVenC) v1.10
Available: https://github.com/fraunhoferhhi/vvenc
2. VVC decoder: VTM reference decoder v22.0
Available: https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftware_VTM
3. Spatiotemporal feature extractor: Video Complexity Analyzer (VCA) v2.0
Available: https://github.com/cd-athena/VCA
4. Convex-hull estimation framework:
Available: https://github.com/PhoenixVideo/QADRA
Questions?
Get in touch.
Vignesh V Menon
vignesh.menon@hhi.fraunhofer.de
Thank you!
#SEGMENTS2024

More Related Content

Similar to Online Bitrate ladder prediction for Adaptive VVC Streaming

Machine Learning Based Video Coding Enhancements for HTTP Adaptive Streaming
Machine Learning Based Video Coding Enhancements for HTTP Adaptive StreamingMachine Learning Based Video Coding Enhancements for HTTP Adaptive Streaming
Machine Learning Based Video Coding Enhancements for HTTP Adaptive StreamingAlpen-Adria-Universität
 
HTTP Adaptive Streaming – Where Is It Heading?
HTTP Adaptive Streaming – Where Is It Heading?HTTP Adaptive Streaming – Where Is It Heading?
HTTP Adaptive Streaming – Where Is It Heading?Alpen-Adria-Universität
 
Optimal coding unit decision for early termination in high efficiency video c...
Optimal coding unit decision for early termination in high efficiency video c...Optimal coding unit decision for early termination in high efficiency video c...
Optimal coding unit decision for early termination in high efficiency video c...IJECEIAES
 
Content_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdf
Content_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdfContent_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdf
Content_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdfVignesh V Menon
 
Content-adaptive Video Coding for HTTP Adaptive Streaming
Content-adaptive Video Coding for HTTP Adaptive StreamingContent-adaptive Video Coding for HTTP Adaptive Streaming
Content-adaptive Video Coding for HTTP Adaptive StreamingAlpen-Adria-Universität
 
Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil...
Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil...Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil...
Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil...Ijripublishers Ijri
 
An Overview on Multimedia Transcoding Techniques on Streaming Digital Contents
An Overview on Multimedia Transcoding Techniques on Streaming Digital ContentsAn Overview on Multimedia Transcoding Techniques on Streaming Digital Contents
An Overview on Multimedia Transcoding Techniques on Streaming Digital Contentsidescitation
 
Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil...
Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil...Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil...
Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil...Ijripublishers Ijri
 
Analyzing Video Streaming Quality by Using Various Error Correction Methods o...
Analyzing Video Streaming Quality by Using Various Error Correction Methods o...Analyzing Video Streaming Quality by Using Various Error Correction Methods o...
Analyzing Video Streaming Quality by Using Various Error Correction Methods o...IJERA Editor
 
Machine learning-based energy consumption modeling and comparing of H.264 and...
Machine learning-based energy consumption modeling and comparing of H.264 and...Machine learning-based energy consumption modeling and comparing of H.264 and...
Machine learning-based energy consumption modeling and comparing of H.264 and...IJECEIAES
 
Multi-View Video Coding Algorithms/Techniques: A Comprehensive Study
Multi-View Video Coding Algorithms/Techniques: A Comprehensive StudyMulti-View Video Coding Algorithms/Techniques: A Comprehensive Study
Multi-View Video Coding Algorithms/Techniques: A Comprehensive StudyIJERA Editor
 
H2B2VS (HEVC hybrid broadcast broadband video services) – Building innovative...
H2B2VS (HEVC hybrid broadcast broadband video services) – Building innovative...H2B2VS (HEVC hybrid broadcast broadband video services) – Building innovative...
H2B2VS (HEVC hybrid broadcast broadband video services) – Building innovative...Raoul Monnier
 
HTTP Adaptive Streaming – Quo Vadis? (2023)
HTTP Adaptive Streaming – Quo Vadis? (2023)HTTP Adaptive Streaming – Quo Vadis? (2023)
HTTP Adaptive Streaming – Quo Vadis? (2023)Alpen-Adria-Universität
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentIJERD Editor
 
Video Compression Algorithm Based on Frame Difference Approaches
Video Compression Algorithm Based on Frame Difference Approaches Video Compression Algorithm Based on Frame Difference Approaches
Video Compression Algorithm Based on Frame Difference Approaches ijsc
 

Similar to Online Bitrate ladder prediction for Adaptive VVC Streaming (20)

Machine Learning Based Video Coding Enhancements for HTTP Adaptive Streaming
Machine Learning Based Video Coding Enhancements for HTTP Adaptive StreamingMachine Learning Based Video Coding Enhancements for HTTP Adaptive Streaming
Machine Learning Based Video Coding Enhancements for HTTP Adaptive Streaming
 
HTTP Adaptive Streaming – Where Is It Heading?
HTTP Adaptive Streaming – Where Is It Heading?HTTP Adaptive Streaming – Where Is It Heading?
HTTP Adaptive Streaming – Where Is It Heading?
 
HTTP Adaptive Streaming – Quo Vadis?
HTTP Adaptive Streaming – Quo Vadis?HTTP Adaptive Streaming – Quo Vadis?
HTTP Adaptive Streaming – Quo Vadis?
 
40120140504006
4012014050400640120140504006
40120140504006
 
Optimal coding unit decision for early termination in high efficiency video c...
Optimal coding unit decision for early termination in high efficiency video c...Optimal coding unit decision for early termination in high efficiency video c...
Optimal coding unit decision for early termination in high efficiency video c...
 
Content_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdf
Content_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdfContent_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdf
Content_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdf
 
Content-adaptive Video Coding for HTTP Adaptive Streaming
Content-adaptive Video Coding for HTTP Adaptive StreamingContent-adaptive Video Coding for HTTP Adaptive Streaming
Content-adaptive Video Coding for HTTP Adaptive Streaming
 
Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil...
Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil...Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil...
Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil...
 
[IJET-V1I2P1] Authors :Imran Ullah Khan ,Mohd. Javed Khan ,S.Hasan Saeed ,Nup...
[IJET-V1I2P1] Authors :Imran Ullah Khan ,Mohd. Javed Khan ,S.Hasan Saeed ,Nup...[IJET-V1I2P1] Authors :Imran Ullah Khan ,Mohd. Javed Khan ,S.Hasan Saeed ,Nup...
[IJET-V1I2P1] Authors :Imran Ullah Khan ,Mohd. Javed Khan ,S.Hasan Saeed ,Nup...
 
An Overview on Multimedia Transcoding Techniques on Streaming Digital Contents
An Overview on Multimedia Transcoding Techniques on Streaming Digital ContentsAn Overview on Multimedia Transcoding Techniques on Streaming Digital Contents
An Overview on Multimedia Transcoding Techniques on Streaming Digital Contents
 
Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil...
Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil...Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil...
Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil...
 
A04840107
A04840107A04840107
A04840107
 
MMM_MCOM_Live.pdf
MMM_MCOM_Live.pdfMMM_MCOM_Live.pdf
MMM_MCOM_Live.pdf
 
Analyzing Video Streaming Quality by Using Various Error Correction Methods o...
Analyzing Video Streaming Quality by Using Various Error Correction Methods o...Analyzing Video Streaming Quality by Using Various Error Correction Methods o...
Analyzing Video Streaming Quality by Using Various Error Correction Methods o...
 
Machine learning-based energy consumption modeling and comparing of H.264 and...
Machine learning-based energy consumption modeling and comparing of H.264 and...Machine learning-based energy consumption modeling and comparing of H.264 and...
Machine learning-based energy consumption modeling and comparing of H.264 and...
 
Multi-View Video Coding Algorithms/Techniques: A Comprehensive Study
Multi-View Video Coding Algorithms/Techniques: A Comprehensive StudyMulti-View Video Coding Algorithms/Techniques: A Comprehensive Study
Multi-View Video Coding Algorithms/Techniques: A Comprehensive Study
 
H2B2VS (HEVC hybrid broadcast broadband video services) – Building innovative...
H2B2VS (HEVC hybrid broadcast broadband video services) – Building innovative...H2B2VS (HEVC hybrid broadcast broadband video services) – Building innovative...
H2B2VS (HEVC hybrid broadcast broadband video services) – Building innovative...
 
HTTP Adaptive Streaming – Quo Vadis? (2023)
HTTP Adaptive Streaming – Quo Vadis? (2023)HTTP Adaptive Streaming – Quo Vadis? (2023)
HTTP Adaptive Streaming – Quo Vadis? (2023)
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
Video Compression Algorithm Based on Frame Difference Approaches
Video Compression Algorithm Based on Frame Difference Approaches Video Compression Algorithm Based on Frame Difference Approaches
Video Compression Algorithm Based on Frame Difference Approaches
 

More from Vignesh V Menon

Optimal Quality and Efficiency in Adaptive Live Streaming with JND-Aware Low ...
Optimal Quality and Efficiency in Adaptive Live Streaming with JND-Aware Low ...Optimal Quality and Efficiency in Adaptive Live Streaming with JND-Aware Low ...
Optimal Quality and Efficiency in Adaptive Live Streaming with JND-Aware Low ...Vignesh V Menon
 
Gain of Grain: A Film Grain Handling Toolchain for VVC-based Open Implementat...
Gain of Grain: A Film Grain Handling Toolchain for VVC-based Open Implementat...Gain of Grain: A Film Grain Handling Toolchain for VVC-based Open Implementat...
Gain of Grain: A Film Grain Handling Toolchain for VVC-based Open Implementat...Vignesh V Menon
 
Green Variable framerate encoding for Adaptive Live Streaming
Green Variable framerate encoding  for Adaptive Live StreamingGreen Variable framerate encoding  for Adaptive Live Streaming
Green Variable framerate encoding for Adaptive Live StreamingVignesh V Menon
 
Green_VCA_presentation.pdf
Green_VCA_presentation.pdfGreen_VCA_presentation.pdf
Green_VCA_presentation.pdfVignesh V Menon
 
LiveVBR presentation at VQEG NORM.pdf
LiveVBR presentation at VQEG NORM.pdfLiveVBR presentation at VQEG NORM.pdf
LiveVBR presentation at VQEG NORM.pdfVignesh V Menon
 
ETPS_Efficient_Two_pass_Encoding_Scheme_for_Adaptive_Streaming.pdf
ETPS_Efficient_Two_pass_Encoding_Scheme_for_Adaptive_Streaming.pdfETPS_Efficient_Two_pass_Encoding_Scheme_for_Adaptive_Streaming.pdf
ETPS_Efficient_Two_pass_Encoding_Scheme_for_Adaptive_Streaming.pdfVignesh V Menon
 
OPSE_Online Per-Scene Encoding for Adaptive HTTP Live Streaming.pdf
OPSE_Online Per-Scene Encoding for Adaptive HTTP Live Streaming.pdfOPSE_Online Per-Scene Encoding for Adaptive HTTP Live Streaming.pdf
OPSE_Online Per-Scene Encoding for Adaptive HTTP Live Streaming.pdfVignesh V Menon
 
Perceptually-aware Per-title Encoding for Adaptive Video Streaming.pdf
Perceptually-aware Per-title Encoding for Adaptive Video Streaming.pdfPerceptually-aware Per-title Encoding for Adaptive Video Streaming.pdf
Perceptually-aware Per-title Encoding for Adaptive Video Streaming.pdfVignesh V Menon
 
OPTE: Online Per-title Encoding for Live Video Streaming.pdf
OPTE: Online Per-title Encoding for Live Video Streaming.pdfOPTE: Online Per-title Encoding for Live Video Streaming.pdf
OPTE: Online Per-title Encoding for Live Video Streaming.pdfVignesh V Menon
 
Video Complexity Dataset (VCD).pdf
Video Complexity Dataset (VCD).pdfVideo Complexity Dataset (VCD).pdf
Video Complexity Dataset (VCD).pdfVignesh V Menon
 
Live-PSTR: Live Per-Title Encoding for Ultra HD Adaptive Streaming
Live-PSTR: Live Per-Title Encoding for Ultra HD Adaptive StreamingLive-PSTR: Live Per-Title Encoding for Ultra HD Adaptive Streaming
Live-PSTR: Live Per-Title Encoding for Ultra HD Adaptive StreamingVignesh V Menon
 
IEEE MMSP'21: INCEPT: Intra CU Depth Prediction for HEVC
IEEE MMSP'21: INCEPT: Intra CU Depth Prediction for HEVCIEEE MMSP'21: INCEPT: Intra CU Depth Prediction for HEVC
IEEE MMSP'21: INCEPT: Intra CU Depth Prediction for HEVCVignesh V Menon
 
IEEE PCS'21: Efficient multi-encoding for large-scale HTTP Adaptive Streaming...
IEEE PCS'21: Efficient multi-encoding for large-scale HTTP Adaptive Streaming...IEEE PCS'21: Efficient multi-encoding for large-scale HTTP Adaptive Streaming...
IEEE PCS'21: Efficient multi-encoding for large-scale HTTP Adaptive Streaming...Vignesh V Menon
 
IEEE ICIP'22:Efficient Content-Adaptive Feature-based Shot Detection for HTTP...
Online Bitrate ladder prediction for Adaptive VVC Streaming

  • 1. Online Bitrate ladder prediction for Adaptive VVC Streaming. Vignesh V Menon, Postdoctoral Researcher, Fraunhofer HHI. #SEGMENTS2024, February 14, 2024, Denver Marriott Tech Center.
  • 2. Introduction to per-title encoding
  • 3. Introduction to per-title encoding
    Context: HTTP Adaptive Streaming (HAS)
    HAS divides the video content into segments and encodes each segment at various representations. The representations are stored on plain HTTP servers, so that video delivery continuously adapts to the network conditions and device capabilities of the client [1].
    Figure: HAS [1] concept.
    [1] T. Stockhammer, “Dynamic Adaptive Streaming over HTTP: Standards and Design Principles,” in Proceedings of the Second Annual ACM Conference on Multimedia Systems, 2011, pp. 133–144.
  • 4. Introduction to per-title encoding
    Motivation for online convex-hull estimation
    ● The convex hull is the set of encoding points that achieve “Pareto efficiency”.
    ● Online convex-hull estimation methods provide a dynamic and adaptive means to optimize bitrate and resolution selections.
    ● By adjusting the bitrate-resolution pairs in response to the video content complexity and the coding algorithms, these methods seek an optimal trade-off between computational efficiency and visual fidelity, despite the added complexity of advanced codecs like VVC [2,3].
    Figure: Conceptual plot to depict the bitrate-quality relationship for any video source encoded at different resolutions. Source: [4]
    [2] B. Bross, Y. Wang, Y. Ye, S. Liu, J. Chen, G. Sullivan, and J. Ohm, “Overview of the Versatile Video Coding (VVC) Standard and its Applications,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, pp. 3736–3764, 2021, doi: 10.1109/TCSVT.2021.3101953.
    [3] R. Kaafarani et al., “Evaluation Of Bitrate Ladders For Versatile Video Coder,” in 2021 International Conference on Visual Communications and Image Processing (VCIP), 2021, pp. 1–5.
    [4] A. Aaron, Z. Li, M. Manohara, J.D. Cock, D. Ronca, “Per-title encode optimization,” The Netflix Techblog, 2015.
  • 5. Introduction to per-title encoding
    Quality versus encoding time trade-off
    ● Dynamic resolution encoding is the most extensively studied per-title encoding scheme in adaptive streaming applications. It adjusts the encoding resolution dynamically to optimize video quality [5, 6, 7, 8].
    ● Encoding time depends on the encoding resolution chosen for the video content, because the number of pixels in each frame significantly impacts the computational workload.
    Figure: Rate-distortion (RD) and rate-encoding time curves of representative sequences (segments) of the Inter-4K dataset encoded at 540p, 1080p and 2160p resolutions using VVenC at the faster preset. Here, XPSNR is used as the quality metric.
    [5] J. Cock, Zhi Li, M. Manohara, and A. Aaron, “Complexity-based consistent-quality encoding in the cloud,” in 2016 IEEE International Conference on Image Processing (ICIP), 2016, pp. 1484–1488.
    [6] A. Katsenou, J. Sole, and D. R. Bull, “Content-gnostic Bitrate Ladder Prediction for Adaptive Video Streaming,” in 2019 Picture Coding Symposium (PCS), 2019, pp. 1–5.
    [7] M. Bhat, J. Thiesse, and P. Le Callet, “Combining Video Quality Metrics To Select Perceptually Accurate Resolution In A Wide Quality Range: A Case Study,” in 2021 IEEE International Conference on Image Processing (ICIP), 2021, pp. 2164–2168.
    [8] A. Zabrovskiy, P. Agrawal, C. Timmerer, and R. Prodan, “FAUST: Fast Per-Scene Encoding Using Entropy-Based Scene Detection and Machine Learning,” in 2021 30th Conference of Open Innovations Association FRUCT, 2021, pp. 292–302.
  • 6. Introduction to per-title encoding
    Reducing encoding time “greenifies” streaming
    ● Reducing encoding energy consumption (in data centers) is critical in streaming applications since it contributes to environmental sustainability [12].
    ● The streaming industry can reduce its carbon footprint and energy consumption by minimizing encoding time.
    ● Our prior experiments suggest a pseudo-linear relationship between encoding time and encoding energy consumption.
    Figure: Average encoding metrics for 7.5 fps, 15 fps, 24 fps, and 30 fps HLS CBR encoding using the veryslow preset of the x264 [9] AVC [10] encoder. Source: [11]
    Figure: Average encoding metrics for HLS CBR encoding at 30 fps using selected presets of the x264 [9] AVC [10] encoder. Source: [11]
    [9] VideoLAN, “x264.” [Online]. Available: https://www.videolan.org/developers/x264.html
    [10] T. Wiegand, G. J. Sullivan, G. Bjontegaard, and A. Luthra, “Overview of the H.264/AVC video coding standard,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, 2003, pp. 560–576.
    [11] V. V. Menon, S. Afzal, P. T. Rajendran, K. Schoeffmann, R. Prodan, and C. Timmerer, “Content-Adaptive Variable Framerate Encoding Scheme for Green Live Streaming,” n.d. [Online]. Available: http://arxiv.org/abs/2311.08074
    [12] A. Stephens, C. Tremlett-Williams, L. Fitzpatrick, L. Acerini, M. Anderson, and N. Crabbendam, “The Carbon Impacts of Video Streaming,” Jun. 2021.
  • 7. XPSNR or VMAF in VVC?
  • 8. XPSNR or VMAF in VVC?
    ● Traditional convex-hull estimation methods use Video Multimethod Assessment Fusion (VMAF) as the perceptual quality metric.
    ● However, as observed in [13], Peak Signal-to-Noise Ratio (PSNR) and VMAF fail to model the subjective quality of VVC-coded bitstreams accurately.
    ● Structural Similarity Index (SSIM), Multi-Scale Structural Similarity Index (MS-SSIM), and eXtended Peak Signal-to-Noise Ratio (XPSNR) were observed to predict the subjective codec ranking reported in [14] with acceptable accuracy [15].
    Table: Evaluation results for Spearman rank order correlation with MOS values. Source: [15]
    [13] C. R. Helmrich et al., “Information on and analysis of the VVC encoders in the SDR UHD verification test,” in WG 05 MPEG Joint Video Coding Team(s) with ITU-T SG 16, document JVET-T0103, Oct. 2020.
    [14] M. Wien and V. Baroncini, “Report on VVC compression performance verification testing in the SDR UHD Random Access Category,” in WG 05 MPEG Joint Video Coding Team(s) with ITU-T SG 16, document JVET-T0097, Oct. 2020.
    [15] C. R. Helmrich et al., “A study of the extended perceptually weighted peak signal-to-noise ratio (XPSNR) for video compression with different resolutions and bit depths,” ITU Journal: ICT Discoveries, vol. 3, May 2020. [Online]. Available: http://handle.itu.int/11.1002/pub/8153d78b-en
  • 9. Online Convex-hull using XPSNR
  • 10. Online Convex-hull using XPSNR (VEXUS)
  • 11. Spatiotemporal complexity feature extraction
    We use seven DCT-energy-based features extracted using Video Complexity Analyzer (VCA) [16]:
    ● average texture energy (E_Y),
    ● average gradient of the luma texture energy (h),
    ● average luma brightness (L_Y),
    ● average chroma texture energy of the U and V channels (E_U and E_V),
    ● average chroma brightness of the U and V channels (L_U and L_V) [17].
    [16] V. V. Menon, C. Feldmann, K. Schoeffmann, M. Ghanbari, and C. Timmerer, “Green Video Complexity Analysis for Efficient Encoding in Adaptive Video Streaming,” in First International ACM Green Multimedia Systems Workshop (GMSys ’23), 2023.
    [17] V. V. Menon, P. T. Rajendran, C. Feldmann, K. Schoeffmann, M. Ghanbari, and C. Timmerer, “JND-Aware Two-Pass Per-Title Encoding Scheme for Adaptive Live Streaming,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 2, pp. 1281–1294, Feb. 2024, doi: 10.1109/TCSVT.2023.3290725.
  • 12. XPSNR-optimized resolution prediction
    The objective of selecting the optimized resolution based on bitrate and video complexity features is decomposed into two parts:
    Modeling: The XPSNR of a video scene encoded at resolution r and bitrate b, i.e., x(r,b), is modeled as a function of the video complexity features, b, and the normalized resolution height r' = r/2160.
    Resolution optimization: Select the resolution that maximizes the predicted XPSNR.
    We trained an XGBoost [18] model, with tuned hyperparameters, to predict XPSNR.
    ● max_depth=10 and n_estimators=400
    [18] T. Chen and C. Guestrin, “XGBoost: A Scalable Tree Boosting System,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Aug. 2016, pp. 785–794.
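A minimal sketch of the resolution-optimization step on slide 12: evaluate a quality predictor at every candidate resolution and keep the one with the highest predicted XPSNR. The talk uses a trained XGBoost regressor for the prediction; here `toy_predictor` is a hypothetical stand-in with made-up coefficients, and the function names, the feature dictionary, and the candidate resolution list are assumptions for illustration only.

```python
from typing import Callable, Dict, Sequence, Tuple

# Signature assumed for illustration: (features, bitrate_kbps, r_norm) -> predicted XPSNR in dB.
Predictor = Callable[[Dict[str, float], float, float], float]


def select_resolution(
    predict_xpsnr: Predictor,
    features: Dict[str, float],
    bitrate_kbps: float,
    resolutions: Sequence[int],
    max_height: int = 2160,
) -> Tuple[int, float]:
    """Return the (height, predicted XPSNR) pair with the highest prediction."""
    best_height, best_score = resolutions[0], float("-inf")
    for height in resolutions:
        r_norm = height / max_height  # normalized resolution height r' = r/2160
        score = predict_xpsnr(features, bitrate_kbps, r_norm)
        if score > best_score:
            best_height, best_score = height, score
    return best_height, best_score


def toy_predictor(features: Dict[str, float], bitrate_kbps: float, r_norm: float) -> float:
    """Hypothetical stand-in for the trained model (NOT the model from the talk).

    Predicted quality peaks at a resolution that grows with the bitrate.
    """
    ideal = min(1.0, bitrate_kbps / 8000.0)
    return 40.0 - 20.0 * (r_norm - ideal) ** 2


if __name__ == "__main__":
    for target in (2000, 4000, 8000):
        print(target, select_resolution(toy_predictor, {}, target, [540, 1080, 2160]))
```

Passing the predictor as a callable keeps the selection logic independent of the model, so a real regressor could be wrapped in a function with the same signature.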
  • 13. Optimized QP prediction
    Modeling: The optimized QP is modeled as a function of the spatiotemporal features, the target bitrate b, and the normalized resolution height r'.
    ● For applications such as streaming, it is essential that the encoding does not exceed the maximum bitrates specified in the HLS/DASH manifests [19, 20].
    ● Failure to adhere to these limits can lead to buffer overflows or underflows in video players.
    QP optimization: The optimization function aims to predict the QP that minimizes the discrepancy between the predicted and target bitrate for a given resolution.
    [19] I. Sodagar, “The MPEG-DASH Standard for Multimedia Streaming Over the Internet,” IEEE MultiMedia, vol. 18, no. 4, pp. 62–67, 2011.
    [20] A. Bentaleb, B. Taani, A. C. Begen, C. Timmerer, and R. Zimmermann, “A Survey on Bitrate Adaptation Schemes for Streaming Media Over HTTP,” IEEE Communications Surveys & Tutorials, vol. 21, no. 1, pp. 562–585, 2019.
  • 14. Optimized QP estimation
    Cascaded approach
    ● This method involves training distinct XGBoost regression models for the minimum and maximum QP values (qmin and qmax, respectively).
    ● The optimized QP for a target bitrate b is determined using linear regression on the logarithm of the bitrate.
    ● This logarithmic mapping of the bitrate values captures the non-linear relationship between bitrate and QP.
    ● VVenC implemented capped VBR rate control in its January 2024 release [21]. The QP is specified using the qp option, while the maxrate (easy mode) or MaxBitrate (expert mode) option specifies the upper bound of bitrate variability.
    Figure: QP versus normalized bitrate (in log scale) for a representative video segment.
    [21] C. Helmrich, V. George, V. V. Menon, A. Wieckowski, B. Bross, and D. Marpe, “Fast constant-quality video encoding using VVenC with rate capping based on pre-analysis statistics,” 2024.
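The transcript does not preserve the slide's equation, so the following is only a sketch of one way a log-bitrate linear mapping could work, not the author's formula. It assumes two anchor bitrates are available (the bitrate expected at qmin and at qmax, for example from the two regression models) and interpolates the QP linearly in the logarithm of the target bitrate. The function name, its arguments, and the clamping to [qmin, qmax] are all assumptions.

```python
import math


def interpolate_qp(
    target_bitrate: float,
    bitrate_at_qmin: float,
    bitrate_at_qmax: float,
    qmin: int,
    qmax: int,
) -> int:
    """Map a target bitrate to a QP by linear interpolation in log-bitrate.

    Assumes bitrate_at_qmin > bitrate_at_qmax > 0, i.e. a lower QP yields a
    higher bitrate. This is an illustrative assumption, not the slide's equation.
    """
    if target_bitrate <= 0 or not (bitrate_at_qmin > bitrate_at_qmax > 0):
        raise ValueError("bitrates must be positive and decrease from qmin to qmax")
    log_high = math.log(bitrate_at_qmin)
    log_low = math.log(bitrate_at_qmax)
    # 0.0 at the qmin anchor, 1.0 at the qmax anchor.
    fraction = (log_high - math.log(target_bitrate)) / (log_high - log_low)
    qp = qmin + fraction * (qmax - qmin)
    # Clamp so that targets outside the anchor range stay within [qmin, qmax].
    return int(round(min(max(qp, qmin), qmax)))


if __name__ == "__main__":
    # Illustrative numbers only: 16000 kbps at QP 10 and 1000 kbps at QP 50.
    for target in (32000, 16000, 4000, 1000):
        print(target, interpolate_qp(target, 16000, 1000, 10, 50))
```

With these illustrative anchors, a target of 4000 kbps lies halfway between them on the log scale and therefore maps to the midpoint QP.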
  • 16. Experimental design
    Dataset generation
    ● We used 1000 videos of the Inter-4K dataset [22] to validate the performance of the encoding methods.
    ● We encoded the sequences at UHD (2160p), 60 fps, using VVenC v1.10 [23] with preset 0 (faster).
    ● We extracted the spatiotemporal features using VCA v2.0.
    ● We ran constant-quality encoding by varying the qp value from qmin to qmax for each resolution.
    ● We computed the full-reference PSNR and XPSNR quality metrics after the compressed video was upscaled to the original resolution (2160p).
    Figure: Calculation of the ground-truth PSNR, XPSNR, and bitrate to train the prediction models. This example shows the ground-truth calculation for a video encoded at 1080p with qp 30.
    [22] A. Stergiou and R. Poppe, “AdaPool: Exponential Adaptive Pooling for Information-Retaining Downsampling,” IEEE Transactions on Image Processing, vol. 32, 2023, pp. 251–266.
    [23] A. Więckowski, J. Brandenburg, T. Hinz, C. Bartnik, V. George, G. Hege, C. Helmrich, A. Henkel, C. Lehmann, C. Stoffers, I. Zupancic, B. Bross, and D. Marpe, “VVenC: An Open And Optimized VVC Encoder Implementation,” in Proc. IEEE International Conference on Multimedia Expo Workshops (ICMEW), pp. 1–2.
  • 17. Experimental design
    Benchmarks
    1. Default: This method employs fixed-resolution encoding, i.e., all bitstreams are encoded at the same resolution as the input video.
    2. FixedLadder: This method employs a fixed set of bitrate-resolution pairs. We use the HLS bitrate ladder specified in the Apple authoring specifications [24].
    3. Bruteforce: This method determines the optimized resolution, i.e., the one that yields the highest XPSNR for a given target bitrate, after an exhaustive encoding process at all supported resolutions and QPs.
    Table: An example fixed bitrate ladder, i.e., a set of bitrate-resolution pairs. Source: [24]
    Table: Experimental parameters used for evaluation.
    [24] Apple Inc., “HLS Authoring Specification for Apple Devices.” [Online]. Available: https://developer.apple.com/documentation/http-live-streaming/hls-authoring-specification-for-apple-devices
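The Bruteforce benchmark on slide 17 can be sketched as a search over a table of finished encodes. This sketch assumes that "for a given target bitrate" means "among encodes whose measured bitrate does not exceed the target"; that reading, the `Encode` record, and all numbers below are illustrative assumptions, not measurements from the talk.

```python
from typing import NamedTuple, Optional, Sequence


class Encode(NamedTuple):
    """One finished encode: resolution height, QP, measured bitrate and XPSNR."""
    height: int
    qp: int
    bitrate_kbps: float
    xpsnr_db: float


def bruteforce_pick(encodes: Sequence[Encode], target_kbps: float) -> Optional[Encode]:
    """Return the encode with the highest XPSNR whose bitrate fits the target."""
    feasible = [e for e in encodes if e.bitrate_kbps <= target_kbps]
    if not feasible:
        return None  # no encode fits under the target bitrate
    return max(feasible, key=lambda e: e.xpsnr_db)


# Made-up results of an exhaustive encode over three resolutions and a few QPs.
ENCODES = [
    Encode(540, 32, 900.0, 33.1),
    Encode(540, 27, 1800.0, 35.0),
    Encode(1080, 37, 1000.0, 32.4),
    Encode(1080, 32, 2100.0, 35.9),
    Encode(1080, 27, 4200.0, 38.3),
    Encode(2160, 37, 3900.0, 37.5),
    Encode(2160, 32, 8200.0, 40.2),
]

if __name__ == "__main__":
    for target in (1000, 2500, 5000, 10000):
        print(target, bruteforce_pick(ENCODES, target))
```

The cost of this benchmark lies in producing the table, since every resolution and QP combination has to be encoded before the search can run.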
  • 19. Speed and accuracy
    ● Speed of feature extraction: 176 fps
    ● XPSNR prediction
      ○ MAE: 0.17 dB, R²: 0.99
      ○ Std. dev: 0.22 dB
    ● QP prediction
      ○ MAE: 1.32, R²: 0.97
      ○ Std. dev: 1.96
    Figure: Prediction results of the XPSNR and QP prediction models.
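For reference, the accuracy figures on slide 19 use standard regression metrics. The sketch below computes the mean absolute error (MAE), the coefficient of determination (R²), and the standard deviation of the prediction error for paired lists of actual and predicted values. Treating "Std. dev" as the population standard deviation of the error is an assumption, and the sample values are made up.

```python
import math
from typing import Sequence, Tuple


def prediction_metrics(
    actual: Sequence[float], predicted: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (MAE, R^2, standard deviation of the prediction error)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                     # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    mean_error = sum(errors) / n
    std_error = math.sqrt(sum((e - mean_error) ** 2 for e in errors) / n)
    return mae, r2, std_error


if __name__ == "__main__":
    # Made-up XPSNR values in dB, for illustration only.
    actual = [30.0, 32.0, 34.0, 36.0]
    predicted = [30.5, 31.5, 34.5, 35.5]
    print(prediction_metrics(actual, predicted))
```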
  • 20. Rate-distortion results
    The RD curve of VEXUS closely mirrors that of the Bruteforce method, indicating the effectiveness of its predictive modeling in approximating the optimized resolutions and QPs.
    Figure: RD curves of representative video sequences using Default (green line), FixedLadder (blue line), Bruteforce (black line), and VEXUS (red line) encodings.
  • 21. Encoding and decoding times
    ● Encoding and decoding times drop at lower bitrates, because the lower resolutions selected there make both operations less complex.
    Figure: Encoding and decoding times of representative video sequences using Default (green line), FixedLadder (blue line), Bruteforce (black line), and VEXUS (red line) encodings.
  • 22. Result summary
    ● Coding efficiency (in terms of Bjøntegaard Delta [25] rates), encoding time, and decoding time all decrease as r_max decreases.
    ● The trade-off between quality and coding efficiency depends on the target audience, delivery platform, and available resources.
    Table: Average results of the encoding schemes compared to Default encoding.
    [25] HSTP-VID-WPOM, “Working practices using objective metrics for evaluation of video coding efficiency experiments,” International Telecommunication Union, 2020. [Online]. Available: http://handle.itu.int/11.1002/pub/8160e8da-en
  • 23. Encoding time constrained convex-hull estimation
  • 24. Encoding time constrained convex-hull estimation
    Resolution optimization:
    ● Select the resolution that maximizes the predicted XPSNR, constrained on the maximum encoding time [26].
    ● Encoding time is predicted using the same approach as the QP prediction.
    ● The encoding time constraint is effectively a constraint on the encoding energy.
    [26] V. V. Menon, A. Premkumar, P. T. Rajendran, A. Więckowski, B. Bross, C. Timmerer, and D. Marpe, “Energy-efficient Adaptive Video Streaming with Latency-Aware Dynamic Resolution Encoding,” 2024 Mile High Video (MHV), doi: 10.1145/3638036.3640801.
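The constrained selection on slide 24 can be sketched as a filter followed by a maximum: drop the resolutions whose predicted encoding time exceeds the limit, then pick the highest predicted XPSNR among the rest. The `Candidate` record, the illustrative numbers, and the fallback to the fastest candidate when nothing fits are assumptions, not behaviour described in the talk.

```python
from typing import NamedTuple, Sequence


class Candidate(NamedTuple):
    """Predictions for one resolution at a given target bitrate."""
    height: int
    predicted_xpsnr_db: float
    predicted_encode_seconds: float


def select_time_constrained(
    candidates: Sequence[Candidate], max_encode_seconds: float
) -> Candidate:
    """Pick the highest predicted XPSNR among candidates within the time limit."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    feasible = [c for c in candidates if c.predicted_encode_seconds <= max_encode_seconds]
    if not feasible:
        # Assumed fallback: if no resolution meets the limit, take the fastest one.
        return min(candidates, key=lambda c: c.predicted_encode_seconds)
    return max(feasible, key=lambda c: c.predicted_xpsnr_db)


# Made-up predictions for a single segment, for illustration only.
CANDIDATES = [
    Candidate(540, 34.0, 20.0),
    Candidate(1080, 37.0, 60.0),
    Candidate(2160, 39.0, 200.0),
]

if __name__ == "__main__":
    for limit in (10.0, 100.0, 300.0):
        print(limit, select_time_constrained(CANDIDATES, limit))
```

Tightening the limit pushes the selection toward lower resolutions, which matches the observation on the results slide that RD performance decreases as the encoding time constraint is lowered.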
  • 25. Encoding time constrained convex-hull estimation
    Results
    RD performance decreases as the encoding time constraint is lowered.
    Figure: RD curves and encoding times of representative video sequences (segments) using VEXUS (r_max = 2160). Source: [27]
    Figure: Selected encoding resolutions of representative video sequences (segments) using VEXUS (r_max = 2160). Source: [27]
    [27] A. Premkumar, P. T. Rajendran, V. V. Menon, A. Więckowski, B. Bross, and D. Marpe, “Quality-Aware Dynamic Resolution Adaptation Framework for Adaptive Video Streaming,” 2024. [Online]. Available: https://github.com/PhoenixVideo/QADRA
  • 27. Summary
    ● XPSNR correlates better with subjective quality scores for VVC-coded UHD content than PSNR and VMAF.
    ● Leveraging this insight, we introduced an approach where XPSNR is predicted for VVC-coded bitstreams using the spatiotemporal complexity features of the video and the target encoding configuration.
    ● We proposed VEXUS, where the convex hull is estimated online using the predicted XPSNR.
    ● On average, VEXUS yields an improvement of 5.84 dB in PSNR and 0.62 dB in XPSNR at the same bitrates compared to conventional UHD encoding with the VVenC encoder, along with a 44.43% reduction in overall encoding time and a 65.46% reduction in overall decoding time using the VTM decoder.
    ● We also discussed introducing an encoding time constraint into the convex-hull estimation process and analyzed its impact on RD performance.
  • 29. Open-source tools
    1. VVC encoder: Fraunhofer Versatile Video Encoder (VVenC) v1.10. Available: https://github.com/fraunhoferhhi/vvenc
    2. VVC decoder: VTM reference decoder v22.0. Available: https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftware_VTM
    3. Spatiotemporal feature extractor: Video Complexity Analyzer (VCA) v2.0. Available: https://github.com/cd-athena/VCA
    4. Convex-hull estimation framework. Available: https://github.com/PhoenixVideo/QADRA
  • 30. Questions? Get in touch. Vignesh V Menon, vignesh.menon@hhi.fraunhofer.de