SlideShare a Scribd company logo
1 of 18
Download to read offline
Content-adaptive Encoder Preset Prediction for Adaptive Live
Streaming
Vignesh V Menon1, Hadi Amirpour1, Prajit T Rajendran2, Mohammad Ghanbari1,3, and
Christian Timmerer1
1
Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität, Klagenfurt, Austria
2
Universite Paris-Saclay, CEA, List, F-91120, Palaiseau, France
3
School of Computer Science and Electronic Engineering, University of Essex, UK
9 December 2022
Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 1
Outline
1 Introduction
2 Research Problem
3 Content-Adaptive Encoder Preset Prediction Scheme (CAPS)
4 Evaluation
5 Conclusions
Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 2
Introduction
Introduction
HTTP Adaptive Streaming (HAS)
HTTP Adaptive Streaming (HAS)1 has become the de-facto standard in delivering video
content for various clients regarding internet speeds and device types.
Traditionally, a fixed bitrate ladder, e.g., HTTP Live Streaming (HLS) bitrate ladder2, is
used in live streaming.
1
A. Bentaleb et al. “A Survey on Bitrate Adaptation Schemes for Streaming Media Over HTTP”. In: IEEE Communications Surveys Tutorials 21.1 (2019),
pp. 562–585. doi: 10.1109/COMST.2018.2862938.
2
https://developer.apple.com/documentation/http live streaming/ hls authoring specification for apple devices, last access: Nov 30, 2022.
Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 3
Introduction
Introduction
Live encoding in HAS
Figure: Encoding time of HLS bitrate ladder2
representations of the Wood s000 sequence
(5 second duration, 24fps) of VCD dataset
using ultrafast preset of x265 and 8 CPU
threads.
For every representation, maintaining a fixed
encoding speed, which is the same as the
video framerate, independent of the video
content, is a key goal for a live encoder.
Reduction in encoding speed may lead to the
unacceptable outcome of dropped frames during
transmission, eventually decreasing the Quality
of Experience (QoE).a
Increase in encoding speed leads to more CPU
idle time!
a
Pradeep Ramachandran et al. “Content Adaptive Live Encoding with Open Source
Codecs”. In: Proceedings of the 11th ACM Multimedia Systems Conference. MMSys
’20. Istanbul, Turkey: Association for Computing Machinery, 2020, 345–348. isbn:
9781450368452. doi: 10.1145/3339825.3393580. url:
https://doi.org/10.1145/3339825.3393580.
Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 4
Introduction
Introduction
Live encoding in HAS
The preset for the fastest encoding (ultrafast for x2643 and x2654) is used as the encoding
preset for all live content, independent of the dynamic complexity of the content.
The resulting encode is sub-optimal, especially when the type of the content is dynamically
changing, which is the typical use-case for live streams.5
When the content becomes easier to encode, the encoder would achieve a higher encoding
speed than the target encoding speed. This, in turn, introduces unnecessary CPU idle time
as it waits for the video feed.
3
https://www.videolan.org/developers/x264.html, last access: Nov 30, 2022.
4
https://www.videolan.org/developers/x265.html, last access: Nov 30, 2022.
5
Qingxiong Huangyuan et al. “Performance evaluation of H.265/MPEG-HEVC encoders for 4K video sequences”. In: Signal and Information Processing
Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific. 2014, pp. 1–8. doi: 10.1109/APSIPA.2014.7041782.
Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 5
Research Problem
Research Problem
For easy-to-encode content, encoder preset need to be configured such that encoding speed
can be reduced while still being compatible with the expected live encoding speed, improving
the quality of the encoded content.
When the content becomes complex again, the encoder preset need to be reconfigured to
move back to the faster configuration that achieves live encoding speed.6
This paper targets an encoding scheme that determines the encoding preset configuration dy-
namically, which:
is adaptive to the video content.
maximizes the CPU utilization for a given target encoding speed.
maximizes the compression efficiency.
6
Sergey Zvezdakov, Denis Kondranin, and Dmitriy Vatolin. “Machine-Learning-Based Method for Content-Adaptive Video Encoding”. In: 2021 Picture Coding
Symposium (PCS). 2021, pp. 1–5. doi: 10.1109/PCS50896.2021.9477507.
Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 6
Content-Adaptive Encoder Preset Prediction Scheme (CAPS)
Content-Adaptive encoder Preset prediction Scheme (CAPS)
Bitrate Ladder
Video Segment
Video Complexity
Feature Extraction
E
h
L
Bitrate set (B)
Resolution set (R)
Target speed
Apriori information (e.g., codec, CPU threads)
Encoding Preset
Prediction
Encoder
Encoder
Encoder
Encoder
…
rep1
rep2
repN
repN-1
Figure: The encoding pipeline using CAPS envisioned in this paper.
Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 7
Content-Adaptive Encoder Preset Prediction Scheme (CAPS) Phase 1: Video Complexity Feature Extraction
CAPS
Phase 1: Video Complexity Feature Extraction
Compute texture energy per block
A DCT-based energy function is used to determine the block-wise feature of each frame
defined as:
Hk =
w−1
X
i=0
w−1
X
j=0
e|( ij
wh
)2−1|
|DCT(i, j)| (1)
where wxw is the size of the block, and DCT(i, j) is the (i, j)th DCT component when
i + j > 0, and 0 otherwise.
The energy values of blocks in a frame are averaged to determine the energy per frame.7,8
Es =
K−1
X
k=0
Hs,k
K · w2
(2)
7
Michael King, Zinovi Tauber, and Ze-Nian Li. “A New Energy Function for Segmentation and Compression”. In: 2007 IEEE International Conference on
Multimedia and Expo. 2007, pp. 1647–1650. doi: 10.1109/ICME.2007.4284983.
8
Vignesh V Menon et al. “Efficient Content-Adaptive Feature-Based Shot Detection for HTTP Adaptive Streaming”. In: 2021 IEEE International Conference
on Image Processing (ICIP). 2021, pp. 2174–2178. doi: 10.1109/ICIP42928.2021.9506092.
Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 8
Content-Adaptive Encoder Preset Prediction Scheme (CAPS) Phase 1: Video Complexity Feature Extraction
CAPS
Phase 1: Video Complexity Feature Extraction
hs: SAD of the block level energy values of frame s to that of the previous frame s − 1.
hs =
K−1
X
k=0
| Hs,k, Hs−1,k |
K · w2
(3)
where K denotes the number of blocks in frame s.
The luminescence of non-overlapping blocks k of sth frame is defined as:
Ls,k =
p
DCT(0, 0) (4)
The block-wise luminescence is averaged per frame denoted as Ls as shown below.
Ls =
K−1
X
k=0
Ls,k
K · w2
(5)
Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 9
Content-Adaptive Encoder Preset Prediction Scheme (CAPS) Phase 2: Encoding Preset Prediction
CAPS
Phase 2: Encoding Preset Prediction
E
h
L
Model
set
log(r)
log(b) ̂
𝑡!!"#
.
.
.
̂
𝑡!!$%
Preset
selection
̂
𝑝
Figure: Encoding Preset Prediction architecture.
Model Set: models to predict the encoding time for each preset.
Preset selection: select the optimized preset that yields the target encoding time.
Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 10
Content-Adaptive Encoder Preset Prediction Scheme (CAPS) Phase 2: Encoding Preset Prediction
CAPS
Phase 2: Encoding Preset Prediction
Model Set
Random Forest (RF) models are trained to predict the encoding times for the pre-defined
set of encoding presets (P).
The minimum and maximum encoder preset (pmin and pmax , respectively) are chosen based
on the target encoder. For example, x265 HEVC9 encoder supports encoding presets
ranging from 0 to 9 (i.e., ultrafast to placebo).
The model set predicts the encoding times for each of the presets in P as t̂pmin to t̂pmax .
9
G. J. Sullivan et al. “Overview of the high efficiency video coding (HEVC) standard”. In: IEEE Transactions on circuits and systems for video technology
22.12 (2012), pp. 1649–1668.
Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 11
Content-Adaptive Encoder Preset Prediction Scheme (CAPS) Phase 2: Encoding Preset Prediction
CAPS
Phase 2: Encoding Preset Prediction
Preset Selection
The preset is selected, which is closest to the target encoding time T, which is defined as:
T =
n
f
(6)
where n represents the number of frames in the segment.
The function ensures that the encoding time is not greater than the target encoding
time T.
t̂ = argminp | T − tp | c.t. p ∈ [pmin, pmax ]; t̂ ≤ T (7)
where p is the selected optimum preset and t̂ is the encoding time for p.
Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 12
Evaluation Test Methodology
Test Methodology
Dataset: Video Complexity Dataset (VCD)10 (500 Ultra HD sequences, 5 s duration, 24 fps).
Table: Encoder Settings.
Encoder x265 v3.5
Target encoding speed/time 24 fps/ 5 seconds
CPU threads 8
Ratecontrol CBR
Table: Representations considered in this paper2
.
Representation ID 01 02 03 04 05 06 07 08 09 10 11 12
r (height in pixels) 360 432 540 540 540 720 720 1080 1080 1440 2160 2160
b (in Mbps) 0.145 0.300 0.600 0.900 1.600 2.400 3.400 4.500 5.800 8.100 11.600 16.800
E, h, and L features are extracted from the video segments using VCA11 run in eight CPU
threads.
10
Hadi Amirpour et al. “VCD: Video Complexity Dataset”. In: Proceedings of the 13th ACM Multimedia Systems Conference. MMSys ’22. Athlone, Ireland:
Association for Computing Machinery, 2022, 234–239. isbn: 9781450392839. doi: 10.1145/3524273.3532892. url:
https://doi.org/10.1145/3524273.3532892.
11
Vignesh V Menon et al. “VCA: Video Complexity Analyzer”. In: Proceedings of the 13th ACM Multimedia Systems Conference. MMSys ’22. Athlone,
Ireland: Association for Computing Machinery, 2022, 259–264. isbn: 9781450392839. doi: 10.1145/3524273.3532896. url:
https://doi.org/10.1145/3524273.3532896.
Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 13
Evaluation Experimental Results
Experimental Results
Figure: Average relative importance of features
in the encoding time prediction of 2160p en-
coding for all presets.
Preset 0 1 2 3 4 5 6 7 8
R2 0.97 0.96 0.97 0.97 0.97 0.96 0.97 0.98 0.98
MAE 0.06 0.09 0.12 0.17 0.22 0.31 0.33 0.41 0.49
Table: Average encoding time prediction performance
results of every preset considered for experimental vali-
dation. (p = 0: ultrafast)
Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 14
Evaluation Experimental Results
Experimental Results
Figure: Average preset chosen for each representation in the HLS bitrate ladder.
On average, representation 01 (0.145 Mbps) chooses slow preset (p = 6).
On average, representations 11 (11.6 Mbps) and 12 (16.8 Mbps) choose ultrafast preset
(p = 0).
Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 15
Evaluation Experimental Results
Experimental Results
(a) (b) (c)
Figure: (a) Average encoding time, (b) Average PSNR, and (c) Average VMAF for each representation.
Using ultrafast preset for all representations introduces significant CPU idle time for lower
bitrate representations. However, CAPS yield lower CPU idle time when the encodings are
carried out concurrently.
Using CAPS, visual quality improves significantly at lower bitrate representations. CAPS
yields an overall BD-VMAF of 3.81 and BD-PSNR of 0.83 dB.
Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 16
Conclusions
Conclusions
This paper proposed CAPS, a content-adaptive encoder preset prediction scheme for adap-
tive live streaming applications.
CAPS predicts the optimized encoder preset for a given target bitrate, and resolution for
each segment, which helps improve the compression-efficiency of video encodings.
DCT-energy-based features are used to determine segments’ spatial and temporal complex-
ity.
CAPS yield lower idle time and an overall quality improvement of 0.83dB PSNR and 3.81
VMAF score with the same bitrate, compared to the fastest preset (ultrafast) x265 encoding
of the reference HLS bitrate ladder.
Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 17
Q & A
Q & A
Thank you for your attention!
Vignesh V Menon (vignesh.menon@aau.at)
VCA
Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 18

More Related Content

Similar to Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming

CODA_presentation.pdf
CODA_presentation.pdfCODA_presentation.pdf
CODA_presentation.pdfJunZhao68
 
ComplexCTTP: Complexity Class Based Transcoding Time Prediction for Video Seq...
ComplexCTTP: Complexity Class Based Transcoding Time Prediction for Video Seq...ComplexCTTP: Complexity Class Based Transcoding Time Prediction for Video Seq...
ComplexCTTP: Complexity Class Based Transcoding Time Prediction for Video Seq...Alpen-Adria-Universität
 
Online Bitrate ladder prediction for Adaptive VVC Streaming
Online Bitrate ladder prediction for Adaptive VVC StreamingOnline Bitrate ladder prediction for Adaptive VVC Streaming
Online Bitrate ladder prediction for Adaptive VVC StreamingVignesh V Menon
 
VCIP_MCBE_presentation.pdf
VCIP_MCBE_presentation.pdfVCIP_MCBE_presentation.pdf
VCIP_MCBE_presentation.pdfVignesh V Menon
 
Energy-Efficient Multi-Codec Bitrate-Ladder Estimation for Adaptive Video Str...
Energy-Efficient Multi-Codec Bitrate-Ladder Estimation for Adaptive Video Str...Energy-Efficient Multi-Codec Bitrate-Ladder Estimation for Adaptive Video Str...
Energy-Efficient Multi-Codec Bitrate-Ladder Estimation for Adaptive Video Str...Alpen-Adria-Universität
 
Green_VCA_presentation.pdf
Green_VCA_presentation.pdfGreen_VCA_presentation.pdf
Green_VCA_presentation.pdfVignesh V Menon
 
LLL-CAdViSE: Live Low-Latency Cloud-based Adaptive Video Streaming Evaluation...
LLL-CAdViSE: Live Low-Latency Cloud-based Adaptive Video Streaming Evaluation...LLL-CAdViSE: Live Low-Latency Cloud-based Adaptive Video Streaming Evaluation...
LLL-CAdViSE: Live Low-Latency Cloud-based Adaptive Video Streaming Evaluation...Alpen-Adria-Universität
 
IEEE MMSP'21: INCEPT: Intra CU Depth Prediction for HEVC
IEEE MMSP'21: INCEPT: Intra CU Depth Prediction for HEVCIEEE MMSP'21: INCEPT: Intra CU Depth Prediction for HEVC
IEEE MMSP'21: INCEPT: Intra CU Depth Prediction for HEVCVignesh V Menon
 
INCEPT: Intra CU Depth Prediction for HEVC
INCEPT: Intra CU Depth Prediction for HEVCINCEPT: Intra CU Depth Prediction for HEVC
INCEPT: Intra CU Depth Prediction for HEVCAlpen-Adria-Universität
 
OPTE: Online Per-title Encoding for Live Video Streaming
OPTE: Online Per-title Encoding for Live Video StreamingOPTE: Online Per-title Encoding for Live Video Streaming
OPTE: Online Per-title Encoding for Live Video StreamingAlpen-Adria-Universität
 
OPTE: Online Per-title Encoding for Live Video Streaming.pdf
OPTE: Online Per-title Encoding for Live Video Streaming.pdfOPTE: Online Per-title Encoding for Live Video Streaming.pdf
OPTE: Online Per-title Encoding for Live Video Streaming.pdfVignesh V Menon
 
Efficient bitrate ladder construction for live video streaming
Efficient bitrate ladder construction for live video streamingEfficient bitrate ladder construction for live video streaming
Efficient bitrate ladder construction for live video streamingMinh Nguyen
 
Introduction to Video Compression Techniques - Anurag Jain
Introduction to Video Compression Techniques - Anurag JainIntroduction to Video Compression Techniques - Anurag Jain
Introduction to Video Compression Techniques - Anurag JainVideoguy
 
Perceptually-aware Per-title Encoding for Adaptive Video Streaming
Perceptually-aware Per-title Encoding for Adaptive Video StreamingPerceptually-aware Per-title Encoding for Adaptive Video Streaming
Perceptually-aware Per-title Encoding for Adaptive Video StreamingAlpen-Adria-Universität
 
Machine Learning Based Video Coding Enhancements for HTTP Adaptive Streaming
Machine Learning Based Video Coding Enhancements for HTTP Adaptive StreamingMachine Learning Based Video Coding Enhancements for HTTP Adaptive Streaming
Machine Learning Based Video Coding Enhancements for HTTP Adaptive StreamingAlpen-Adria-Universität
 
Energy-efficient Adaptive Video Streaming with Latency-Aware Dynamic Resoluti...
Energy-efficient Adaptive Video Streaming with Latency-Aware Dynamic Resoluti...Energy-efficient Adaptive Video Streaming with Latency-Aware Dynamic Resoluti...
Energy-efficient Adaptive Video Streaming with Latency-Aware Dynamic Resoluti...Vignesh V Menon
 
Perceptually-aware Per-title Encoding for Adaptive Video Streaming.pdf
Perceptually-aware Per-title Encoding for Adaptive Video Streaming.pdfPerceptually-aware Per-title Encoding for Adaptive Video Streaming.pdf
Perceptually-aware Per-title Encoding for Adaptive Video Streaming.pdfVignesh V Menon
 
MiPSO: Multi-Period Per-Scene Optimization For HTTP Adaptive Streaming
MiPSO: Multi-Period Per-Scene Optimization For HTTP Adaptive StreamingMiPSO: Multi-Period Per-Scene Optimization For HTTP Adaptive Streaming
MiPSO: Multi-Period Per-Scene Optimization For HTTP Adaptive StreamingAlpen-Adria-Universität
 
HTTP Adaptive Streaming – Where Is It Heading?
HTTP Adaptive Streaming – Where Is It Heading?HTTP Adaptive Streaming – Where Is It Heading?
HTTP Adaptive Streaming – Where Is It Heading?Alpen-Adria-Universität
 

Similar to Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming (20)

CODA_presentation.pdf
CODA_presentation.pdfCODA_presentation.pdf
CODA_presentation.pdf
 
ComplexCTTP: Complexity Class Based Transcoding Time Prediction for Video Seq...
ComplexCTTP: Complexity Class Based Transcoding Time Prediction for Video Seq...ComplexCTTP: Complexity Class Based Transcoding Time Prediction for Video Seq...
ComplexCTTP: Complexity Class Based Transcoding Time Prediction for Video Seq...
 
Online Bitrate ladder prediction for Adaptive VVC Streaming
Online Bitrate ladder prediction for Adaptive VVC StreamingOnline Bitrate ladder prediction for Adaptive VVC Streaming
Online Bitrate ladder prediction for Adaptive VVC Streaming
 
VCIP_MCBE_presentation.pdf
VCIP_MCBE_presentation.pdfVCIP_MCBE_presentation.pdf
VCIP_MCBE_presentation.pdf
 
Energy-Efficient Multi-Codec Bitrate-Ladder Estimation for Adaptive Video Str...
Energy-Efficient Multi-Codec Bitrate-Ladder Estimation for Adaptive Video Str...Energy-Efficient Multi-Codec Bitrate-Ladder Estimation for Adaptive Video Str...
Energy-Efficient Multi-Codec Bitrate-Ladder Estimation for Adaptive Video Str...
 
Green_VCA_presentation.pdf
Green_VCA_presentation.pdfGreen_VCA_presentation.pdf
Green_VCA_presentation.pdf
 
LLL-CAdViSE: Live Low-Latency Cloud-based Adaptive Video Streaming Evaluation...
LLL-CAdViSE: Live Low-Latency Cloud-based Adaptive Video Streaming Evaluation...LLL-CAdViSE: Live Low-Latency Cloud-based Adaptive Video Streaming Evaluation...
LLL-CAdViSE: Live Low-Latency Cloud-based Adaptive Video Streaming Evaluation...
 
IEEE MMSP'21: INCEPT: Intra CU Depth Prediction for HEVC
IEEE MMSP'21: INCEPT: Intra CU Depth Prediction for HEVCIEEE MMSP'21: INCEPT: Intra CU Depth Prediction for HEVC
IEEE MMSP'21: INCEPT: Intra CU Depth Prediction for HEVC
 
INCEPT: Intra CU Depth Prediction for HEVC
INCEPT: Intra CU Depth Prediction for HEVCINCEPT: Intra CU Depth Prediction for HEVC
INCEPT: Intra CU Depth Prediction for HEVC
 
HTTP Adaptive Streaming – Quo Vadis?
HTTP Adaptive Streaming – Quo Vadis?HTTP Adaptive Streaming – Quo Vadis?
HTTP Adaptive Streaming – Quo Vadis?
 
OPTE: Online Per-title Encoding for Live Video Streaming
OPTE: Online Per-title Encoding for Live Video StreamingOPTE: Online Per-title Encoding for Live Video Streaming
OPTE: Online Per-title Encoding for Live Video Streaming
 
OPTE: Online Per-title Encoding for Live Video Streaming.pdf
OPTE: Online Per-title Encoding for Live Video Streaming.pdfOPTE: Online Per-title Encoding for Live Video Streaming.pdf
OPTE: Online Per-title Encoding for Live Video Streaming.pdf
 
Efficient bitrate ladder construction for live video streaming
Efficient bitrate ladder construction for live video streamingEfficient bitrate ladder construction for live video streaming
Efficient bitrate ladder construction for live video streaming
 
Introduction to Video Compression Techniques - Anurag Jain
Introduction to Video Compression Techniques - Anurag JainIntroduction to Video Compression Techniques - Anurag Jain
Introduction to Video Compression Techniques - Anurag Jain
 
Perceptually-aware Per-title Encoding for Adaptive Video Streaming
Perceptually-aware Per-title Encoding for Adaptive Video StreamingPerceptually-aware Per-title Encoding for Adaptive Video Streaming
Perceptually-aware Per-title Encoding for Adaptive Video Streaming
 
Machine Learning Based Video Coding Enhancements for HTTP Adaptive Streaming
Machine Learning Based Video Coding Enhancements for HTTP Adaptive StreamingMachine Learning Based Video Coding Enhancements for HTTP Adaptive Streaming
Machine Learning Based Video Coding Enhancements for HTTP Adaptive Streaming
 
Energy-efficient Adaptive Video Streaming with Latency-Aware Dynamic Resoluti...
Energy-efficient Adaptive Video Streaming with Latency-Aware Dynamic Resoluti...Energy-efficient Adaptive Video Streaming with Latency-Aware Dynamic Resoluti...
Energy-efficient Adaptive Video Streaming with Latency-Aware Dynamic Resoluti...
 
Perceptually-aware Per-title Encoding for Adaptive Video Streaming.pdf
Perceptually-aware Per-title Encoding for Adaptive Video Streaming.pdfPerceptually-aware Per-title Encoding for Adaptive Video Streaming.pdf
Perceptually-aware Per-title Encoding for Adaptive Video Streaming.pdf
 
MiPSO: Multi-Period Per-Scene Optimization For HTTP Adaptive Streaming
MiPSO: Multi-Period Per-Scene Optimization For HTTP Adaptive StreamingMiPSO: Multi-Period Per-Scene Optimization For HTTP Adaptive Streaming
MiPSO: Multi-Period Per-Scene Optimization For HTTP Adaptive Streaming
 
HTTP Adaptive Streaming – Where Is It Heading?
HTTP Adaptive Streaming – Where Is It Heading?HTTP Adaptive Streaming – Where Is It Heading?
HTTP Adaptive Streaming – Where Is It Heading?
 

More from Vignesh V Menon

Gain of Grain: A Film Grain Handling Toolchain for VVC-based Open Implementat...
Gain of Grain: A Film Grain Handling Toolchain for VVC-based Open Implementat...Gain of Grain: A Film Grain Handling Toolchain for VVC-based Open Implementat...
Gain of Grain: A Film Grain Handling Toolchain for VVC-based Open Implementat...Vignesh V Menon
 
Content_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdf
Content_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdfContent_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdf
Content_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdfVignesh V Menon
 
Green Variable framerate encoding for Adaptive Live Streaming
Green Variable framerate encoding  for Adaptive Live StreamingGreen Variable framerate encoding  for Adaptive Live Streaming
Green Variable framerate encoding for Adaptive Live StreamingVignesh V Menon
 
Doctoral Symposium presentation.pdf
Doctoral Symposium presentation.pdfDoctoral Symposium presentation.pdf
Doctoral Symposium presentation.pdfVignesh V Menon
 
Research@Lunch_Presentation.pdf
Research@Lunch_Presentation.pdfResearch@Lunch_Presentation.pdf
Research@Lunch_Presentation.pdfVignesh V Menon
 
Video Complexity Dataset (VCD).pdf
Video Complexity Dataset (VCD).pdfVideo Complexity Dataset (VCD).pdf
Video Complexity Dataset (VCD).pdfVignesh V Menon
 
Live-PSTR: Live Per-Title Encoding for Ultra HD Adaptive Streaming
Live-PSTR: Live Per-Title Encoding for Ultra HD Adaptive StreamingLive-PSTR: Live Per-Title Encoding for Ultra HD Adaptive Streaming
Live-PSTR: Live Per-Title Encoding for Ultra HD Adaptive StreamingVignesh V Menon
 
IEEE PCS'21: Efficient multi-encoding for large-scale HTTP Adaptive Streaming...
IEEE PCS'21: Efficient multi-encoding for large-scale HTTP Adaptive Streaming...IEEE PCS'21: Efficient multi-encoding for large-scale HTTP Adaptive Streaming...
IEEE PCS'21: Efficient multi-encoding for large-scale HTTP Adaptive Streaming...Vignesh V Menon
 
IEEE ICIP'22:Efficient Content-Adaptive Feature-based Shot Detection for HTTP...
IEEE ICIP'22:Efficient Content-Adaptive Feature-based Shot Detection for HTTP...IEEE ICIP'22:Efficient Content-Adaptive Feature-based Shot Detection for HTTP...
IEEE ICIP'22:Efficient Content-Adaptive Feature-based Shot Detection for HTTP...Vignesh V Menon
 

More from Vignesh V Menon (10)

Gain of Grain: A Film Grain Handling Toolchain for VVC-based Open Implementat...
Gain of Grain: A Film Grain Handling Toolchain for VVC-based Open Implementat...Gain of Grain: A Film Grain Handling Toolchain for VVC-based Open Implementat...
Gain of Grain: A Film Grain Handling Toolchain for VVC-based Open Implementat...
 
Content_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdf
Content_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdfContent_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdf
Content_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdf
 
Green Variable framerate encoding for Adaptive Live Streaming
Green Variable framerate encoding  for Adaptive Live StreamingGreen Variable framerate encoding  for Adaptive Live Streaming
Green Variable framerate encoding for Adaptive Live Streaming
 
JASLA_presentation.pdf
JASLA_presentation.pdfJASLA_presentation.pdf
JASLA_presentation.pdf
 
Doctoral Symposium presentation.pdf
Doctoral Symposium presentation.pdfDoctoral Symposium presentation.pdf
Doctoral Symposium presentation.pdf
 
Research@Lunch_Presentation.pdf
Research@Lunch_Presentation.pdfResearch@Lunch_Presentation.pdf
Research@Lunch_Presentation.pdf
 
Video Complexity Dataset (VCD).pdf
Video Complexity Dataset (VCD).pdfVideo Complexity Dataset (VCD).pdf
Video Complexity Dataset (VCD).pdf
 
Live-PSTR: Live Per-Title Encoding for Ultra HD Adaptive Streaming
Live-PSTR: Live Per-Title Encoding for Ultra HD Adaptive StreamingLive-PSTR: Live Per-Title Encoding for Ultra HD Adaptive Streaming
Live-PSTR: Live Per-Title Encoding for Ultra HD Adaptive Streaming
 
IEEE PCS'21: Efficient multi-encoding for large-scale HTTP Adaptive Streaming...
IEEE PCS'21: Efficient multi-encoding for large-scale HTTP Adaptive Streaming...IEEE PCS'21: Efficient multi-encoding for large-scale HTTP Adaptive Streaming...
IEEE PCS'21: Efficient multi-encoding for large-scale HTTP Adaptive Streaming...
 
IEEE ICIP'22:Efficient Content-Adaptive Feature-based Shot Detection for HTTP...
IEEE ICIP'22:Efficient Content-Adaptive Feature-based Shot Detection for HTTP...IEEE ICIP'22:Efficient Content-Adaptive Feature-based Shot Detection for HTTP...
IEEE ICIP'22:Efficient Content-Adaptive Feature-based Shot Detection for HTTP...
 

Recently uploaded

microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfUmakantAnnand
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
PSYCHIATRIC History collection FORMAT.pptx
PSYCHIATRIC   History collection FORMAT.pptxPSYCHIATRIC   History collection FORMAT.pptx
PSYCHIATRIC History collection FORMAT.pptxPoojaSen20
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991RKavithamani
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Celine George
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 

Recently uploaded (20)

microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.Compdf
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
PSYCHIATRIC History collection FORMAT.pptx
PSYCHIATRIC   History collection FORMAT.pptxPSYCHIATRIC   History collection FORMAT.pptx
PSYCHIATRIC History collection FORMAT.pptx
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 

Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming

  • 1. Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming Vignesh V Menon1, Hadi Amirpour1, Prajit T Rajendran2, Mohammad Ghanbari1,3, and Christian Timmerer1 1 Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität, Klagenfurt, Austria 2 Universite Paris-Saclay, CEA, List, F-91120, Palaiseau, France 3 School of Computer Science and Electronic Engineering, University of Essex, UK 9 December 2022 Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 1
  • 2. Outline 1 Introduction 2 Research Problem 3 Content-Adaptive Encoder Preset Prediction Scheme (CAPS) 4 Evaluation 5 Conclusions Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 2
  • 3. Introduction Introduction HTTP Adaptive Streaming (HAS) HTTP Adaptive Streaming (HAS)1 has become the de-facto standard in delivering video content for various clients regarding internet speeds and device types. Traditionally, a fixed bitrate ladder, e.g., HTTP Live Streaming (HLS) bitrate ladder2, is used in live streaming. 1 A. Bentaleb et al. “A Survey on Bitrate Adaptation Schemes for Streaming Media Over HTTP”. In: IEEE Communications Surveys Tutorials 21.1 (2019), pp. 562–585. doi: 10.1109/COMST.2018.2862938. 2 https://developer.apple.com/documentation/http live streaming/ hls authoring specification for apple devices, last access: Nov 30, 2022. Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 3
  • 4. Introduction Introduction Live encoding in HAS Figure: Encoding time of HLS bitrate ladder2 representations of the Wood s000 sequence (5 second duration, 24fps) of VCD dataset using ultrafast preset of x265 and 8 CPU threads. For every representation, maintaining a fixed encoding speed, which is the same as the video framerate, independent of the video content, is a key goal for a live encoder. Reduction in encoding speed may lead to the unacceptable outcome of dropped frames during transmission, eventually decreasing the Quality of Experience (QoE).a Increase in encoding speed leads to more CPU idle time! a Pradeep Ramachandran et al. “Content Adaptive Live Encoding with Open Source Codecs”. In: Proceedings of the 11th ACM Multimedia Systems Conference. MMSys ’20. Istanbul, Turkey: Association for Computing Machinery, 2020, 345–348. isbn: 9781450368452. doi: 10.1145/3339825.3393580. url: https://doi.org/10.1145/3339825.3393580. Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 4
  • 5. Introduction Introduction Live encoding in HAS The preset for the fastest encoding (ultrafast for x2643 and x2654) is used as the encoding preset for all live content, independent of the dynamic complexity of the content. The resulting encode is sub-optimal, especially when the type of the content is dynamically changing, which is the typical use-case for live streams.5 When the content becomes easier to encode, the encoder would achieve a higher encoding speed than the target encoding speed. This, in turn, introduces unnecessary CPU idle time as it waits for the video feed. 3 https://www.videolan.org/developers/x264.html, last access: Nov 30, 2022. 4 https://www.videolan.org/developers/x265.html, last access: Nov 30, 2022. 5 Qingxiong Huangyuan et al. “Performance evaluation of H.265/MPEG-HEVC encoders for 4K video sequences”. In: Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific. 2014, pp. 1–8. doi: 10.1109/APSIPA.2014.7041782. Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 5
  • 6. Research Problem Research Problem For easy-to-encode content, encoder preset need to be configured such that encoding speed can be reduced while still being compatible with the expected live encoding speed, improving the quality of the encoded content. When the content becomes complex again, the encoder preset need to be reconfigured to move back to the faster configuration that achieves live encoding speed.6 This paper targets an encoding scheme that determines the encoding preset configuration dy- namically, which: is adaptive to the video content. maximizes the CPU utilization for a given target encoding speed. maximizes the compression efficiency. 6 Sergey Zvezdakov, Denis Kondranin, and Dmitriy Vatolin. “Machine-Learning-Based Method for Content-Adaptive Video Encoding”. In: 2021 Picture Coding Symposium (PCS). 2021, pp. 1–5. doi: 10.1109/PCS50896.2021.9477507. Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 6
  • 7. Content-Adaptive Encoder Preset Prediction Scheme (CAPS) Content-Adaptive encoder Preset prediction Scheme (CAPS) Bitrate Ladder Video Segment Video Complexity Feature Extraction E h L Bitrate set (B) Resolution set (R) Target speed Apriori information (e.g., codec, CPU threads) Encoding Preset Prediction Encoder Encoder Encoder Encoder … rep1 rep2 repN repN-1 Figure: The encoding pipeline using CAPS envisioned in this paper. Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 7
  • 8. Content-Adaptive Encoder Preset Prediction Scheme (CAPS) Phase 1: Video Complexity Feature Extraction CAPS Phase 1: Video Complexity Feature Extraction Compute texture energy per block A DCT-based energy function is used to determine the block-wise feature of each frame defined as: Hk = w−1 X i=0 w−1 X j=0 e|( ij wh )2−1| |DCT(i, j)| (1) where wxw is the size of the block, and DCT(i, j) is the (i, j)th DCT component when i + j > 0, and 0 otherwise. The energy values of blocks in a frame are averaged to determine the energy per frame.7,8 Es = K−1 X k=0 Hs,k K · w2 (2) 7 Michael King, Zinovi Tauber, and Ze-Nian Li. “A New Energy Function for Segmentation and Compression”. In: 2007 IEEE International Conference on Multimedia and Expo. 2007, pp. 1647–1650. doi: 10.1109/ICME.2007.4284983. 8 Vignesh V Menon et al. “Efficient Content-Adaptive Feature-Based Shot Detection for HTTP Adaptive Streaming”. In: 2021 IEEE International Conference on Image Processing (ICIP). 2021, pp. 2174–2178. doi: 10.1109/ICIP42928.2021.9506092. Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 8
  • 9. Content-Adaptive Encoder Preset Prediction Scheme (CAPS) Phase 1: Video Complexity Feature Extraction CAPS Phase 1: Video Complexity Feature Extraction hs: SAD of the block level energy values of frame s to that of the previous frame s − 1. hs = K−1 X k=0 | Hs,k, Hs−1,k | K · w2 (3) where K denotes the number of blocks in frame s. The luminescence of non-overlapping blocks k of sth frame is defined as: Ls,k = p DCT(0, 0) (4) The block-wise luminescence is averaged per frame denoted as Ls as shown below. Ls = K−1 X k=0 Ls,k K · w2 (5) Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 9
  • 10. Content-Adaptive Encoder Preset Prediction Scheme (CAPS) Phase 2: Encoding Preset Prediction CAPS Phase 2: Encoding Preset Prediction E h L Model set log(r) log(b) ̂ 𝑡!!"# . . . ̂ 𝑡!!$% Preset selection ̂ 𝑝 Figure: Encoding Preset Prediction architecture. Model Set: models to predict the encoding time for each preset. Preset selection: select the optimized preset that yields the target encoding time. Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 10
  • 11. Content-Adaptive Encoder Preset Prediction Scheme (CAPS) Phase 2: Encoding Preset Prediction CAPS Phase 2: Encoding Preset Prediction Model Set Random Forest (RF) models are trained to predict the encoding times for the pre-defined set of encoding presets (P). The minimum and maximum encoder preset (pmin and pmax , respectively) are chosen based on the target encoder. For example, x265 HEVC9 encoder supports encoding presets ranging from 0 to 9 (i.e., ultrafast to placebo). The model set predicts the encoding times for each of the presets in P as t̂pmin to t̂pmax . 9 G. J. Sullivan et al. “Overview of the high efficiency video coding (HEVC) standard”. In: IEEE Transactions on circuits and systems for video technology 22.12 (2012), pp. 1649–1668. Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 11
  • 12. Content-Adaptive Encoder Preset Prediction Scheme (CAPS) Phase 2: Encoding Preset Prediction CAPS Phase 2: Encoding Preset Prediction Preset Selection The preset is selected, which is closest to the target encoding time T, which is defined as: T = n f (6) where n represents the number of frames in the segment. The function ensures that the encoding time is not greater than the target encoding time T. t̂ = argminp | T − tp | c.t. p ∈ [pmin, pmax ]; t̂ ≤ T (7) where p is the selected optimum preset and t̂ is the encoding time for p. Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 12
  • 13. Evaluation Test Methodology Test Methodology Dataset: Video Complexity Dataset (VCD)10 (500 Ultra HD sequences, 5 s duration, 24 fps). Table: Encoder Settings. Encoder x265 v3.5 Target encoding speed/time 24 fps/ 5 seconds CPU threads 8 Ratecontrol CBR Table: Representations considered in this paper2 . Representation ID 01 02 03 04 05 06 07 08 09 10 11 12 r (height in pixels) 360 432 540 540 540 720 720 1080 1080 1440 2160 2160 b (in Mbps) 0.145 0.300 0.600 0.900 1.600 2.400 3.400 4.500 5.800 8.100 11.600 16.800 E, h, and L features are extracted from the video segments using VCA11 run in eight CPU threads. 10 Hadi Amirpour et al. “VCD: Video Complexity Dataset”. In: Proceedings of the 13th ACM Multimedia Systems Conference. MMSys ’22. Athlone, Ireland: Association for Computing Machinery, 2022, 234–239. isbn: 9781450392839. doi: 10.1145/3524273.3532892. url: https://doi.org/10.1145/3524273.3532892. 11 Vignesh V Menon et al. “VCA: Video Complexity Analyzer”. In: Proceedings of the 13th ACM Multimedia Systems Conference. MMSys ’22. Athlone, Ireland: Association for Computing Machinery, 2022, 259–264. isbn: 9781450392839. doi: 10.1145/3524273.3532896. url: https://doi.org/10.1145/3524273.3532896. Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 13
  • 14. Evaluation Experimental Results Experimental Results Figure: Average relative importance of features in the encoding time prediction of 2160p en- coding for all presets. Preset 0 1 2 3 4 5 6 7 8 R2 0.97 0.96 0.97 0.97 0.97 0.96 0.97 0.98 0.98 MAE 0.06 0.09 0.12 0.17 0.22 0.31 0.33 0.41 0.49 Table: Average encoding time prediction performance results of every preset considered for experimental vali- dation. (p = 0: ultrafast) Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 14
  • 15. Evaluation Experimental Results Experimental Results Figure: Average preset chosen for each representation in the HLS bitrate ladder. On average, representation 01 (0.145 Mbps) chooses slow preset (p = 6). On average, representations 11 (11.6 Mbps) and 12 (16.8 Mbps) choose ultrafast preset (p = 0). Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 15
  • 16. Evaluation Experimental Results Experimental Results (a) (b) (c) Figure: (a) Average encoding time, (b) Average PSNR, and (c) Average VMAF for each representation. Using ultrafast preset for all representations introduces significant CPU idle time for lower bitrate representations. However, CAPS yield lower CPU idle time when the encodings are carried out concurrently. Using CAPS, visual quality improves significantly at lower bitrate representations. CAPS yields an overall BD-VMAF of 3.81 and BD-PSNR of 0.83 dB. Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 16
  • 17. Conclusions Conclusions This paper proposed CAPS, a content-adaptive encoder preset prediction scheme for adap- tive live streaming applications. CAPS predicts the optimized encoder preset for a given target bitrate, and resolution for each segment, which helps improve the compression-efficiency of video encodings. DCT-energy-based features are used to determine segments’ spatial and temporal complex- ity. CAPS yield lower idle time and an overall quality improvement of 0.83dB PSNR and 3.81 VMAF score with the same bitrate, compared to the fastest preset (ultrafast) x265 encoding of the reference HLS bitrate ladder. Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 17
  • 18. Q & A Q & A Thank you for your attention! Vignesh V Menon (vignesh.menon@aau.at) VCA Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 18