Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming
Vignesh V Menon (1), Hadi Amirpour (1), Prajit T Rajendran (2), Mohammad Ghanbari (1,3), and Christian Timmerer (1)
(1) Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität, Klagenfurt, Austria
(2) Université Paris-Saclay, CEA, List, F-91120, Palaiseau, France
(3) School of Computer Science and Electronic Engineering, University of Essex, UK
9 December 2022
Vignesh V Menon Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming 1
Outline
1 Introduction
2 Research Problem
3 Content-Adaptive Encoder Preset Prediction Scheme (CAPS)
4 Evaluation
5 Conclusions
Introduction
HTTP Adaptive Streaming (HAS)
HTTP Adaptive Streaming (HAS) [1] has become the de facto standard for delivering video content to clients with varying internet speeds and device types.
Traditionally, live streaming uses a fixed bitrate ladder, e.g., the HTTP Live Streaming (HLS) bitrate ladder [2].
[1] A. Bentaleb et al. “A Survey on Bitrate Adaptation Schemes for Streaming Media Over HTTP”. In: IEEE Communications Surveys & Tutorials 21.1 (2019), pp. 562–585. doi: 10.1109/COMST.2018.2862938.
[2] https://developer.apple.com/documentation/http_live_streaming/hls_authoring_specification_for_apple_devices, last access: Nov 30, 2022.
Live encoding in HAS
Figure: Encoding time of the HLS bitrate ladder [2] representations of the Wood_s000 sequence (5 s duration, 24 fps) of the VCD dataset, using the ultrafast preset of x265 and 8 CPU threads.
For every representation, a key goal of a live encoder is to maintain a fixed encoding speed equal to the video framerate, independent of the video content.
A drop in encoding speed below the framerate may lead to dropped frames during transmission, which decreases the Quality of Experience (QoE) [a].
An encoding speed above the framerate leads to more CPU idle time.
[a] Pradeep Ramachandran et al. “Content Adaptive Live Encoding with Open Source Codecs”. In: Proceedings of the 11th ACM Multimedia Systems Conference. MMSys ’20. Istanbul, Turkey: Association for Computing Machinery, 2020, pp. 345–348. isbn: 9781450368452. doi: 10.1145/3339825.3393580. url: https://doi.org/10.1145/3339825.3393580.
The fastest encoding preset (ultrafast for x264 [3] and x265 [4]) is used for all live content, independent of the dynamic complexity of the content.
The resulting encode is sub-optimal, especially when the type of content changes dynamically, which is the typical use case for live streams [5].
When the content becomes easier to encode, the encoder achieves a higher encoding speed than the target encoding speed. This, in turn, introduces unnecessary CPU idle time as the encoder waits for the video feed.
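To make the idle-time argument concrete, the following minimal sketch compares one segment's encoding time with its live budget. The function name `live_budget` and the example timings are hypothetical and chosen for illustration only; only the relation between frame count, framerate, and encoding time is taken from the slides.

```python
def live_budget(num_frames: int, target_fps: float, encoding_time_s: float) -> dict:
    """Compare one segment's encoding time against its live budget."""
    if num_frames <= 0 or target_fps <= 0 or encoding_time_s <= 0:
        raise ValueError("all arguments must be positive")
    budget_s = num_frames / target_fps             # time available to stay live
    speed_fps = num_frames / encoding_time_s       # achieved encoding speed
    idle_s = max(0.0, budget_s - encoding_time_s)  # time spent waiting for the feed
    return {
        "budget_s": budget_s,
        "speed_fps": speed_fps,
        "idle_s": idle_s,
        "meets_live": encoding_time_s <= budget_s,
    }


# Hypothetical timings for a 5 s, 24 fps segment (120 frames).
easy = live_budget(120, 24.0, 2.0)  # faster than live: the CPU sits idle
hard = live_budget(120, 24.0, 6.0)  # slower than live: frames may be dropped
```

With these made-up numbers, the easy segment leaves 3 s of the 5 s budget unused, which is the headroom a slower preset could spend on quality.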
[3] https://www.videolan.org/developers/x264.html, last access: Nov 30, 2022.
[4] https://www.videolan.org/developers/x265.html, last access: Nov 30, 2022.
[5] Qingxiong Huangyuan et al. “Performance evaluation of H.265/MPEG-HEVC encoders for 4K video sequences”. In: Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific. 2014, pp. 1–8. doi: 10.1109/APSIPA.2014.7041782.
Research Problem
For easy-to-encode content, the encoder preset needs to be configured such that the encoding speed is reduced while still meeting the expected live encoding speed, which improves the quality of the encoded content.
When the content becomes complex again, the encoder preset needs to be reconfigured to a faster configuration that achieves the live encoding speed [6].
This paper targets an encoding scheme that determines the encoding preset configuration dynamically, which:
is adaptive to the video content.
maximizes the CPU utilization for a given target encoding speed.
maximizes the compression efficiency.
[6] Sergey Zvezdakov, Denis Kondranin, and Dmitriy Vatolin. “Machine-Learning-Based Method for Content-Adaptive Video Encoding”. In: 2021 Picture Coding Symposium (PCS). 2021, pp. 1–5. doi: 10.1109/PCS50896.2021.9477507.
Content-Adaptive encoder Preset prediction Scheme (CAPS)
Figure: The encoding pipeline using CAPS envisioned in this paper. Video complexity feature extraction computes E, h, and L for each video segment; encoding preset prediction combines these features with the bitrate ladder (bitrate set B, resolution set R), the target speed, and a priori information (e.g., codec, CPU threads) to choose the preset for each encoder, yielding the representations rep1 to repN.
CAPS
Phase 1: Video Complexity Feature Extraction
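Compute texture energy per block
A DCT-based energy function is used to determine the block-wise feature of each frame, defined as:
$$H_{s,k} = \sum_{i=0}^{w-1} \sum_{j=0}^{w-1} e^{\left|\left(\frac{ij}{wh}\right)^2 - 1\right|} \, |DCT(i, j)| \quad (1)$$
where k is the index of the block within frame s, w × w is the size of the block (the product wh in the exponent is the block area, i.e., w² for a square block), and DCT(i, j) is the (i, j)th DCT component when i + j > 0, and 0 otherwise.
The energy values of the blocks in a frame are averaged to determine the energy per frame [7, 8]:
$$E_s = \sum_{k=0}^{K-1} \frac{H_{s,k}}{K \cdot w^2} \quad (2)$$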
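The block energy of Eq. (1) and the frame average of Eq. (2) can be sketched in plain Python as follows. This is an illustrative sketch, not the VCA implementation: the orthonormal DCT-II normalization, the square-block assumption (so the exponent uses w²), and the function names `dct2`, `block_energy`, and `frame_energy` are assumptions made here.

```python
import math
from typing import List

Block = List[List[float]]


def dct2(block: Block) -> Block:
    """Orthonormal 2D DCT-II of a square w x w block (normalization is an assumption)."""
    w = len(block)

    def alpha(u: int) -> float:
        return math.sqrt(1.0 / w) if u == 0 else math.sqrt(2.0 / w)

    out = [[0.0] * w for _ in range(w)]
    for u in range(w):
        for v in range(w):
            acc = 0.0
            for x in range(w):
                for y in range(w):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * w))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * w)))
            out[u][v] = alpha(u) * alpha(v) * acc
    return out


def block_energy(block: Block) -> float:
    """Eq. (1): exponentially weighted sum of absolute AC coefficients."""
    w = len(block)
    coeffs = dct2(block)
    total = 0.0
    for i in range(w):
        for j in range(w):
            if i + j == 0:
                continue  # the DC component counts as 0
            weight = math.exp(abs((i * j / (w * w)) ** 2 - 1.0))
            total += weight * abs(coeffs[i][j])
    return total


def frame_energy(blocks: List[Block]) -> float:
    """Eq. (2): block energies summed and divided by K * w^2."""
    K = len(blocks)
    w = len(blocks[0])
    return sum(block_energy(b) for b in blocks) / (K * w * w)


flat = [[128.0] * 4 for _ in range(4)]  # no texture
checker = [[255.0 if (x + y) % 2 else 0.0 for y in range(4)] for x in range(4)]
```

A flat block has (numerically) zero energy because all of its AC coefficients vanish, while the checkerboard block yields a clearly positive value.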
[7] Michael King, Zinovi Tauber, and Ze-Nian Li. “A New Energy Function for Segmentation and Compression”. In: 2007 IEEE International Conference on Multimedia and Expo. 2007, pp. 1647–1650. doi: 10.1109/ICME.2007.4284983.
[8] Vignesh V Menon et al. “Efficient Content-Adaptive Feature-Based Shot Detection for HTTP Adaptive Streaming”. In: 2021 IEEE International Conference on Image Processing (ICIP). 2021, pp. 2174–2178. doi: 10.1109/ICIP42928.2021.9506092.
h_s: the sum of absolute differences (SAD) between the block-level energy values of frame s and those of the previous frame s − 1:
$$h_s = \sum_{k=0}^{K-1} \frac{|H_{s,k} - H_{s-1,k}|}{K \cdot w^2} \quad (3)$$
where K denotes the number of blocks in frame s.
The luminance of the non-overlapping block k of the sth frame is defined as:
$$L_{s,k} = \sqrt{DCT(0, 0)} \quad (4)$$
The block-wise luminance is averaged per frame, denoted as L_s:
$$L_s = \sum_{k=0}^{K-1} \frac{L_{s,k}}{K \cdot w^2} \quad (5)$$
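Given per-block energies and DC coefficients, Eq. (3) and Eq. (5) reduce to short averages. The sketch below assumes the block energies H and the DC coefficients have already been computed; the function names and the input values are hypothetical.

```python
import math
from typing import Sequence


def temporal_energy(H_curr: Sequence[float], H_prev: Sequence[float], w: int) -> float:
    """Eq. (3): SAD of block energies of frames s and s-1, divided by K * w^2."""
    if len(H_curr) != len(H_prev) or not H_curr:
        raise ValueError("both frames need the same non-zero number of blocks")
    K = len(H_curr)
    return sum(abs(a - b) for a, b in zip(H_curr, H_prev)) / (K * w * w)


def frame_luminance(dc_coeffs: Sequence[float], w: int) -> float:
    """Eq. (4)-(5): sqrt of each block's DC coefficient, divided by K * w^2."""
    if not dc_coeffs:
        raise ValueError("at least one block is required")
    K = len(dc_coeffs)
    # Assumes non-negative DC coefficients, as for non-negative pixel values.
    return sum(math.sqrt(dc) for dc in dc_coeffs) / (K * w * w)


# Made-up values for a frame with K = 2 blocks of size 2 x 2.
h_static = temporal_energy([10.0, 20.0], [10.0, 20.0], w=2)  # identical frames
h_moving = temporal_energy([16.0, 0.0], [0.0, 16.0], w=2)
L_frame = frame_luminance([16.0, 64.0], w=2)
```

Identical consecutive frames give h = 0, so h grows only when the block energies change over time.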
Phase 2: Encoding Preset Prediction
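Figure: Encoding Preset Prediction architecture. The features E, h, and L, together with log(r) and log(b), are fed to the model set, which outputs the predicted encoding times t̂_pmin, ..., t̂_pmax; preset selection then outputs the chosen preset p̂.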
Model Set: models to predict the encoding time for each preset.
Preset selection: select the optimized preset that yields the target encoding time.
Model Set
Random Forest (RF) models are trained to predict the encoding times for the pre-defined set of encoding presets (P).
The minimum and maximum encoder presets (p_min and p_max, respectively) are chosen based on the target encoder. For example, the x265 HEVC [9] encoder supports encoding presets ranging from 0 to 9 (i.e., ultrafast to placebo).
The model set predicts the encoding times for each of the presets in P as t̂_pmin to t̂_pmax.
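The interface of such a model set can be sketched as a mapping from preset index to a predictor. This is only a sketch of the data flow: it assumes one model per preset, natural logarithms for log(r) and log(b), and it uses a made-up placeholder rule instead of trained Random Forest models. All function names and feature values are hypothetical.

```python
import math
from typing import Callable, Dict, List

Model = Callable[[List[float]], float]


def build_features(E: float, h: float, L: float, r: int, b: float) -> List[float]:
    """Assemble the model input [E, h, L, log(r), log(b)] (natural log assumed)."""
    return [E, h, L, math.log(r), math.log(b)]


def predict_times(model_set: Dict[int, Model], features: List[float]) -> Dict[int, float]:
    """Run every per-preset model and return {preset: predicted encoding time}."""
    return {p: model_set[p](features) for p in sorted(model_set)}


def make_placeholder_model(preset: int) -> Model:
    """Hypothetical stand-in for a trained model; NOT a Random Forest."""
    def model(features: List[float]) -> float:
        # Made-up rule: slower presets (larger index) take longer.
        return 0.5 * (preset + 1)
    return model


# x265 presets 0 (ultrafast) to 9 (placebo), as stated in the text.
model_set = {p: make_placeholder_model(p) for p in range(0, 10)}
features = build_features(E=50.0, h=10.0, L=100.0, r=2160, b=16.8)  # made-up E, h, L
times = predict_times(model_set, features)
```

In a real setup, each placeholder would be replaced by a regressor trained on measured encoding times for that preset.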
[9] G. J. Sullivan et al. “Overview of the High Efficiency Video Coding (HEVC) Standard”. In: IEEE Transactions on Circuits and Systems for Video Technology 22.12 (2012), pp. 1649–1668.
Preset Selection
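The selected preset is the one whose predicted encoding time is closest to the target encoding time T, which is defined as:
$$T = \frac{n}{f} \quad (6)$$
where n represents the number of frames in the segment and f is the target encoding speed in frames per second. For example, a 5 s segment at 24 fps has n = 120 frames, so T = 120/24 = 5 s.
The selection function ensures that the predicted encoding time is not greater than the target encoding time T:
$$\hat{p} = \arg\min_{p} |T - \hat{t}_p| \quad \text{s.t.} \quad p \in [p_{min}, p_{max}],\ \hat{t}_p \le T \quad (7)$$
where p̂ is the selected optimum preset and t̂_p is the predicted encoding time for preset p.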
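The selection rule of Eq. (6) and Eq. (7) can be sketched as a constrained minimum over the predicted times. The fallback to the fastest preset when no preset meets the target and the tie-breaking rule are assumptions made here, not taken from the slides; the predicted times in the example are made up.

```python
from typing import Dict


def target_encoding_time(num_frames: int, fps: float) -> float:
    """Eq. (6): T = n / f."""
    return num_frames / fps


def select_preset(predicted: Dict[int, float], T: float, p_min: int, p_max: int) -> int:
    """Eq. (7): preset in [p_min, p_max] with predicted time closest to T, not above T."""
    feasible = {p: t for p, t in predicted.items() if p_min <= p <= p_max and t <= T}
    if not feasible:
        # Assumption (not from the source): fall back to the fastest preset.
        return p_min
    # Closest to T from below; ties go to the larger preset index (an assumption).
    return min(feasible, key=lambda p: (T - feasible[p], -p))


T = target_encoding_time(120, 24.0)  # 5 s segment at 24 fps
predicted = {0: 1.0, 1: 1.5, 2: 2.4, 3: 3.3, 4: 4.6, 5: 5.2, 6: 7.0}  # made-up seconds
chosen = select_preset(predicted, T, p_min=0, p_max=6)
```

Here preset 4 is chosen: preset 5 would be closer to T in absolute terms, but its predicted time exceeds the 5 s budget and is therefore excluded by the constraint.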
Test Methodology
Dataset: Video Complexity Dataset (VCD) [10] (500 Ultra HD sequences, 5 s duration, 24 fps).
Table: Encoder settings.
Encoder                        x265 v3.5
Target encoding speed / time   24 fps / 5 seconds
CPU threads                    8
Rate control                   CBR
Table: Representations considered in this paper [2].
Representation ID     01      02      03      04      05      06      07      08      09      10      11      12
r (height in pixels)  360     432     540     540     540     720     720     1080    1080    1440    2160    2160
b (in Mbps)           0.145   0.300   0.600   0.900   1.600   2.400   3.400   4.500   5.800   8.100   11.600  16.800
The E, h, and L features are extracted from the video segments using VCA [11], run with eight CPU threads.
[10] Hadi Amirpour et al. “VCD: Video Complexity Dataset”. In: Proceedings of the 13th ACM Multimedia Systems Conference. MMSys ’22. Athlone, Ireland: Association for Computing Machinery, 2022, pp. 234–239. isbn: 9781450392839. doi: 10.1145/3524273.3532892. url: https://doi.org/10.1145/3524273.3532892.
[11] Vignesh V Menon et al. “VCA: Video Complexity Analyzer”. In: Proceedings of the 13th ACM Multimedia Systems Conference. MMSys ’22. Athlone, Ireland: Association for Computing Machinery, 2022, pp. 259–264. isbn: 9781450392839. doi: 10.1145/3524273.3532896. url: https://doi.org/10.1145/3524273.3532896.
Experimental Results
Figure: Average relative importance of features in the encoding time prediction of 2160p encoding for all presets.
Table: Average encoding time prediction performance results of every preset considered for experimental validation (p = 0: ultrafast).
Preset  0     1     2     3     4     5     6     7     8
R²      0.97  0.96  0.97  0.97  0.97  0.96  0.97  0.98  0.98
MAE     0.06  0.09  0.12  0.17  0.22  0.31  0.33  0.41  0.49
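For reference, the two reported metrics can be computed with their standard definitions as sketched below. Whether the evaluation used exactly these formulas is an assumption; the function names and sample values are hypothetical.

```python
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error between measured and predicted encoding times."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


measured = [1.0, 2.0, 3.0, 4.0]  # made-up encoding times in seconds
perfect = r_squared(measured, measured)       # exact prediction
baseline = r_squared(measured, [2.5] * 4)     # always predicting the mean
```

A perfect predictor scores R² = 1, while always predicting the mean scores R² = 0, which puts the reported values of 0.96 to 0.98 in context.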
Figure: Average preset chosen for each representation in the HLS bitrate ladder.
On average, CAPS chooses the slow preset (p = 6) for representation 01 (0.145 Mbps).
On average, CAPS chooses the ultrafast preset (p = 0) for representations 11 (11.6 Mbps) and 12 (16.8 Mbps).
Figure: (a) Average encoding time, (b) average PSNR, and (c) average VMAF for each representation.
Using the ultrafast preset for all representations introduces significant CPU idle time for the lower-bitrate representations. In contrast, CAPS yields lower CPU idle time when the encodings are carried out concurrently.
Using CAPS, visual quality improves significantly at the lower-bitrate representations. CAPS yields an overall BD-VMAF of 3.81 and a BD-PSNR of 0.83 dB.
Conclusions
This paper proposed CAPS, a content-adaptive encoder preset prediction scheme for adaptive live streaming applications.
CAPS predicts the optimized encoder preset for a given target bitrate and resolution for each segment, which helps improve the compression efficiency of video encodings.
DCT-energy-based features are used to determine the segments’ spatial and temporal complexity.
CAPS yields lower CPU idle time and an overall quality improvement of 0.83 dB PSNR and 3.81 VMAF score at the same bitrate, compared to the fastest preset (ultrafast) x265 encoding of the reference HLS bitrate ladder.
Q & A
Thank you for your attention!
Vignesh V Menon (vignesh.menon@aau.at)
VCA
 
How to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 InventoryHow to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 Inventory
 
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptxBIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
 
Leveraging Generative AI to Drive Nonprofit Innovation
Leveraging Generative AI to Drive Nonprofit InnovationLeveraging Generative AI to Drive Nonprofit Innovation
Leveraging Generative AI to Drive Nonprofit Innovation
 
math operations ued in python and all used
math operations ued in python and all usedmath operations ued in python and all used
math operations ued in python and all used
 
A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
 
HYPERTENSION - SLIDE SHARE PRESENTATION.
HYPERTENSION - SLIDE SHARE PRESENTATION.HYPERTENSION - SLIDE SHARE PRESENTATION.
HYPERTENSION - SLIDE SHARE PRESENTATION.
 
Stack Memory Organization of 8086 Microprocessor
Stack Memory Organization of 8086 MicroprocessorStack Memory Organization of 8086 Microprocessor
Stack Memory Organization of 8086 Microprocessor
 
How to deliver Powerpoint Presentations.pptx
How to deliver Powerpoint  Presentations.pptxHow to deliver Powerpoint  Presentations.pptx
How to deliver Powerpoint Presentations.pptx
 
Lifelines of National Economy chapter for Class 10 STUDY MATERIAL PDF
Lifelines of National Economy chapter for Class 10 STUDY MATERIAL PDFLifelines of National Economy chapter for Class 10 STUDY MATERIAL PDF
Lifelines of National Economy chapter for Class 10 STUDY MATERIAL PDF
 
writing about opinions about Australia the movie
writing about opinions about Australia the moviewriting about opinions about Australia the movie
writing about opinions about Australia the movie
 
A Visual Guide to 1 Samuel | A Tale of Two Hearts
A Visual Guide to 1 Samuel | A Tale of Two HeartsA Visual Guide to 1 Samuel | A Tale of Two Hearts
A Visual Guide to 1 Samuel | A Tale of Two Hearts
 
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
 

CAPS_Presentation.pdf

  • 1. Content-adaptive Encoder Preset Prediction for Adaptive Live Streaming
       Vignesh V Menon¹, Hadi Amirpour¹, Prajit T Rajendran², Mohammad Ghanbari¹,³, and Christian Timmerer¹
       ¹ Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität, Klagenfurt, Austria
       ² Université Paris-Saclay, CEA, List, F-91120, Palaiseau, France
       ³ School of Computer Science and Electronic Engineering, University of Essex, UK
       9 December 2022
  • 2. Outline
       1 Introduction
       2 Research Problem
       3 Content-Adaptive Encoder Preset Prediction Scheme (CAPS)
       4 Evaluation
       5 Conclusions
  • 3. Introduction: HTTP Adaptive Streaming (HAS)
       HTTP Adaptive Streaming (HAS) [1] has become the de facto standard for delivering video content to clients with varying internet speeds and device types.
       Traditionally, a fixed bitrate ladder, e.g., the HTTP Live Streaming (HLS) bitrate ladder [2], is used in live streaming.
       [1] A. Bentaleb et al. “A Survey on Bitrate Adaptation Schemes for Streaming Media Over HTTP”. In: IEEE Communications Surveys & Tutorials 21.1 (2019), pp. 562–585. doi: 10.1109/COMST.2018.2862938.
       [2] https://developer.apple.com/documentation/http_live_streaming/hls_authoring_specification_for_apple_devices, last access: Nov 30, 2022.
  • 4. Introduction: Live encoding in HAS
       Figure: Encoding time of the HLS bitrate ladder [2] representations of the Wood_s000 sequence (5 second duration, 24 fps) of the VCD dataset, using the ultrafast preset of x265 and 8 CPU threads.
       A key goal for a live encoder is to maintain, for every representation, a fixed encoding speed equal to the video framerate, independent of the video content.
       An encoding speed below the target may lead to dropped frames during transmission, which is unacceptable and eventually decreases the Quality of Experience (QoE) [a].
       An encoding speed above the target leads to more CPU idle time.
       [a] Pradeep Ramachandran et al. “Content Adaptive Live Encoding with Open Source Codecs”. In: Proceedings of the 11th ACM Multimedia Systems Conference. MMSys ’20. Istanbul, Turkey: Association for Computing Machinery, 2020, pp. 345–348. isbn: 9781450368452. doi: 10.1145/3339825.3393580. url: https://doi.org/10.1145/3339825.3393580.
  • 5. Introduction: Live encoding in HAS
       The preset for the fastest encoding (ultrafast for x264 [3] and x265 [4]) is used as the encoding preset for all live content, independent of the dynamic complexity of the content.
       The resulting encode is sub-optimal, especially when the type of content changes dynamically, which is the typical use case for live streams [5].
       When the content becomes easier to encode, the encoder achieves a higher encoding speed than the target encoding speed. This, in turn, introduces unnecessary CPU idle time while the encoder waits for the video feed.
       [3] https://www.videolan.org/developers/x264.html, last access: Nov 30, 2022.
       [4] https://www.videolan.org/developers/x265.html, last access: Nov 30, 2022.
       [5] Qingxiong Huangyuan et al. “Performance evaluation of H.265/MPEG-HEVC encoders for 4K video sequences”. In: Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific. 2014, pp. 1–8. doi: 10.1109/APSIPA.2014.7041782.
  • 6. Research Problem
       For easy-to-encode content, the encoder preset needs to be configured so that the encoding speed is reduced while still meeting the expected live encoding speed, which improves the quality of the encoded content.
       When the content becomes complex again, the encoder preset needs to be reconfigured to move back to the faster configuration that achieves live encoding speed [6].
       This paper targets an encoding scheme that determines the encoding preset configuration dynamically, and which:
         - is adaptive to the video content;
         - maximizes the CPU utilization for a given target encoding speed;
         - maximizes the compression efficiency.
       [6] Sergey Zvezdakov, Denis Kondranin, and Dmitriy Vatolin. “Machine-Learning-Based Method for Content-Adaptive Video Encoding”. In: 2021 Picture Coding Symposium (PCS). 2021, pp. 1–5. doi: 10.1109/PCS50896.2021.9477507.
  • 7. Content-Adaptive encoder Preset prediction Scheme (CAPS)
       Figure: The encoding pipeline using CAPS envisioned in this paper. Blocks shown: Video Segment; Video Complexity Feature Extraction (outputs E, h, L); Encoding Preset Prediction (further inputs: bitrate set B and resolution set R of the bitrate ladder, target speed, and a priori information, e.g., codec and CPU threads); one Encoder per representation (rep1, rep2, …, repN-1, repN).
  • 8. CAPS Phase 1: Video Complexity Feature Extraction
       Compute texture energy per block. A DCT-based energy function is used to determine the block-wise texture of each frame, defined as:

       \[ H_k = \sum_{i=0}^{w-1} \sum_{j=0}^{w-1} e^{\left| \left( \frac{ij}{wh} \right)^2 - 1 \right|} \, |DCT(i, j)| \tag{1} \]

       where $w \times w$ is the size of the block, and $DCT(i, j)$ is the $(i, j)$-th DCT component when $i + j > 0$, and 0 otherwise.
       Writing $H_{s,k}$ for the energy of block $k$ in frame $s$, the energy values of the $K$ blocks in a frame are averaged to determine the energy per frame [7, 8]:

       \[ E_s = \sum_{k=0}^{K-1} \frac{H_{s,k}}{K \cdot w^2} \tag{2} \]

       [7] Michael King, Zinovi Tauber, and Ze-Nian Li. “A New Energy Function for Segmentation and Compression”. In: 2007 IEEE International Conference on Multimedia and Expo. 2007, pp. 1647–1650. doi: 10.1109/ICME.2007.4284983.
       [8] Vignesh V Menon et al. “Efficient Content-Adaptive Feature-Based Shot Detection for HTTP Adaptive Streaming”. In: 2021 IEEE International Conference on Image Processing (ICIP). 2021, pp. 2174–2178. doi: 10.1109/ICIP42928.2021.9506092.
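The block and frame energies of Eqs. (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration, not the VCA implementation: it assumes an orthonormal DCT-II, takes h = w in the exponent of Eq. (1) because the block is stated to be w × w, and uses tiny 4 × 4 toy blocks (`flat`, `checker`) that are made up for the example.

```python
import math

def dct2(block):
    """Orthonormal 2D DCT-II of a square block (list of rows). Naive O(w^4)."""
    w = len(block)

    def alpha(u):
        return math.sqrt(1.0 / w) if u == 0 else math.sqrt(2.0 / w)

    out = [[0.0] * w for _ in range(w)]
    for i in range(w):
        for j in range(w):
            acc = 0.0
            for x in range(w):
                for y in range(w):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * i * math.pi / (2 * w))
                            * math.cos((2 * y + 1) * j * math.pi / (2 * w)))
            out[i][j] = alpha(i) * alpha(j) * acc
    return out

def block_energy(block):
    """Eq. (1): weighted sum of absolute DCT coefficients, DC term excluded."""
    w = len(block)
    coeffs = dct2(block)
    energy = 0.0
    for i in range(w):
        for j in range(w):
            if i + j == 0:
                continue  # DCT(0, 0) counts as 0 in Eq. (1)
            # Assumption: h = w, since the block is w x w.
            weight = math.exp(abs((i * j / (w * w)) ** 2 - 1))
            energy += weight * abs(coeffs[i][j])
    return energy

def frame_energy(blocks):
    """Eq. (2): block energies summed and divided by K * w^2."""
    K = len(blocks)
    w = len(blocks[0])
    return sum(block_energy(b) for b in blocks) / (K * w * w)

# Toy blocks (hypothetical data): a flat block and a checkerboard block.
flat = [[128.0] * 4 for _ in range(4)]
checker = [[255.0 if (x + y) % 2 else 0.0 for y in range(4)] for x in range(4)]
```

A flat block has no AC content, so its energy is zero up to rounding error, while the checkerboard block yields a clearly positive value; `frame_energy([flat, checker])` averages the two as in Eq. (2).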
  • 9. CAPS Phase 1: Video Complexity Feature Extraction
       $h_s$: the sum of absolute differences (SAD) between the block-level energy values of frame $s$ and those of the previous frame $s - 1$:

       \[ h_s = \sum_{k=0}^{K-1} \frac{\left| H_{s,k} - H_{s-1,k} \right|}{K \cdot w^2} \tag{3} \]

       where $K$ denotes the number of blocks in frame $s$.
       The luminescence of non-overlapping block $k$ of the $s$-th frame is defined as:

       \[ L_{s,k} = \sqrt{DCT(0, 0)} \tag{4} \]

       The block-wise luminescence is averaged per frame, denoted as $L_s$:

       \[ L_s = \sum_{k=0}^{K-1} \frac{L_{s,k}}{K \cdot w^2} \tag{5} \]
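Given per-block energies and DC coefficients, Eqs. (3) to (5) reduce to simple sums. The sketch below assumes the SAD in Eq. (3) is the sum of per-block absolute differences and that DC coefficients are non-negative (true for non-negative pixel values); the function names and the toy numbers are hypothetical.

```python
import math

def temporal_energy(curr_H, prev_H, w):
    """Eq. (3): SAD of block energies between frames s and s-1, over K * w^2."""
    K = len(curr_H)
    sad = sum(abs(c - p) for c, p in zip(curr_H, prev_H))
    return sad / (K * w * w)

def frame_luminescence(dc_coeffs, w):
    """Eqs. (4)-(5): sqrt of each block's DC coefficient, summed over K * w^2."""
    K = len(dc_coeffs)
    return sum(math.sqrt(dc) for dc in dc_coeffs) / (K * w * w)

# Toy values (made up): two blocks per frame, block size w = 2.
h_s = temporal_energy([12.0, 17.0], [10.0, 20.0], w=2)  # (2 + 3) / (2 * 4)
L_s = frame_luminescence([16.0, 64.0], w=2)             # (4 + 8) / (2 * 4)
```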
  • 10. CAPS Phase 2: Encoding Preset Prediction
        Figure: Encoding Preset Prediction architecture. The model set takes E, h, L, log(r), and log(b) as inputs and outputs the predicted encoding times $\hat{t}_{p_{min}}, \ldots, \hat{t}_{p_{max}}$; preset selection then outputs the preset $\hat{p}$.
        Model set: models that predict the encoding time for each preset.
        Preset selection: selects the optimized preset that yields the target encoding time.
  • 11. CAPS Phase 2: Encoding Preset Prediction (Model Set)
        Random Forest (RF) models are trained to predict the encoding times for the pre-defined set of encoding presets (P).
        The minimum and maximum encoder presets ($p_{min}$ and $p_{max}$, respectively) are chosen based on the target encoder. For example, the x265 HEVC [9] encoder supports encoding presets ranging from 0 to 9 (i.e., ultrafast to placebo).
        The model set predicts the encoding times for each of the presets in P as $\hat{t}_{p_{min}}$ to $\hat{t}_{p_{max}}$.
        [9] G. J. Sullivan et al. “Overview of the high efficiency video coding (HEVC) standard”. In: IEEE Transactions on Circuits and Systems for Video Technology 22.12 (2012), pp. 1649–1668.
  • 12. CAPS Phase 2: Encoding Preset Prediction (Preset Selection)
        The preset whose predicted encoding time is closest to the target encoding time $T$ is selected, where $T$ is defined as:

        \[ T = \frac{n}{f} \tag{6} \]

        where $n$ represents the number of frames in the segment and $f$ is the target encoding speed in frames per second. The selection ensures that the encoding time is not greater than the target encoding time $T$:

        \[ \hat{p} = \arg\min_{p} \left| T - \hat{t}_p \right| \quad \text{s.t.} \quad p \in [p_{min}, p_{max}]; \; \hat{t}_{\hat{p}} \le T \tag{7} \]

        where $\hat{p}$ is the selected optimum preset and $\hat{t}_{\hat{p}}$ is the predicted encoding time for $\hat{p}$.
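The selection rule of Eqs. (6) and (7) can be sketched as follows. The predicted times are made-up numbers standing in for the model-set output, and the fallback to the fastest preset when no preset meets the deadline is an assumption of this sketch, not something the slides specify.

```python
# x265 preset names, indexed 0 (ultrafast) to 9 (placebo).
X265_PRESETS = ["ultrafast", "superfast", "veryfast", "faster", "fast",
                "medium", "slow", "slower", "veryslow", "placebo"]

def select_preset(predicted_times, n_frames, target_fps, p_min=0, p_max=9):
    """Return the preset whose predicted time is closest to T without exceeding it."""
    T = n_frames / target_fps  # Eq. (6)
    feasible = [p for p in range(p_min, p_max + 1) if predicted_times[p] <= T]
    if not feasible:
        return p_min  # assumption: fall back to the fastest preset
    # Among feasible presets, T - t_p >= 0, so this minimizes |T - t_p| as in Eq. (7).
    return min(feasible, key=lambda p: T - predicted_times[p])

# Hypothetical predicted encoding times in seconds for presets 0..9.
times = [1.2, 1.9, 2.6, 3.4, 4.1, 4.8, 5.6, 7.0, 9.5, 20.0]
chosen = select_preset(times, n_frames=120, target_fps=24)
```

With a 120-frame segment and a 24 fps target, T is 5 seconds, so the sketch picks the slowest preset whose predicted time still fits within that budget.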
  • 13. Evaluation: Test Methodology
        Dataset: Video Complexity Dataset (VCD) [10] (500 Ultra HD sequences, 5 s duration, 24 fps).

        Table: Encoder settings.
          Encoder                       x265 v3.5
          Target encoding speed/time    24 fps / 5 seconds
          CPU threads                   8
          Rate control                  CBR

        Table: Representations considered in this paper [2].
          Representation ID    r (height in pixels)    b (in Mbps)
          01                   360                     0.145
          02                   432                     0.300
          03                   540                     0.600
          04                   540                     0.900
          05                   540                     1.600
          06                   720                     2.400
          07                   720                     3.400
          08                   1080                    4.500
          09                   1080                    5.800
          10                   1440                    8.100
          11                   2160                    11.600
          12                   2160                    16.800

        The E, h, and L features are extracted from the video segments using VCA [11] run with eight CPU threads.
        [10] Hadi Amirpour et al. “VCD: Video Complexity Dataset”. In: Proceedings of the 13th ACM Multimedia Systems Conference. MMSys ’22. Athlone, Ireland: Association for Computing Machinery, 2022, pp. 234–239. isbn: 9781450392839. doi: 10.1145/3524273.3532892. url: https://doi.org/10.1145/3524273.3532892.
        [11] Vignesh V Menon et al. “VCA: Video Complexity Analyzer”. In: Proceedings of the 13th ACM Multimedia Systems Conference. MMSys ’22. Athlone, Ireland: Association for Computing Machinery, 2022, pp. 259–264. isbn: 9781450392839. doi: 10.1145/3524273.3532896. url: https://doi.org/10.1145/3524273.3532896.
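As a rough illustration of how this setup feeds the prediction stage, the ladder can be held as plain data and combined with the E, h, and L features into one input vector per representation. The names `HLS_LADDER` and `model_inputs` are hypothetical, the feature values passed in are placeholders, and taking log(b) with b in Mbps is an assumption (the slides do not state the unit used inside the logarithm).

```python
import math

# (height in pixels, bitrate in Mbps) for representations 01..12.
HLS_LADDER = [
    (360, 0.145), (432, 0.300), (540, 0.600), (540, 0.900),
    (540, 1.600), (720, 2.400), (720, 3.400), (1080, 4.500),
    (1080, 5.800), (1440, 8.100), (2160, 11.600), (2160, 16.800),
]

SEGMENT_FRAMES = 5 * 24                      # 5 s segments at 24 fps
TARGET_FPS = 24
TARGET_TIME = SEGMENT_FRAMES / TARGET_FPS    # target encoding time in seconds

def model_inputs(E, h, L, height, mbps):
    """Feature vector [E, h, L, log(r), log(b)] for one representation."""
    return [E, h, L, math.log(height), math.log(mbps)]

# Placeholder feature values for one segment (not measured data).
vectors = [model_inputs(50.0, 10.0, 100.0, r, b) for r, b in HLS_LADDER]
```

Each of the twelve vectors would then be passed to the per-preset encoding time models, and the predicted times compared against `TARGET_TIME`.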
  • 14. Evaluation: Experimental Results
        Figure: Average relative importance of features in the encoding time prediction of 2160p encoding for all presets.

        Table: Average encoding time prediction performance results of every preset considered for experimental validation (p = 0: ultrafast).
          Preset    0       1       2       3       4       5       6       7       8
          R²        0.97    0.96    0.97    0.97    0.97    0.96    0.97    0.98    0.98
          MAE       0.06    0.09    0.12    0.17    0.22    0.31    0.33    0.41    0.49
  • 15. Evaluation: Experimental Results
        Figure: Average preset chosen for each representation in the HLS bitrate ladder.
        On average, representation 01 (0.145 Mbps) chooses the slow preset (p = 6).
        On average, representations 11 (11.6 Mbps) and 12 (16.8 Mbps) choose the ultrafast preset (p = 0).
  • 16. Evaluation: Experimental Results
        Figure: (a) Average encoding time, (b) average PSNR, and (c) average VMAF for each representation.
        Using the ultrafast preset for all representations introduces significant CPU idle time for the lower bitrate representations. CAPS, in contrast, yields lower CPU idle time when the encodings are carried out concurrently.
        Using CAPS, visual quality improves significantly at the lower bitrate representations.
        CAPS yields an overall BD-VMAF of 3.81 and BD-PSNR of 0.83 dB.
  • 17. Conclusions
        This paper proposed CAPS, a content-adaptive encoder preset prediction scheme for adaptive live streaming applications.
        CAPS predicts the optimized encoder preset for a given target bitrate and resolution for each segment, which helps improve the compression efficiency of video encodings.
        DCT-energy-based features are used to determine the segments’ spatial and temporal complexity.
        CAPS yields lower idle time and an overall quality improvement of 0.83 dB PSNR and 3.81 VMAF score at the same bitrate, compared to the fastest preset (ultrafast) x265 encoding of the reference HLS bitrate ladder.
  • 18. Q & A
        Thank you for your attention!
        Vignesh V Menon (vignesh.menon@aau.at)