Abstract: Current per-title encoding schemes encode the same video content at various bitrates and spatial resolutions to find an optimized bitrate ladder for each video content in Video on
Demand (VoD) applications. However, in live streaming applications, a bitrate ladder with fixed bitrate-resolution pairs
is used to avoid the additional latency caused to find optimum
bitrate-resolution pairs for every video content. This paper introduces an online per-title encoding scheme (OPTE) for live
video streaming applications. In this scheme, each target bitrate’s optimal resolution is predicted from any pre-defined
set of resolutions using Discrete Cosine Transform (DCT)-
energy-based low-complexity spatial and temporal features
for each video segment. Experimental results show that, on average, OPTE yields bitrate savings of 20.45% and 28.45%
to maintain the same PSNR and VMAF, respectively, compared to a fixed bitrate ladder scheme (as adopted in current live streaming deployments) without any noticeable additional latency in streaming.
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
OPTE: Online Per-title Encoding for Live Video Streaming.pdf
1. OPTE: Online Per-title Encoding for Live Video Streaming
Vignesh V Menon, Hadi Amirpour, Mohammad Ghanbari, Christian Timmerer
Christian Doppler Laboratory ATHENA, Institute of Information Technology (ITEC), University of Klagenfurt, Austria
2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
24 May 2022
Vignesh V Menon OPTE: Online Per-title Encoding for Live Video Streaming 1
2. Outline
1 Background
2 Per-title Encoding
3 OPTE
4 Evaluation
5 Conclusions
Vignesh V Menon OPTE: Online Per-title Encoding for Live Video Streaming 2
4. Background
Background
HTTP Adaptive Streaming (HAS)1
Why Adaptive Streaming?
Adapt for a wide range of devices.
Adapt for a broad set of Internet speeds.
What HAS does?
Each source video is split into segments.
Encoded at multiple bitrates, resolutions, and codecs.
Delivered to the client based on the device capability, network speed etc.
1
A. Bentaleb et al. “A Survey on Bitrate Adaptation Schemes for Streaming Media Over HTTP”. In: IEEE Communications Surveys Tutorials 21.1 (2019),
pp. 562–585. doi: 10.1109/COMST.2018.2862938.
Vignesh V Menon OPTE: Online Per-title Encoding for Live Video Streaming 4
7. Per-title Encoding
Per-title Encoding
HTTP Adaptive Streaming (HAS) continues to grow and has become the de-facto standard
in recent years for delivering video over the Internet.
In HAS, each video is encoded at a set of bitrate-resolution pairs, referred to as bitrate
ladder. Traditionally, a fixed bitrate ladder, e.g., HTTP Live Streaming (HLS) bitrate
ladder,2 is used for all video contents.
However, due to the vast diversity in video content characteristics and network condi-
tions, the “one-size-fits-all” can be optimized per title to increase the Quality of Experi-
ence (QoE).3
2
Apple Inc. HLS Authoring Specification for Apple Devices.
https://developer.apple.com/documentation/http_live_streaming/hls_authoring_specification_for_apple_devices.
3
J. De Cock et al. “Complexity-based consistent-quality encoding in the cloud”. In: 2016 IEEE International Conference on Image Processing (ICIP). 2016.
doi: 10.1109/ICIP.2016.7532605.
Vignesh V Menon OPTE: Online Per-title Encoding for Live Video Streaming 7
8. Per-title Encoding
Per-title Encoding
Figure: Rate-Distortion (RD) curves using VMAF as the quality metric of Beauty and Golf sequences
of UVG and BVI datasets4,5
encoded at 540p and 1080p resolutions.
4
Alexandre Mercat, Marko Viitanen, and Jarno Vanne. “UVG Dataset: 50/120fps 4K Sequences for Video Codec Analysis and Development”. In: Proceedings
of the 11th ACM Multimedia Systems Conference. New York, NY, USA: Association for Computing Machinery, 2020, 297–302. isbn: 9781450368452. url:
https://doi.org/10.1145/3339825.3394937.
5
Alex Mackin, Fan Zhang, and David R. Bull. “A study of subjective video quality at various frame rates”. In: 2015 IEEE International Conference on Image
Processing (ICIP). 2015, pp. 3407–3411. doi: 10.1109/ICIP.2015.7351436.
Vignesh V Menon OPTE: Online Per-title Encoding for Live Video Streaming 8
9. Per-title Encoding
Per-title Encoding
Determining convex-hull is computationally very expensive, making it suitable for only VoD
streaming applications.
State-of-the-art per-title approaches yield latency much higher than the accepted latency
in live streaming.6,7,8,9
6
Bitmovin. “White Paper: Per Title Encoding”. In: 2018. url: https://bitmovin.com/whitepapers/Bitmovin-Per-Title.pdf.
7
A. V. Katsenou, J. Sole, and D. R. Bull. “Content-gnostic Bitrate Ladder Prediction for Adaptive Video Streaming”. In: 2019 Picture Coding Symposium
(PCS). 2019. doi: 10.1109/PCS48520.2019.8954529.
8
Madhukar Bhat, Jean-Marc Thiesse, and Patrick Le Callet. “Combining Video Quality Metrics To Select Perceptually Accurate Resolution In A Wide Quality
Range: A Case Study”. In: 2021 IEEE International Conference on Image Processing (ICIP). 2021, pp. 2164–2168. doi: 10.1109/ICIP42928.2021.9506310.
9
Madhukar Bhat, Jean-Marc Thiesse, and Patrick Le Callet. “A Case Study of Machine Learning Classifiers for Real-Time Adaptive Resolution Prediction in
Video Coding”. In: 2020 IEEE International Conference on Multimedia and Expo (ICME). 2020, pp. 1–6. doi: 10.1109/ICME46284.2020.9102934.
Vignesh V Menon OPTE: Online Per-title Encoding for Live Video Streaming 9
12. OPTE Phase 1: Feature Extraction
OPTE
Phase 1: Feature Extraction
Compute texture energy per block
A DCT-based energy function is used to determine the block-wise feature of each frame
defined as:
Hk =
w−1
X
i=0
w−1
X
j=0
e|( ij
wh
)2−1|
|DCT(i, j)| (1)
where wxw is the size of the block, and DCT(i, j) is the (i, j)th DCT component when
i + j > 0, and 0 otherwise.
The energy values of blocks in a frame is averaged to determine the energy per frame.10
E =
C−1
X
k=0
Hp,k
C · w2
(2)
10
Michael King, Zinovi Tauber, and Ze-Nian Li. “A New Energy Function for Segmentation and Compression”. In: 2007 IEEE International Conference on
Multimedia and Expo. 2007, pp. 1647–1650. doi: 10.1109/ICME.2007.4284983.
Vignesh V Menon OPTE: Online Per-title Encoding for Live Video Streaming 12
13. OPTE Phase 1: Feature Extraction
Proposed Algorithm
Phase 1: Feature Extraction
hp: SAD of the block level energy values of frame p to that of the previous frame p − 1.
hp =
C−1
X
k=0
SAD(Hp,k, Hp−1,k)
C · w2
(3)
where C denotes the number of blocks in frame p.
Vignesh V Menon OPTE: Online Per-title Encoding for Live Video Streaming 13
14. OPTE Phase 2: Resolution prediction
OPTE
Phase 2: Resolution prediction
Inputs:
f : original framerate
rmax : original spatial resolution
R : set of all resolutions
B : set of all target bitrates
Output: ˆ
r(b) ∀ b ∈ B
Compute E, h features.
for each b ∈ B do
Determine the optimized resolution.
ˆ
r(b) = rmax · (1 − s0e−
ΓMA(rmax ,f )·h·b
E )
Map ˆ
r(b) to its closest value in R.
Please note that ˆ
r is an exponentially decaying (increasing) function.
Vignesh V Menon OPTE: Online Per-title Encoding for Live Video Streaming 14
15. OPTE Phase 2: Resolution prediction
OPTE
Determining ΓMA
The half-life of the ˆ
r function is evaluated, i.e., the bitrate when ˆ
r becomes rmax
2 . ˆ
r is an
exponentially decaying (increasing) function where:
b1
2
=
ln(2)
K
(4)
ΓMA can thus be determined as:
ΓMA =
ln(2) · E
h · b1
2
(5)
ΓMA values obtained for the training sequences of each resolution and framerate is averaged to
determine ΓMA(rmax , f ).
Vignesh V Menon OPTE: Online Per-title Encoding for Live Video Streaming 15
17. Evaluation
Evaluation
E and h values are extracted using VCA open-source software11 at a speed of 370fps for
UHD content.
Table: Results of OPTE against fixed bitrate ladder approach.
Dataset Video f SI TI BDRV BDRP
MCML Bunny 30 23.38 6.43 -39.48% -32.25%
MCML Characters 30 50.43 29.85 -51.90% -68.81%
MCML Crowd 30 33.76 10.13 -29.82% -14.18%
MCML Dolls 30 16.88 19.91 -1.43% -8.49%
SJTU BundNightScape 30 48.82 7.06 -61.22% -60.86%
SJTU Fountains 30 43.37 11.42 -32.93% -8.49%
SJTU TrafficFlow 30 33.57 13.80 -50.54% -40.90%
SJTU TreeShade 30 52.88 5.29 -47.76% -38.55%
VQEG CrowdRun 50 50.77 22.33 -8.50% -1.90%
VQEG DucksTakeOff 50 47.77 15.10 -2.99% -2.79%
VQEG IntoTree 50 24.41 12.09 -26.50% -5.75%
VQEG OldTownCross 50 29.66 11.62 -30.91% -22.53%
VQEG ParkJoy 50 62.78 27.00 -12.08% -2.62%
JVET CatRobot 60 44.45 11.84 -13.43% -5.95%
JVET DaylightRoad2 60 40.51 16.21 -27.52% -9.35%
JVET FoodMarket4 60 38.26 17.68 -18.11% -3.74%
Average -28.45% -20.45%
11
Vignesh V Menon et al. “VCA: Video Complexity Analyzer”. In: Proceedings of the 13th ACM Multimedia Systems Conference. 2022. isbn: 9781450392839.
doi: 10.1145/3524273.3532896. url: https://doi.org/10.1145/3524273.3532896.
Vignesh V Menon OPTE: Online Per-title Encoding for Live Video Streaming 17
18. Evaluation
Evaluation
0 5 10 15
Bitrate (in Mbps)
40
60
80
100
VMAF
Default
OPTE
(a) BundNightScape
0 5 10 15
Bitrate (in Mbps)
40
60
80
100
VMAF
Default
OPTE
(b) Bunny
Figure: Comparison of RD curves for encoding the first segment of BundNightScape and Bunny sequences
using the fixed bitrate ladder and OPTE.
Vignesh V Menon OPTE: Online Per-title Encoding for Live Video Streaming 18
20. Conclusions
Conclusions
Proposed OPTE, an online per-title encoding scheme for live streaming applications.
DCT-energy-based features are used to determine the spatial and temporal complexity of
the video, which is fast and effective.
Predicts optimized resolutions for a set of bitrates defined in a bitrate ladder for every
video, which helps in improving the overall performance (QoE) of UHD video streaming.
Live streaming using OPTE requires 20.45% fewer bits to maintain the same PSNR and
28.45% fewer bits to maintain the same VMAF.
Vignesh V Menon OPTE: Online Per-title Encoding for Live Video Streaming 20
21. Conclusions
Q & A
Thank you for your attention!
Vignesh V Menon (vignesh.menon@aau.at)
Hadi Amirpour (hadi.amirpour@aau.at)
Mohammad Ghanbari (ghan@essex.ac.uk)
Christian Timmerer (christian.timmerer@aau.at)
Vignesh V Menon OPTE: Online Per-title Encoding for Live Video Streaming 21