Vignesh V Menon has been invited to give a talk on "Video Coding for HTTP Adaptive Streaming" at Research@Lunch, a research webinar series run by Humanitarian Technology (HuT) Labs, Amrita Vishwa Vidyapeetham University, India, exclusively for Ph.D. scholars and UG and PG researchers in India. The talk introduces the basics of video codecs and highlights the scope of HAS-related research on video encoding.
1. Video Coding Enhancements for HTTP Adaptive Streaming
Author : Vignesh V Menon
Degree programme : Doctoral Programme in Technical Sciences Informatics
Supervisor : Assoc.-Prof. DI Dr. Christian Timmerer
Date : 14 August 2021
Vignesh V Menon Research @ Lunch 1
2. Outline
1 Biography
2 About CD Lab ATHENA
3 Introduction
4 Research Questions
5 Research Methodology
6 Publications
4. Biography
Current Position:
Research Assistant, Christian Doppler Laboratory ATHENA (AdapTive Streaming over HTTP and Emerging Networked MultimediA Services).
PhD Candidate, Institute of Information Technology (ITEC), Alpen-Adria-Universität Klagenfurt (AAU).
Education:
B.Tech. Electronics and Communication Engineering, Amrita Vishwa Vidyapeetham University (2016).
M.Sc. Information and Network Engineering, KTH Royal Institute of Technology (2020).
Work experience:
Video Software Engineer, MulticoreWare Inc., India, 2016 to 2018.
Video Software Developer, Divideon, Sweden, 2018 to 2020.
Research interests:
Video streaming; image and video compression.
5. About CD Lab ATHENA
Jointly proposed by the Institute of Information Technology (ITEC) at Alpen-Adria-Universität Klagenfurt (AAU) and Bitmovin GmbH.
Addresses current and future research and deployment challenges of HTTP Adaptive Streaming (HAS) and emerging streaming methods.
Researches and develops novel paradigms, approaches, (prototype) tools, and evaluation results for the phases of the media delivery chain:
multimedia content provisioning,
content delivery,
content consumption, and
end-to-end aspects,
with a focus on, but not limited to, HAS.
8. Introduction
Premise for efficient video compression
According to the Cisco Visual Networking Index forecast, video is expected to make up 82% of global Internet traffic by 2022 [1].
During the COVID-19 pandemic, many theatrical releases went to Video-on-Demand (VoD) [2].
More than 250 cinematic titles are now available from major studios in Ultra High Definition (UltraHD) [3], and the PyeongChang 2018 Olympic Winter Games and the FIFA World Cup 2018 were both streamed live in UltraHD HDR [4].
[1] Cisco. “Cisco Visual Networking Index: Forecast and Methodology, 2017–2022 (White Paper)”. 2019.
[2] T. Ruether. “VOD Streaming: What It Is and How It Relates to OTT”. May 2020. URL: https://www.wowza.com/blog/vod-streaming-what-it-is-and-how-it-relates-to-ott.
[3] UltraHD Forum. “End-to-end guidelines for Phase A implementation, v1.4”. Sept. 2017. URL: https://ultrahdforum.org/resources/phasea-guidelinesdescription/.
[4] T. Fautier. “UHD Worldwide Service Deployment Update”. Apr. 2018. URL: https://ultrahdforum.org/wp-content/uploads/UHD-Worldwide-Service-Deployment-Update-April-2018.pdf.
9. Introduction
HTTP Adaptive Streaming (HAS) [5]
[5] A. Bentaleb et al. “A Survey on Bitrate Adaptation Schemes for Streaming Media Over HTTP”. In: IEEE Communications Surveys & Tutorials 21.1 (2019), pp. 562–585.
11. Introduction
HTTP Adaptive Streaming (HAS)
Source: https://bitmovin.com/adaptive-streaming/
Why Adaptive Streaming?
Adapts to a wide range of devices.
Adapts to a broad range of Internet speeds.
What does HAS do?
Each source video is split into segments.
Each segment is encoded at multiple bitrates, resolutions, and codecs.
Segments are delivered to the client based on device capability, network speed, etc.
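The delivery step can be illustrated with a minimal sketch of how a client might pick a representation for the next segment. Everything here is an assumption made for illustration: the `Representation` fields, the `select_representation` helper, the 0.8 safety factor, and the example ladder are hypothetical and are not taken from any specific player or from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    bitrate_kbps: int   # encoded bitrate of this rendition
    height: int         # vertical resolution, e.g. 1080 for 1080p
    codec: str          # e.g. "avc" or "hevc"


def select_representation(ladder, throughput_kbps, max_height, codecs, safety=0.8):
    """Pick the highest-bitrate rendition the device can play and the network can carry."""
    # Device capability: drop renditions the device cannot decode or display.
    playable = [r for r in ladder if r.height <= max_height and r.codec in codecs]
    if not playable:
        raise ValueError("no representation matches the device capabilities")
    # Network speed: only spend a fraction of the measured throughput (assumed margin).
    budget = throughput_kbps * safety
    affordable = [r for r in playable if r.bitrate_kbps <= budget]
    if affordable:
        return max(affordable, key=lambda r: r.bitrate_kbps)
    # Nothing fits the budget: fall back to the cheapest playable rendition.
    return min(playable, key=lambda r: r.bitrate_kbps)


# Illustrative ladder; the values are made up for this example.
ladder = [
    Representation(500, 540, "avc"),
    Representation(3000, 1080, "avc"),
    Representation(5800, 1080, "hevc"),
    Representation(25000, 2160, "hevc"),
]
choice = select_representation(ladder, throughput_kbps=8000, max_height=1080,
                               codecs={"avc", "hevc"})
print(choice)
```

With 8000 kbps of measured throughput and a 1080p device, the sketch picks the 5800 kbps HEVC rendition; a client that only decodes AVC would get the 3000 kbps one instead.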
13. Introduction
Video encoding pipeline
Figure: Video compression pipeline.
Encoder: raw video → block partitioning → prediction (subtract), with motion compensation → transformation → entropy coding → compressed video bitstream.
Decoder: compressed video bitstream → entropy decoding → inverse transformation → prediction (add) → decoded video.
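The subtract/add symmetry between encoder and decoder can be shown with a toy round trip. This is a simplified sketch under stated assumptions, not a real codec: each frame is predicted from the previous frame without motion compensation, block partitioning and the transform stage are omitted, and `zlib` (DEFLATE) stands in for the entropy coder. The function names `encode` and `decode` are hypothetical.

```python
import zlib


def encode(frames):
    """Encode equal-length frames of 8-bit samples (lists of ints in 0..255)."""
    prediction = [0] * len(frames[0])          # the first frame is predicted from zeros
    residual_bytes = bytearray()
    for frame in frames:
        # Prediction (subtract): keep only the difference to the previous frame.
        residual_bytes.extend((s - p) % 256 for s, p in zip(frame, prediction))
        prediction = frame
    # Stand-in for entropy coding: DEFLATE compresses the mostly-zero residuals.
    return zlib.compress(bytes(residual_bytes))


def decode(bitstream, frame_len):
    """Invert encode(): entropy-decode the residuals and add them to the prediction."""
    residuals = zlib.decompress(bitstream)     # stand-in for entropy decoding
    prediction = [0] * frame_len
    frames = []
    for start in range(0, len(residuals), frame_len):
        chunk = residuals[start:start + frame_len]
        # Prediction (add): rebuild the frame from the prediction and the residual.
        frame = [(p + r) % 256 for p, r in zip(prediction, chunk)]
        frames.append(frame)
        prediction = frame
    return frames


base = [(x * 3) % 256 for x in range(64)]
frames = [base[:] for _ in range(7)] + [base[::-1]]   # static scene, then a change
bitstream = encode(frames)
decoded = decode(bitstream, frame_len=64)
print(len(bitstream), "bytes for", sum(len(f) for f in frames), "samples")
```

Because no lossy step is included, `decoded` equals `frames` exactly; a real encoder would quantize the transformed residual and accept some distortion in exchange for a lower bitrate.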
15. Research Questions
RQ-1
How to efficiently provide multi-bitrate, multi-resolution representations for HAS?
The demand for high-resolution, high-bitrate content is rising.
More devices are being introduced to meet the demand for high-quality video content at various resolutions and bitrates.
The number of representations needed for HAS is increasing.
Encoder decisions such as slice type, block partitioning, and prediction mode are redundant across the representations.
Goal: Reduce the growing computational complexity while maintaining the compression efficiency of standalone encodings.
16. Research Questions
RQ-1
Figure: Relative time complexity (in percent) versus bitrate (500 to 25000 kbps) for the 540p, 1080p, and 2160p representations, normalized to the encoding time of the 2160p 25 Mbps representation.
As resolution doubles, encoding time complexity doubles!
Many encoder analysis decisions are redundant across the representations.
Multi-rate: Exploit this redundancy across representations of a resolution.
Multi-resolution: Exploit this redundancy across resolutions.
17. Research Questions
RQ-2
How to efficiently provide multi-codec representations for HAS?
HAS serves multiple types of devices:
A subset of devices can decode only the previous-generation codec (e.g., AVC [6]).
A subset of devices can decode only the current-generation codec (e.g., HEVC [7]).
A subset of devices can decode both codecs and can switch seamlessly between them.
Representations of both codecs must be stored to serve all clients.
Goal: Reduce the added computational complexity of encoding representations in multiple codecs without compromising the overall quality of such a system.
[6] T. Wiegand et al. “Overview of the H.264/AVC video coding standard”. In: IEEE Transactions on Circuits and Systems for Video Technology 13.7 (2003), pp. 560–576.
[7] G. J. Sullivan et al. “Overview of the high efficiency video coding (HEVC) standard”. In: IEEE Transactions on Circuits and Systems for Video Technology 22.12 (2012), pp. 1649–1668.
18. Research Questions
RQ-3
How to improve the encoder performance using content-adaptive algorithms?
The quest for the best trade-off between perceptual quality and compression efficiency motivates the search for the most effective bit allocation for a video.
By allocating only the bits a given video requires, based on its complexity, Content Adaptive Encoding (CAE) can drive significant bitrate savings [8].
Goal: Derive content-adaptive spatial and temporal features for the video, which can later be used to influence encoder decisions such as slice type, quantization parameter, block partitioning, and frame rate.
[8] J. Shingala and P. Dixit. “Content Adaptive Encoding: Key Decisions for an Effective Solution”. Feb. 2018. URL: https://www.ittiam.com/content-adaptive-encoding-key-decisions-effective-solution/.
20. Research Methodology
We follow the traditional design and abstraction methodology [9] in this doctoral study to answer the research questions.
We follow the agile methodology in the implementation of solutions [10].
Agile methodology
Algorithm conceptualization
Algorithm implementation
Quantitative and qualitative analysis of the algorithm
Receive feedback and re-iterate
[9] D. E. Comer et al. “Computing as a discipline”. In: Communications of the ACM 32.1 (1989), pp. 9–23.
[10] E. Abellan. “What’s the Agile Methodology and How Can It Benefit Your Enterprise?” Feb. 2020. URL: https://www.wearemarketing.com/blog/what-is-the-agile-methodology-and-what-benefits-does-it-have-for-your-company.html.
22. Publications
List of Publications
Table: List of Publications
ID | Title | Authors | Venue
1 | Efficient Multi-Encoding Algorithms for HTTP Adaptive Bitrate Streaming | V. V. Menon, H. Amirpour, C. Timmerer, and M. Ghanbari | Picture Coding Symposium (PCS) 2021
2 | Efficient Content-Adaptive Feature-based Shot Detection for HTTP Adaptive Streaming | V. V. Menon, H. Amirpour, M. Ghanbari, and C. Timmerer | International Conference on Image Processing (ICIP) 2021
3 | INCEPT: INTRA CU Depth Prediction for HEVC | V. V. Menon, H. Amirpour, C. Timmerer, and M. Ghanbari | International Workshop on Multimedia Signal Processing (MMSP) 2021
4 | EMES: Efficient Multi-Encoding Schemes for HEVC-based Adaptive Bitrate Streaming | V. V. Menon, H. Amirpour, M. Ghanbari, and C. Timmerer | Under review in IEEE Access
23. Publications
Details of Publications
Paper 1
RQ-1: We proposed efficient multi-encoding algorithms for HTTP Adaptive Streaming, tested with the open-source x265 HEVC encoder.
We store the encoder analysis information of the reference representation as metadata and accelerate the encoding of the other representations by narrowing the search for optimal encoder decisions in those representations.
We proposed novel encoder analysis reuse schemes tailored toward the highest time savings and the highest compression efficiency [11].
[11] V. V. Menon et al. “Efficient Multi-Encoding Algorithms for HTTP Adaptive Bitrate Streaming”. In: Picture Coding Symposium (PCS) (2021).
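The idea of reusing reference analysis as metadata can be sketched as follows. This is an illustrative toy, not the paper's schemes and not an x265 API: it assumes, for the sake of the example, that the CU depth chosen in the reference representation acts as an upper bound on the depth search in a dependent representation. The per-depth cost lists and the names `search_depth`, `encode_reference`, and `encode_dependent` are hypothetical.

```python
MAX_DEPTH = 3  # HEVC CU depths 0..3 correspond to 64x64 down to 8x8 CUs


def search_depth(cost_by_depth, candidates):
    """Return the cheapest depth among `candidates` and how many were evaluated."""
    best = min(candidates, key=lambda d: cost_by_depth[d])
    return best, len(candidates)


def encode_reference(ctu_costs):
    """Full depth search on the reference representation; the result is kept as metadata."""
    metadata = []
    for cost_by_depth in ctu_costs:
        depth, _ = search_depth(cost_by_depth, range(MAX_DEPTH + 1))
        metadata.append(depth)
    return metadata


def encode_dependent(ctu_costs, ref_depths):
    """Reuse the reference depths as an upper bound on the depth search (assumed rule)."""
    depths, evaluated = [], 0
    for cost_by_depth, ref_depth in zip(ctu_costs, ref_depths):
        depth, n = search_depth(cost_by_depth, range(ref_depth + 1))
        depths.append(depth)
        evaluated += n
    return depths, evaluated


# Made-up rate-distortion costs per depth for two CTUs in each representation.
ref_costs = [[5, 4, 6, 9], [3, 7, 8, 9]]
dep_costs = [[4, 5, 7, 9], [2, 6, 8, 9]]

ref_depths = encode_reference(ref_costs)
dep_depths, evaluated = encode_dependent(dep_costs, ref_depths)
print(ref_depths, dep_depths, evaluated)
```

In this toy, the dependent encode evaluates 3 depth candidates instead of the 8 a full search would need. The price of such a bound is that a better depth outside the restricted range can no longer be found, which is the kind of trade-off the ∆T and BDR figures quantify.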
24. Publications
Paper 1
Table: Results for the multi-rate algorithms
Algorithm | ∆T | BDR_P | BDR_P/∆T | BDR_V | BDR_V/∆T
Single-bound for CU estimation | 14.50% | -0.59% | -4.39% | 0.10% | 1.08%
Double-bound for CU estimation | 26.37% | 0.73% | 3.44% | 1.25% | 5.54%
x265 Algorithm-1 | 18.61% | 1.67% | 9.52% | 1.56% | 8.85%
x265 Algorithm-2 | 55.11% | 8.42% | 15.62% | 8.72% | 16.10%
Multi-rate Algorithm-1 (ours) | 26.50% | -0.28% | -0.84% | 0.43% | 2.31%
Multi-rate Algorithm-2 (ours) | 37.37% | 1.06% | 3.36% | 1.57% | 4.88%
Table: Results for the multi-encoding algorithms
Algorithm | ∆T | BDR_P | BDR_V
State-of-the-art | 80.05% | 13.53% | 9.59%
Multi-encoding Algorithm-1 (ours) | 39.72% | 2.32% | 1.55%
Multi-encoding Algorithm-2 (ours) | 50.90% | 3.45% | 2.63%
BDR_P, BDR_V: Average difference in bitrate relative to the reference encode at the same PSNR and VMAF, respectively.
25. Publications
Paper 2
RQ-3: We proposed a shot detection algorithm for VoD HAS applications using the HEVC standard.
Figure: Frames (a) 255, (b) 262, and (c) 269 of the snow_mnt sequence (the benchmark algorithm missed this shot transition).
26. Publications
Paper 2: Multi-shot encoding framework for VoD HAS applications [12]
Figure: Multi-shot encoding framework.
Processing blocks: input video → shot detection → shot encodings → video quality measure → convex hull determination → encoding set generation → multi-shot encoding.
Data passed between blocks: encoded shots, bitrate-quality pairs, bitrate-resolution pairs, target encoding set.
[12] V. P. K. M, C. Timmerer, and H. Hellwagner. “MiPSO: Multi-Period Per-Scene Optimization For HTTP Adaptive Streaming”. In: 2020 IEEE International Conference on Multimedia and Expo (ICME). 2020, pp. 1–6. doi: 10.1109/ICME46284.2020.9102775.
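The convex hull determination step can be sketched as follows, assuming the bitrate-quality pairs of one shot are given as `(bitrate_kbps, quality)` tuples. This is a generic upper-convex-hull computation (monotone chain), offered as one plausible reading of that block rather than the framework's actual implementation; the function names and sample points are made up.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive means a left turn at a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def rate_quality_hull(points):
    """Upper convex hull of (bitrate, quality) points, trimmed to rising quality."""
    # Keep only the best quality seen at each bitrate.
    best = {}
    for rate, quality in points:
        best[rate] = max(quality, best.get(rate, quality))
    hull = []
    for p in sorted(best.items()):
        # Pop the last point while it lies on or below the line hull[-2] -> p.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    # Drop the tail where spending more bits no longer improves quality.
    front = []
    for p in hull:
        if not front or p[1] > front[-1][1]:
            front.append(p)
    return front


# Illustrative (bitrate in kbps, quality score) pairs for one shot.
pairs = [(500, 60), (1000, 75), (1500, 78), (2000, 90), (3000, 93), (4500, 92)]
print(rate_quality_hull(pairs))
```

Here the 1500 kbps encode is dropped because it lies below the line joining its neighbours, and the 4500 kbps encode is dropped because it costs more bits for lower quality than the 3000 kbps one.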
27. Publications
Paper 2
Proposed a shot detection algorithm as a feature-based pre-processing step for x265-based HEVC encoding in VoD HAS applications.
Identified a DCT-based energy function as a feature to determine shot cuts.
Proposed a successive elimination algorithm to remove false detections during gradual shot transitions.
Achieves a recall rate 25% higher and an F-measure 20% higher than the benchmark algorithm [13].
[13] V. V. Menon et al. “Efficient Content-Adaptive Feature-based Shot Detection For HTTP Adaptive Streaming”. In: IEEE International Conference on Image Processing (ICIP) (2021).
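A DCT-based energy feature for cut detection can be illustrated with the sketch below. It is not the energy function or the successive elimination step from the paper: as a stand-in, the energy is assumed to be the sum of absolute AC coefficients of an unnormalized 8x8 DCT-II, and a cut is flagged when the energy changes by more than a hypothetical relative threshold of 0.5 between consecutive frames.

```python
import math


def dct2(block):
    """Unnormalized 2D DCT-II of a square block given as a list of rows."""
    n = len(block)
    cos = [[math.cos(math.pi * (2 * x + 1) * u / (2 * n)) for x in range(n)]
           for u in range(n)]
    return [
        [sum(block[y][x] * cos[v][y] * cos[u][x] for y in range(n) for x in range(n))
         for u in range(n)]
        for v in range(n)
    ]


def texture_energy(frame, n=8):
    """Sum of absolute AC coefficients over all n x n blocks of a luma frame."""
    energy = 0.0
    for top in range(0, len(frame) - n + 1, n):
        for left in range(0, len(frame[0]) - n + 1, n):
            block = [row[left:left + n] for row in frame[top:top + n]]
            coeffs = dct2(block)
            # Exclude the DC coefficient so that brightness alone adds no energy.
            energy += sum(abs(c) for row in coeffs for c in row) - abs(coeffs[0][0])
    return energy


def detect_cuts(frames, threshold=0.5):
    """Flag frame i as a cut when its energy changes by more than `threshold` (relative)."""
    energies = [texture_energy(f) for f in frames]
    cuts = []
    for i in range(1, len(energies)):
        change = abs(energies[i] - energies[i - 1]) / max(energies[i - 1], 1e-9)
        if change > threshold:
            cuts.append(i)
    return cuts


# Two synthetic 8x8 "shots": a smooth horizontal gradient, then a checkerboard.
gradient = [[16 * x for x in range(8)] for _ in range(8)]
checker = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]
print(detect_cuts([gradient, gradient, gradient, checker, checker]))
```

A fixed threshold like this one reacts to abrupt cuts only; gradual transitions spread the energy change over several frames, which is where an additional filtering step such as the successive elimination mentioned above becomes necessary.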
28. Publications
Paper 3
RQ-3: We proposed a fast intra CU depth prediction (INCEPT) algorithm for HEVC encoding.
Figure: Quad-tree CU algorithm for partitioning of a CTU.
Flowchart elements: Start CTU; decision "depth i ∈ [d_min, d_max]?" (Yes/No); PU mode decisions; i = i + 1 for next CU; decision "d > d_max?" (Yes/No); End CTU.
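The effect of restricting the depth range can be seen in a small recursive sketch of quad-tree partitioning. It is a toy under stated assumptions: `toy_cost` is a made-up stand-in for the rate-distortion cost of the PU mode decisions, the depth range `[d_min, d_max]` is supplied by hand rather than predicted, and none of the names come from an actual encoder.

```python
def partition(x, y, size, depth, d_min, d_max, cost):
    """Return (total_cost, leaves) for the CU at (x, y); leaves are (x, y, size) tuples."""
    best_cost, best_leaves = None, None
    if depth >= d_min:
        # Depth lies inside the allowed range: evaluate this CU without splitting.
        best_cost, best_leaves = cost(x, y, size), [(x, y, size)]
    if depth < d_max:
        # Evaluate the four quadrants one level deeper and keep the split if cheaper.
        half = size // 2
        split_cost, split_leaves = 0, []
        for dy in (0, half):
            for dx in (0, half):
                c, leaves = partition(x + dx, y + dy, half, depth + 1, d_min, d_max, cost)
                split_cost += c
                split_leaves += leaves
        if best_cost is None or split_cost < best_cost:
            best_cost, best_leaves = split_cost, split_leaves
    return best_cost, best_leaves


def toy_cost(x, y, size):
    # Hypothetical cost: fixed overhead per CU, plus a size-dependent penalty when
    # the CU's origin lies in the textured top-left 16x16 corner of the CTU.
    textured = x < 16 and y < 16
    return 10 + (size * size // 16 if textured else 0)


calls = []


def counted_cost(x, y, size):
    calls.append((x, y, size))      # record every CU the restricted search evaluates
    return toy_cost(x, y, size)


full_cost, full_leaves = partition(0, 0, 64, 0, 0, 3, toy_cost)      # depths 0..3
fast_cost, fast_leaves = partition(0, 0, 64, 0, 1, 2, counted_cost)  # depths 1..2
print(full_cost, len(full_leaves), fast_cost, len(calls))
```

In this toy, the restricted range reaches the same partitioning as the full search while evaluating 20 CUs instead of 85. That only holds when the best depths fall inside the chosen range, which is why the accuracy of the depth prediction matters.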
29. Publications
Paper 3
Develops an algorithm that reduces encoding time complexity, designed explicitly for HEVC intra coding, by providing an efficient solution for CU size decisions.
Discusses existing benchmark algorithms and proposes an intra CU depth prediction (INCEPT) algorithm that provides a better trade-off between reducing the encoding time and maintaining the compression efficiency.
INCEPT uses a DCT energy-based spatial feature that predicts the texture of each CTU more accurately than the features reported in the literature.
The CU depth statistics of the neighboring CTUs are used to improve the CU depth prediction accuracy.
30. Publications
Paper 3
Table: ∆T and BDR comparison between the INCEPT algorithm [16] and the benchmark algorithms ADTS [14] and SCDP [15].
Video | ADTS ∆T | ADTS BDR_P | ADTS BDR_V | SCDP ∆T | SCDP BDR_P | SCDP BDR_V | INCEPT ∆T | INCEPT BDR_P | INCEPT BDR_V
CatRobot | 13.74% | 2.36% | 2.02% | 24.97% | 3.89% | 3.71% | 24.75% | 3.08% | 3.25%
DaylightRoad | 16.38% | 1.25% | 1.19% | 26.30% | 3.15% | 2.44% | 26.20% | 1.72% | 1.54%
FoodMarket | 16.15% | 1.06% | 1.12% | 19.00% | 2.56% | 1.26% | 20.09% | 1.40% | 0.72%
Basketball | 13.75% | 1.96% | 1.68% | 18.69% | 4.82% | 3.29% | 19.13% | 2.16% | 1.88%
Bunny | 15.40% | 1.98% | 2.03% | 18.11% | 3.32% | 3.09% | 18.69% | 1.67% | 1.69%
Lake | 13.01% | 1.08% | 0.97% | 22.54% | 2.95% | 2.12% | 22.89% | 1.19% | -2.25%
BundNightScape | 16.68% | 1.08% | 0.99% | 28.18% | 2.95% | 4.10% | 28.02% | 1.43% | 1.73%
CampfireParty | 12.27% | 0.78% | 1.12% | 22.11% | 1.88% | 2.64% | 23.41% | 0.82% | 1.36%
Fountains | 17.61% | 0.88% | 1.07% | 25.12% | 2.72% | 2.66% | 26.90% | 1.53% | 1.61%
Average | 15.00% | 1.38% | 1.35% | 22.78% | 3.14% | 2.81% | 23.34% | 1.67% | 1.28%
[14] X. Lu, C. Yu, and X. Jin. “A fast HEVC intra-coding algorithm based on texture homogeneity and spatio-temporal correlation”. In: EURASIP Journal on Advances in Signal Processing 37 (2018). doi: 10.1186/s13634-018-0558-4.
[15] Y. Zhang et al. “Statistical Early Termination and Early Skip Models for Fast Mode Decision in HEVC INTRA Coding”. In: ACM Trans. Multimedia Comput. Commun. Appl. 15.3 (July 2019). issn: 1551-6857. doi: 10.1145/3321510.
[16] V. V. Menon et al. “INCEPT: Intra CU Depth Prediction for HEVC”. Accepted for publication in IEEE 23rd Workshop on Multimedia Signal Processing (MMSP). Oct. 2021.
31. Publications
Target Publication Venues
Table: Possible targets for publications.
Name | Type | Rank | Rank source
IEEE International Conference on Visual Communications and Image Processing (VCIP) | Conference | B | 1
IEEE International Conference on Image Processing (ICIP) | Conference | B | 1
IEEE International Conference on Multimedia and Expo (ICME) | Conference | A | 1
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | Conference | A1 | 2
Elsevier Signal Processing: Image Communication (SPIC) | Journal | Q1 | 3
IEEE Transactions on Circuits and Systems for Video Technology (CSVT) | Journal | Q1 | 3
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) | Journal | Q1 | 3
IEEE Transactions on Multimedia (TMM) | Journal | Q1 | 3
Rank sources:
1 http://portal.core.edu.au/conf-ranks/
2 http://www.conferenceranks.com/?searchall=ICASSP#data
3 https://www.scimagojr.com/journalrank.php?type=j
32. Q & A
Thank you for your attention!
Vignesh V Menon (vignesh.menon@aau.at)