
OPSE: Online Per-Scene Encoding for Adaptive HTTP Live
Streaming
Vignesh V Menon¹, Hadi Amirpour¹, Christian Feldmann², Mohammad Ghanbari¹,³, and Christian Timmerer¹
¹ Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität, Klagenfurt, Austria
² Bitmovin, Klagenfurt, Austria
³ School of Computer Science and Electronic Engineering, University of Essex, UK
21 July 2022
Vignesh V Menon OPSE: Online Per-Scene Encoding for Adaptive HTTP Live Streaming 1
Outline
1 Introduction
2 OPSE
3 Evaluation
4 Q & A
Introduction
Motivation
Per-scene encoding schemes are based on the fact that each resolution performs better
than others in a scene for a given bitrate range, and these regions depend on the video
complexity.
The goal is to increase the Quality of Experience (QoE) or decrease the bitrate of the
representations, as introduced for VoD services.¹
Figure: The bitrate ladder prediction envisioned using OPSE.
¹ J. De Cock et al. “Complexity-based consistent-quality encoding in the cloud”. In: 2016 IEEE International Conference on Image Processing (ICIP). 2016, pp. 1484–1488. doi: 10.1109/ICIP.2016.7532605.
Why not in live yet?
Although per-title encoding schemes² enhance the quality of video delivery, determining the
convex hull is computationally expensive, which makes them suitable only for VoD streaming
applications.
Some methods pre-analyze the video content.³
Katsenou et al.⁴ introduced a content-gnostic method that employs machine learning to find,
for each resolution, the bitrate range in which it outperforms the other resolutions.
Bhat et al.⁵ proposed a Random Forest (RF) classifier to decide the encoding resolution best
suited to different quality ranges and studied machine-learning-based adaptive resolution prediction.
However, these approaches still yield latency much higher than the accepted latency in
live streaming.
² De Cock et al., “Complexity-based consistent-quality encoding in the cloud”; Hadi Amirpour et al. “PSTR: Per-Title Encoding Using Spatio-Temporal Resolutions”. In: 2021 IEEE International Conference on Multimedia and Expo (ICME). 2021, pp. 1–6. doi: 10.1109/ICME51207.2021.9428247.
³ https://bitmovin.com/whitepapers/Bitmovin-Per-Title.pdf, last access: May 10, 2022.
⁴ A. V. Katsenou et al. “Content-gnostic Bitrate Ladder Prediction for Adaptive Video Streaming”. In: 2019 Picture Coding Symposium (PCS). 2019. doi: 10.1109/PCS48520.2019.8954529.
⁵ Madhukar Bhat et al. “Combining Video Quality Metrics To Select Perceptually Accurate Resolution In A Wide Quality Range: A Case Study”. In: 2021 IEEE International Conference on Image Processing (ICIP). 2021, pp. 2164–2168. doi: 10.1109/ICIP42928.2021.9506310.
OPSE
[Block diagram: Input Video → Video Complexity Feature Extraction → Scene Detection → Resolution Prediction → Per-Scene Encoding. Labels on the connections: (E, h, ϵ), (E, h), Scenes, and (r̂, b). Resolutions (R) and Bitrates (B) are additional inputs.]
Figure: OPSE architecture.
The E, h, and ϵ features are extracted using VCA, an open-source video complexity analyzer.⁶
⁶ Vignesh V Menon et al. “VCA: Video Complexity Analyzer”. In: Proceedings of the 13th ACM Multimedia Systems Conference. 2022. isbn: 9781450392839. doi: 10.1145/3524273.3532896. url: https://doi.org/10.1145/3524273.3532896.
Phase 1: Feature Extraction
Compute texture energy per block
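A DCT-based energy function is used to determine the block-wise texture feature of each frame,
defined as:
$$H_{p,k} = \sum_{i=0}^{w-1} \sum_{j=0}^{w-1} e^{\left|\left(\frac{ij}{wh}\right)^{2} - 1\right|} \, |DCT(i, j)| \tag{1}$$
where k is the block index in frame p, w × w is the size of the block, and DCT(i, j) is the
(i, j)th DCT component when i + j > 0, and 0 otherwise. The product wh in the exponent is read
here as the block area (w² for a square block), not the temporal feature h.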
The energy values of the blocks in a frame are averaged to determine the energy per frame.⁷
$$E = \sum_{k=0}^{C-1} \frac{H_{p,k}}{C \cdot w^{2}} \tag{2}$$
⁷ Michael King et al. “A New Energy Function for Segmentation and Compression”. In: 2007 IEEE International Conference on Multimedia and Expo. 2007, pp. 1647–1650. doi: 10.1109/ICME.2007.4284983.
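Equations (1) and (2) can be illustrated with a short, dependency-free sketch. It is not the VCA implementation: the orthonormal DCT-II scaling, the function names, and the reading of the exponent's denominator as the block area w² are all assumptions made for illustration.

```python
import math

def dct2(block):
    """Orthonormal 2-D DCT-II of a square block (list of rows).
    The normalization is an assumption; VCA's scaling may differ."""
    w = len(block)

    def alpha(u):
        return math.sqrt(1.0 / w) if u == 0 else math.sqrt(2.0 / w)

    out = [[0.0] * w for _ in range(w)]
    for u in range(w):
        for v in range(w):
            acc = 0.0
            for x in range(w):
                for y in range(w):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * w))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * w)))
            out[u][v] = alpha(u) * alpha(v) * acc
    return out

def block_energy(block):
    """Texture energy H of one w x w block, following Eq. (1).
    The DC coefficient (i + j == 0) contributes nothing."""
    w = len(block)
    coeffs = dct2(block)
    total = 0.0
    for i in range(w):
        for j in range(w):
            if i + j == 0:
                continue
            # Exponent denominator taken as the block area w*w (assumption).
            weight = math.exp(abs((i * j / (w * w)) ** 2 - 1.0))
            total += weight * abs(coeffs[i][j])
    return total

def frame_energy(blocks):
    """Frame energy E, following Eq. (2): sum of H over C blocks / (C * w^2)."""
    w = len(blocks[0])
    return sum(block_energy(b) for b in blocks) / (len(blocks) * w * w)
```

A flat block has no AC coefficients, so its energy is numerically zero, while any block with spatial variation yields a positive value.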
h_p: the sum of absolute differences (SAD) between the block-level energy values of frame p and those of the previous frame p − 1.
$$h_p = \sum_{k=0}^{C-1} \frac{|H_{p,k} - H_{p-1,k}|}{C \cdot w^{2}} \tag{3}$$
where C denotes the number of blocks in frame p.
The gradient of h per frame p, ϵ_p, is also defined and is given by:
$$\epsilon_p = \frac{h_{p-1} - h_p}{h_{p-1}} \tag{4}$$
Latency
Speed of feature extraction: 1480 fps for Full HD (1080p) video with 8 CPU threads and x86
SIMD optimization.
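As a small illustration of Equations (3) and (4), the following sketch computes h_p from two lists of block energies and then the gradient ϵ_p. The function names are hypothetical, and the guard for h_{p−1} = 0 is an assumption that the slides do not address.

```python
def temporal_feature(curr_H, prev_H, w):
    """h_p, Eq. (3): SAD of per-block energies of frames p and p-1,
    divided by C * w^2, where C is the number of blocks."""
    C = len(curr_H)
    sad = sum(abs(a - b) for a, b in zip(curr_H, prev_H))
    return sad / (C * w * w)

def gradient(h_prev, h_curr):
    """epsilon_p, Eq. (4): (h_{p-1} - h_p) / h_{p-1}."""
    if h_prev == 0:
        # Assumption: return 0 for a static previous frame to avoid
        # division by zero; the source does not specify this case.
        return 0.0
    return (h_prev - h_curr) / h_prev
```

For example, with two blocks of size 2 × 2 and energies [4, 8] and [2, 12], the SAD is 6 and h_p = 6 / (2 · 4) = 0.75.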
Phase 2: Scene Detection
Objective:
Detect the first picture of each shot and encode it as an Instantaneous Decoder Refresh
(IDR) frame.
Encode the subsequent frames of the new shot based on the first one via motion compensation
and prediction.
Shot transitions can be present in two ways:
hard shot-cuts
gradual shot transitions
Gradual transitions are much harder to detect because the change in visual information is
difficult to quantify.
Step 1: while parsing all video frames do
    if ϵk > T1 then
        k ← IDR-frame, a new shot.
    else if ϵk ≤ T2 then
        k ← P-frame or B-frame, not a new shot.
T1, T2: maximum and minimum thresholds for ϵk
f: video fps
Q: set of frames where T1 ≥ ϵ > T2 and ∆h > T3
q0: current frame number in the set Q
q−1: previous frame number in the set Q
q1: next frame number in the set Q
Step 2: while parsing Q do
    if q0 − q−1 > f and q1 − q0 > f then
        q0 ← IDR-frame, a new shot.
    Eliminate q0 from Q.
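The two steps above can be sketched as follows. This is an illustrative reading, not the authors' code: the function name and arguments are hypothetical, ∆h is passed in as a precomputed per-frame list because the slides do not define it, the threshold values are left to the caller, a missing neighbour at either end of Q is treated as far enough away, and neighbours are taken from the original candidate set rather than updated after each elimination.

```python
def detect_shots(eps, delta_h, t1, t2, t3, fps):
    """Return sorted frame indices to be encoded as IDR frames (new shots).

    eps      -- per-frame gradient values (epsilon)
    delta_h  -- per-frame change in h (definition assumed by the caller)
    t1, t2   -- maximum and minimum thresholds for epsilon
    t3       -- threshold for delta_h
    fps      -- video frame rate f
    """
    idr = []
    candidates = []  # the set Q: T1 >= eps > T2 and delta_h > T3

    # Step 1: classify clear cases and collect ambiguous frames in Q.
    for k, e in enumerate(eps):
        if e > t1:
            idr.append(k)            # clear new shot
        elif e > t2 and delta_h[k] > t3:
            candidates.append(k)     # ambiguous, resolved in step 2
        # otherwise: P- or B-frame, not a new shot

    # Step 2: accept a candidate only if it is more than one second
    # (f frames) away from both neighbouring candidates.
    for n, q0 in enumerate(candidates):
        q_prev = candidates[n - 1] if n > 0 else None
        q_next = candidates[n + 1] if n + 1 < len(candidates) else None
        far_prev = q_prev is None or q0 - q_prev > fps
        far_next = q_next is None or q_next - q0 > fps
        if far_prev and far_next:
            idr.append(q0)

    return sorted(idr)
```

Candidates that cluster within one second of each other are rejected, which keeps a gradual transition from producing a burst of IDR frames.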
Phase 3: Resolution Prediction
For each detected scene, the optimized bitrate ladder is predicted using the E and h features
of the first GOP of the scene and the sets R and B. The optimized resolution r̂ is predicted
for each target bitrate b ∈ B. The resolution scaling factor s is defined as:
$$s = \left(\frac{r}{r_{max}}\right); \quad r \in R \tag{5}$$
where r_max is the maximum resolution in R.
[Network diagram: input layer (∈ ℝ³, inputs E, h, log(b)) → hidden layer (∈ ℝ⁴) → hidden layer (∈ ℝ⁴) → output layer (∈ ℝ¹, output ŝ).]
Figure: Neural network structure to predict the optimized resolution scaling factor ŝ for a maximum
resolution r_max and framerate f.
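To make Equation (5) concrete, the sketch below converts resolutions to scaling factors and maps a predicted ŝ back to a resolution in R. The trained network is not reproduced: `predict_s` is a hypothetical callable standing in for it, the nearest-factor snapping rule and the natural logarithm for log(b) are assumptions, and all function names are illustrative.

```python
import math

# Set R as listed on the evaluation slide (heights in pixels).
RESOLUTIONS = [360, 432, 540, 720, 1080]

def scaling_factor(r, r_max):
    """s = r / r_max, Eq. (5)."""
    return r / r_max

def nearest_resolution(s_hat, resolutions):
    """Pick the r in R whose scaling factor is closest to the predicted
    s_hat (assumption: the slides do not state the snapping rule)."""
    r_max = max(resolutions)
    return min(resolutions,
               key=lambda r: abs(scaling_factor(r, r_max) - s_hat))

def predict_ladder(E, h, bitrates, resolutions, predict_s):
    """Build (resolution, bitrate) pairs for one scene.

    predict_s is a hypothetical stand-in for the trained network; it
    takes (E, h, log(b)) and returns a predicted scaling factor.
    """
    return [(nearest_resolution(predict_s(E, h, math.log(b)), resolutions), b)
            for b in bitrates]
```

Calling `predict_ladder` once per detected scene, with the E and h of its first GOP, yields one (r̂, b) pair per target bitrate.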
Evaluation
R = {360p, 432p, 540p, 720p, 1080p}
B = {145, 300, 600, 900, 1600, 2400, 3400, 4500, 5800, 8100}.
Figure: BDRV results for scenes characterized by various average E and h.
BDR_V: the Bjøntegaard delta rate⁸, i.e. the average increase in bitrate of the representations,
relative to the fixed bitrate ladder encoding, needed to maintain the same VMAF.
⁸ G. Bjontegaard. “Calculation of average PSNR differences between RD-curves”. In: VCEG-M33 (2001).
(a) Scene1 (b) Scene2
Figure: Comparison of RD curves for encoding two sample scenes, Scene1 (E = 31.96, h = 11.12) and
Scene2 (E = 67.96, h = 5.12) using the fixed bitrate ladder and OPSE.
Q & A
Thank you for your attention!
Vignesh V Menon (vignesh.menon@aau.at)