Content-gnostic Bitrate Ladder Prediction for Adaptive Video Streaming

29/09/2020
Angeliki Katsenou
Structure
I. Motivation
II. Compression and Content
III. Proposed Framework
IV. Results
V. Conclusion and Future Work
I. Motivation
Cisco's reports on internet data traffic estimate that the video share of traffic will
reach 80% by 2022 and will keep increasing. [1]
Due to the pandemic and the recent shift towards a remote-work lifestyle, this figure
is probably almost a reality already.
Video providers employ adaptive streaming to address the users' specifications.
Traditionally, this is achieved by creating several versions of a video sequence
using different encoding parameters, such as resolution.
This, however, requires a huge number of encodes, which impacts time, cost
and energy (increased CO2 footprint).
Figure: Distribution of energy consumption for production and use in 2017.
Categories: TVs (production), Computers (production), Smartphones (production), Others, Terminals (use), Networks (use), Data Centers (use).
Shares shown: 11%, 17%, 11%, 6%, 20%, 16%, 19%.
“…as of the end of December last year,
the maximum number of daily meeting
participants, both free and paid,
conducted on Zoom was approximately
10 million. In March this year, we reached
more than 200 million daily meeting
participants, both free and paid.” [2]
Eric S. Yuan 
Founder and CEO, Zoom
I. Motivation
Fig.1 Sample frames of a dataset of 100 4K sequences.
Fig.2 PSNR-log(Rate) curves across resolutions (x-axis: Bitrate (kbps), 10^1 to 10^6; y-axis: PSNR (dB); curves: 4K RQs, FHD RQs, HD RQs).
One ladder
does not fit all!
Table 1 The encoding ladder presented in Apple Tech Note TN2224.
I. Motivation
How can we find the “best” bitrate ladder per content so that we do not compromise the quality of
experience?
How could we make this process more computationally efficient without degrading the delivered
video quality?
Table 1 The encoding ladder presented in Apple Tech Note TN2224.
Table 2 Netflix’s per-title can change both the
number of rungs and their resolution. [3, 4]
Other Per-Title Approaches: Bitmovin, Mux, CAMBRIA, etc
I. Motivation
How can we find the “best” bitrate ladder per content so that we do
not compromise the quality of experience?
How could we make this process more computationally efficient
without degrading the delivered video quality?
Fig.3 RD curves and convex hull (labels: Convex Hull - Optimal Encoding Solution; Sub-optimal Encoding Solution; Practical Approach).
Ideally, the optimal solution would be to build the ladder by sampling
the convex hull of the rate-quality (RQ) curves across resolutions.
We propose a content-gnostic, machine-learning-based approach that predicts the
bitrate ladder.
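The convex hull idea above can be illustrated with a minimal sketch. It assumes each encode is summarised as a (log2 bitrate, quality, resolution) point; the numbers and the helper names `upper_convex_hull` and `rq_points` are made up for illustration and do not come from the slides.

```python
from typing import List, Tuple

# (log2 bitrate, quality, resolution label); all values are illustrative only.
Point = Tuple[float, float, str]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def upper_convex_hull(points: List[Point]) -> List[Point]:
    """Return the upper convex hull of RQ points, ordered by increasing rate."""
    hull: List[Point] = []
    for p in sorted(points, key=lambda q: (q[0], q[1])):
        # Drop the last hull point while it lies on or below the chord to p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


rq_points: List[Point] = [
    (11.0, 30.0, "720p"), (12.0, 33.0, "720p"), (13.0, 34.5, "720p"),
    (12.0, 32.0, "1080p"), (13.0, 35.5, "1080p"), (14.0, 37.0, "1080p"),
    (13.0, 35.0, "2160p"), (14.0, 37.5, "2160p"), (15.0, 39.0, "2160p"),
]
hull = upper_convex_hull(rq_points)
hull_resolutions = [p[2] for p in hull]
```

In this toy example the hull moves from 720p to 1080p to 2160p as the bitrate grows, which is the resolution-switching behaviour a per-content ladder tries to follow.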
II. Content Features and Compression
Fig.4 Correlation matrix of HM coding statistics to
spatio-temporal features. [5]
Fig.5 Examples of predicted PSNR-Rate curves. [5]
III. Proposed Framework
Fig.2 PSNR-log(Rate) curves across resolutions (x-axis: Bitrate (kbps), 10^1 to 10^6; y-axis: PSNR (dB); curves: 4K RQs, FHD RQs, HD RQs).
Fig.6 Example of RQ curves' intersection (x-axis: log(Bitrate (kbps)); y-axis: PSNR (dB); curves: 4K, FHD, HD, Convex Hull; marked cross-over points: {QP_4K, QP_FHD^low} and {QP_FHD^high, QP_HD}).
Finding the cross-over points helps define
where the resolution switches on the convex hull.
We assume that the RQ curves intersect in an ordered, monotonic fashion (e.g. 2160p intersects with
1080p, 1080p with 720p, etc.).
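As a rough illustration of locating a cross-over point, the sketch below works in the rate domain: it assumes two RQ curves given as (log2 bitrate, quality) samples, interpolates them linearly, and bisects on their difference. The function names and sample values are hypothetical, and mapping the cross-over rate back to a QP per resolution is not shown.

```python
from typing import List, Tuple

Curve = List[Tuple[float, float]]  # (log2 bitrate, quality), sorted by bitrate


def interp(curve: Curve, x: float) -> float:
    """Piecewise-linear quality at log-rate x (x must lie within the curve)."""
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x outside curve range")


def cross_over(high_res: Curve, low_res: Curve, tol: float = 1e-6) -> float:
    """Log-rate at which the higher-resolution curve overtakes the lower one."""
    lo = max(high_res[0][0], low_res[0][0])
    hi = min(high_res[-1][0], low_res[-1][0])

    def diff(x: float) -> float:
        return interp(high_res, x) - interp(low_res, x)

    # The lower resolution must win at the low end and lose at the high end.
    if diff(lo) > 0 or diff(hi) < 0:
        raise ValueError("no cross-over bracketed in the overlapping range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative samples only.
curve_fhd: Curve = [(12.0, 32.0), (13.0, 35.5), (14.0, 37.0)]
curve_4k: Curve = [(13.0, 35.0), (14.0, 37.5), (15.0, 39.0)]
switch_log2_rate = cross_over(curve_4k, curve_fhd)
```

The bracketing check encodes the ordered-intersection assumption: if the curves do not cross once in the overlapping range, the sketch raises an error instead of returning a switching point.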
III. Proposed Framework
Fig.7 Scatterplots of cross-over QPs.
QP_4K vs. QP_FHD^low: PCC: .9917, SROCC: .9888
QP_FHD^high vs. QP_HD: PCC: .9817, SROCC: .9538
This strong correlation between cross-over QPs can be used to improve
cross-over QP predictions.
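For readers unfamiliar with the two statistics reported in Fig.7, here is a dependency-free sketch of the Pearson correlation coefficient (PCC) and the Spearman rank-order correlation coefficient (SROCC). The QP values are invented for illustration and are not the dataset's values.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson linear correlation coefficient of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(v: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    out = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up cross-over QPs for five sequences.
qp_4k = [22.0, 27.0, 31.0, 35.0, 40.0]
qp_fhd_low = [18.0, 22.0, 27.0, 30.0, 36.0]
pcc = pearson(qp_4k, qp_fhd_low)
srocc = spearman(qp_4k, qp_fhd_low)
```

A PCC close to 1 indicates a nearly linear relation, and an SROCC close to 1 indicates a nearly monotonic one, which is why a previously predicted QP is a useful feature for the next prediction.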
III. Proposed Framework
Fig.8 Proposed method (block diagram).
Blocks in the training process: Training Videos @ Native Resolution; Downscaling Resolution; Training Videos @ all considered resolutions; Video Codec; Decoded Training Videos @ all considered resolutions; Upscaling Resolution; Upscaled Training Videos @ Native Resolution; Upscaled Decoded Training Videos @ Native Resolution; Quality Metrics Computation; Quality Metric Values for Training Videos; Ground-truth RQ Convex Hull; Training Videos Cross-over QPs; Content Features Extraction; Spatio-temporal Features of Training Videos; Machine Learning-based Regression.
Blocks in the testing process: Testing Videos @ Native Spatial Resolution; Content Features Extraction; Spatio-temporal Features of Testing Videos; Predicted Cross-over QPs per Resolution; Video Codec; Decoded Testing Videos @ Cross-over QPs; Upscaled Decoded Testing Videos @ Cross-over QPs; Quality Metric Values for Testing Videos at Cross-over Points; Bitrate of Cross-over Points; RQ Convex Hull Fitting.
Output: Predicted Bitrate Ladder (RQ Convex Hull Eq.; Rate-QP Eq.; Resolution Switching Rate points).
III. Proposed Framework
Fig.9 RQ convex hulls (blue: 2160p, red: 1080p, yellow: 720p, purple: 540p, green: 480p).
III. Proposed Framework
We fitted the convex hull with a third-order polynomial.
This means that, after determining the cross-over QPs, we need four encodes to determine
the polynomial parameters (a cubic has four coefficients).
Then, we can sample the convex hull and build the bitrate ladder.
Table 3 Fitted Models.
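Why four encodes are enough can be seen from a small sketch: four (rate, quality) points determine a cubic uniquely. The example below assumes quality is modelled as a cubic in log2(bitrate), which is an assumption for illustration since the fitted models of Table 3 are not reproduced here; the helper `cubic_through` and the sample values are hypothetical.

```python
from typing import Callable, List, Tuple


def cubic_through(points: List[Tuple[float, float]]) -> Callable[[float], float]:
    """Return the unique cubic passing through four (x, q) points.

    Uses the Lagrange form, so no linear system has to be solved.
    """
    if len(points) != 4:
        raise ValueError("exactly four points are required")
    xs = [p[0] for p in points]
    qs = [p[1] for p in points]
    if len(set(xs)) != 4:
        raise ValueError("the four x values must be distinct")

    def q_of(x: float) -> float:
        total = 0.0
        for i in range(4):
            term = qs[i]
            for j in range(4):
                if j != i:
                    term *= (x - xs[j]) / (xs[i] - xs[j])
            total += term
        return total

    return q_of


# Four illustrative encodes: (log2 bitrate, PSNR in dB).
encodes = [(12.0, 33.0), (13.0, 35.5), (14.0, 37.5), (15.0, 39.0)]
hull_model = cubic_through(encodes)
psnr_between = hull_model(13.5)  # quality estimate between two encodes
```

Once the model is available, the hull can be sampled at any bitrate in the fitted range without further encodes.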
III. Proposed Framework
Fig.10 PSNR-Rate Ladder (x-axis: log2(Bitrate); y-axis: PSNR (dB)). Fig.11 VMAF-Rate Ladder (x-axis: log2(Bitrate); y-axis: VMAF).
Building the bitrate ladder:
1. Determine the operational bitrate range $(R_{min}, R_{max})$;
2. Sample the bitrate: $R_{L,i} \simeq 2 R_{L,i-1}$ or $\log_2(R_{L,i}) \simeq 1 + \log_2(R_{L,i-1})$, where $R_{L,i} \in (R_{min}, R_{max})$;
3. Sample the quality: $Q_{L,i}(R_{L,i}) \leq Q_{max}$ and $\frac{dQ_{L,i}}{dR_L} > \epsilon$, where $\epsilon \to 0$.
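The three steps above can be sketched as a short loop. This is only an illustration under stated assumptions: the quality model `toy_quality` is invented, the range bounds are treated as inclusive, and the slope condition is approximated by a finite difference between consecutive rungs.

```python
import math
from typing import Callable, List, Tuple


def build_ladder(
    quality_of_log2_rate: Callable[[float], float],
    r_min: float,
    r_max: float,
    q_max: float,
    eps: float = 1e-6,
) -> List[Tuple[float, float]]:
    """Sample (bitrate, quality) rungs by doubling the bitrate from r_min."""
    ladder: List[Tuple[float, float]] = []
    rate = r_min
    while rate <= r_max:  # step 1: stay inside the operational range
        quality = quality_of_log2_rate(math.log2(rate))
        if quality > q_max:  # step 3: do not exceed the quality cap
            break
        if ladder:
            prev_rate, prev_quality = ladder[-1]
            # Step 3: stop once extra bitrate no longer buys quality.
            if (quality - prev_quality) / (rate - prev_rate) <= eps:
                break
        ladder.append((rate, quality))
        rate *= 2  # step 2: next rung at about twice the bitrate
    return ladder


def toy_quality(log2_rate: float) -> float:
    """Hypothetical fitted hull: 2.5 dB per doubling of the bitrate."""
    return 20.0 + 2.5 * (log2_rate - 8.0)


ladder = build_ladder(toy_quality, r_min=256, r_max=20000, q_max=31.0)
```

With these toy numbers the ladder stops at 4096 kbps because the next rung would exceed the quality cap, not because the bitrate range is exhausted.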
IV. Results
Table 4 List of Features [5].
Fig.12 Example of content dependency of cross-over QPs (x-axis: stdTC_std feature, 0.02 to 0.09; y-axis: QP_4K, 20 to 45).
IV. Results
Fig.1 Sample frames of a dataset of 100 4K sequences.
Fig.13 Spatial and Temporal Information of the dataset.
IV. Results
From the PCS 2019 paper [6]: we tested the proposed framework with HM16.20, considering the resolutions
{2160p, 1080p, 720p}.
A Lanczos-3 filter (FFmpeg implementation) was used for the spatial down/up-sampling.
We compare our method against two state-of-the-art solutions:
• Brute force method: we performed encodings with a QP step equal to 1. The brute force method theoretically
creates the optimal convex hull, so we consider it our ground truth.
• Interpolation-based method: 7 encodings per resolution (using equidistant QPs to cover the range), with
piecewise cubic Hermite interpolation for the in-between QPs. This method results in a suboptimal
convex hull, but it can provide a good approximation of the optimal one while significantly
reducing the number of pre-encodes.
IV. Results
We applied feature selection, specifically Recursive Feature Elimination (RFE), to the set of spatio-temporal
features.
We perform a sequential prediction of the QPs, starting from the highest resolution:
• For the QP_4K prediction, we relied only on spatio-temporal features.
• For the rest of the predictions, we made use of the identified relations and considered the previously predicted QPs (of the
higher resolutions) as features.
We tested various regression methods, such as Support Vector Machines (SVMs) with different kernels and Random Forests (RFs), but Gaussian Processes (GPs) were the best
performing models.
To avoid overfitting, we performed a 10-fold cross-validation.
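The sequential prediction and the 10-fold cross-validation can be sketched together. To stay dependency-free, the sketch swaps the Gaussian Process regressors for ordinary least-squares lines, and it uses a single made-up feature with synthetic QPs; none of the names or numbers below come from the slides.

```python
from typing import Iterator, List, Sequence, Tuple


def k_fold_indices(n: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous folds over n samples."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept (stand-in for the GP regressor)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


# Synthetic data: one content feature per video and two cross-over QPs.
feature = [0.02 + 0.0007 * i for i in range(100)]
qp_4k = [45.0 - 300.0 * f for f in feature]
qp_fhd_low = [0.9 * q - 2.0 for q in qp_4k]

abs_errors: List[float] = []
for train, test in k_fold_indices(len(feature), k=10):
    # Stage 1: QP_4K from the content feature only.
    a1, b1 = fit_line([feature[i] for i in train], [qp_4k[i] for i in train])
    # Stage 2: QP_FHD^low from QP_4K, exploiting the relation between them.
    a2, b2 = fit_line([qp_4k[i] for i in train], [qp_fhd_low[i] for i in train])
    for i in test:
        pred_4k = a1 * feature[i] + b1
        pred_fhd_low = a2 * pred_4k + b2  # chained on the predicted QP_4K
        abs_errors.append(abs(pred_fhd_low - qp_fhd_low[i]))

mean_abs_error = sum(abs_errors) / len(abs_errors)
```

Each video is predicted exactly once by a model that never saw it during fitting, which is the property the cross-validation is meant to provide.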
IV. Results
Fig.14 Predicted cross QP 4K (x-axis: True QP_4K; y-axis: Predicted QP_4K).
Fig.15 Predicted cross QP FHD high (x-axis: True QP_FHD^high; y-axis: Predicted QP_FHD^high).
Fig.16 Predicted cross QP HD (x-axis: True QP_HD^low; y-axis: Predicted QP_HD^low).
Table 5 Results on cross-over QPs prediction.
IV. Results
The different distributions are due to the different reference convex hulls.
Fig.17 BDRate Histogram. Fig.18 BDPSNR Histogram.
Most outliers refer to sequences that do not comply with the hypothesis that the RQ curves
intersect in a resolution-monotonic manner.
IV. Results
Fig.19 Examples of results. For each sequence: the RQ curves per resolution (4K - 2160p, FHD - 1080p, HD - 720p) with their convex hull, and the ground truth vs. predicted convex hull (x-axis: Bitrate (kbps); y-axis: PSNR (dB)).
• campfirepartyg op1: BDRate=0.18%, BDPSNR=-0.004dB
• barsceneg op1: BDRate=2.009%, BDPSNR=-0.009dB
IV. Results
94.2% fewer encodings compared to the brute force method and 80.95% fewer compared to the
interpolation-based method.
Proposed method overhead: the ratio of the average feature extraction time for a sequence at
4K resolution to the average 4K encoding time for a sequence at QP=27 is 0.18.
Table 6 Comparison of the number of encodes required per method.
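The reported savings can be checked with simple arithmetic. The sketch assumes three resolutions with two cross-over points and two encodes per point (four encodes for the proposed method) and 7 encodes per resolution for the interpolation-based method, as stated in the slides. The brute-force encode count is not given in the text, so it is only inferred here from the reported 94.2%.

```python
def saving_percent(ours: int, baseline: int) -> float:
    """Percentage of encodes saved relative to a baseline."""
    return 100.0 * (1.0 - ours / baseline)


resolutions = 3
cross_over_points = resolutions - 1          # 2160p/1080p and 1080p/720p
proposed_encodes = 2 * cross_over_points     # two encodes per cross-over point
interpolation_encodes = 7 * resolutions      # 7 equidistant QPs per resolution

saving_vs_interpolation = saving_percent(proposed_encodes, interpolation_encodes)

# Assumption: back out the brute-force count implied by the reported 94.2%.
implied_brute_force_encodes = round(proposed_encodes / (1.0 - 0.942))
saving_vs_brute_force = saving_percent(proposed_encodes, implied_brute_force_encodes)
```

Under these assumptions, 4 encodes against 21 gives the reported 80.95%, and the 94.2% figure is consistent with roughly 69 brute-force encodes (about 23 QPs per resolution).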
V. Conclusion and Future Work
Conclusions:
We proposed a method that can predict the bitrate ladders of the considered resolutions based on spatio-temporal
features extracted from the uncompressed videos at their native resolution and on a few video encodings (two
encodes per RQ intersection point).
The first results are promising compared to the ground truth, while requiring 94.2% and 81% fewer pre-encodes
than the brute force method and the interpolation-based method, respectively.
Future Work:
Our focus will be on validating the presented method across different codecs.
We will also work on cross-codec optimization of bitrate ladders.
References
1. “Global Mobile Data Traffic Forecast Update 2017-2022”, White Paper, Cisco, 2018.
2. E. S. Yuan, “A message to our users”, https://blog.zoom.us/a-message-to-our-users/
3. J. De Cock, Z. Li, M. Manohara, and A. Aaron, “Complexity-based consistent quality encoding in the Cloud”, IEEE ICIP 2016.
4. J. Sole, L. Guo, A. Norkin, M. Afonso, K. Swanson, and A. Aaron, “Performance comparison of video coding standards: an adaptive streaming perspective”, https://medium.com/netflix-techblog/performance-comparison-of-video-coding-standards-an-adaptive-streaming-perspective-d45d0183ca95, 2018.
5. A. Katsenou, M. Afonso, D. Agrafiotis, and D. R. Bull, “Predicting Video Rate-Distortion Curves using Textural Features”, PCS 2016.
6. A. V. Katsenou, J. Sole, and D. R. Bull, “Content-gnostic Bitrate Ladder Prediction for Adaptive Video Streaming”, PCS 2019.
Thanks to
Dr Joel Sole
Dr Mariana Afonso
Prof David Bull
pcs2021.org
You are invited!