Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

ComplexCTTP: Complexity Class Based Transcoding Time Prediction for Video Sequences Using Artificial Neural Network


Published on

HTTP Adaptive Streaming of video content is becoming an integral part of the Internet and accounts for the majority of today’s traffic. Although Internet bandwidth is constantly increasing, video compression technology plays an important role and the major challenge is to select and set up multiple video codecs, each with hundreds of transcoding parameters. Additionally, the transcoding speed depends directly on the selected transcoding parameters and the infrastructure used. Predicting transcoding time for multiple transcoding parameters with different codecs and processing units is a challenging task, as it depends on many factors. This paper provides a novel and considerably fast method for transcoding time prediction using video content classification and neural network prediction. Our artificial neural network (ANN) model predicts the transcoding times of video segments for state-of-the-art video codecs based on transcoding parameters and content complexity. We evaluated our method for two video codecs/implementations (AVC/x264 and HEVC/x265) as part of large-scale HTTP Adaptive Streaming services. The ANN model of our method is able to predict the transcoding time by minimizing the mean absolute error (MAE) to 1.37 and 2.67 for x264 and x265 codecs, respectively. For x264, this is an improvement of 22% compared to the state of the art.

Published in: Technology
  • Be the first to comment

  • Be the first to like this

ComplexCTTP: Complexity Class Based Transcoding Time Prediction for Video Sequences Using Artificial Neural Network

  1. 1. ComplexCTTP: Complexity Class Based Transcoding Time Prediction for Video Sequences Using Artificial Neural Network Anatoliy Zabrovskiy, Prateek Agrawal, Roland Mathá, Christian Timmerer, Radu Prodan The Sixth IEEE International Conference on Multimedia Big Data September 24-26, 2020 New Delhi.
  2. 2. Motivation Current situation: ● Many video codecs (AVC, HEVC, VP9, AV1 and etc.) ● Transcoding time depends on many technical aspects (content complexity, transcoding parameters, processing units) ● Transcoding of video segments is a parallel process running on a high-performance infrastructure such as the cloud ❏ Transcoding services and platforms work without any prediction of the transcoding time ❏ Transcoding time prediction can significantly improve the overall transcoding time 2
  3. 3. ComplexCTTP method. Goal Goal: - Accurate transcoding time prediction for video sequences The approach is based on two phases: - Data generation - Transcoding time prediction using ANN 3
  4. 4. Dataset 2 codecs x 19 bitrates x 240 segments x 9 encoding presets = 82080 transcodings (580 hours) 4 Original video file characteristics Average spatial information (SI) and temporal information (TI) for original video sequences Transcoding with FFmpeg Raw transcoding dataset Video codecs: x264 and x265 Bitrates: Segments: 160 (2 sec), 80 (4sec) Encoding presets: ultrafast, superfast, veryfast, faster, fast, medium, slow, slower, veryslow Training/testing datasets We performed the transcoding on a Intel Xeon Gold 6148 2.4 GHz processor Contains the maximum and minimum transcoding time for all possible combinations of - codec type, - complexity class, - encoding bitrate, - encoding preset, - segment duration, - fps
  5. 5. Segment 2 Segment 1 Segment 3 Segment n Segment complexity classification and ANN Original video segments Segment 2 Segment 1 Segment 3 Segment n Segments with low resolution (144p) and bitrate Calculating SI and TI (per segment) Calculating Complexity class (per segment) 5 Encoding segments to low bitrate and resolution The correlation coefficient between encoded video segments with 144p resolution and the original video segments with 2160p resolution presents positively strong (0.98 for TI) and highly correlated (0.65 for SI) relationship.
  6. 6. Results and analysis. Actual transcoding time Average actual transcoding time for all segments belonging to a particular complexity class - x265 requires more more computing resources than x264 - Segment transcoding time depends on the complexity of the content - The transcoding time increases with the complexity class of the content - Сomplexity class significantly describes the complexity of the video segments in terms of the time required for transcoding. 6
  7. 7. Results and analysis. ANNs Based on the results: - ANN with complexity class (1) input parameter predicts transcoding time better compared to the (2) ANN model without any complexity class, TI and SI input parameters or ANN model with TI and SI (3) of segments transcoded with a low resolution and bitrate. - The ANN model with TI and SI input parameters of original video segments (4) has slightly better MAE for both x264 and x265 codec compared to our ANN model with complexity class. Unfortunately the calculation of TI and SI metric for the original video segments with high bitrate and resolution takes more time. 7
  8. 8. Results and analysis. ComplexCTTP method vs Tewodros et al. OVCTT dataset: - Less transcodings - Lower maximum values and standard deviations of transcoding times - Includes transcodings for MPEG-4 Part 2, VP8 and H.263 codec. 8 OVCTT - Online Video Characteristics and Transcoding Time Dataset ComplexCTTP dataset outperforms OVCTT dataset for almost all characteristics! Transcoding time characteristics of both datasets Transcoding parameters characteristics of both datasets
  9. 9. Results and analysis 9 The Tewodros et al. use bitrate, framerate, resolution, codec, number and size of I, P, B frames as input parameters for their ANN model. The average time (in sec) required to calculate ANN input parameters for one beauty video 2s segment using both the methods. Percentage decrease of time (PDT) for 2 sec. segments for Beauty video sequence is about 70% Percentage decrease of time for all ten video sequences with 4s segments. PDT values range from 53% to 80%
  10. 10. Results and analysis 10 With our ComplexCTTP method, we were able to minimize MAE to 1.37 for AVC/x264 which is an improvement of approximately 22% as compared to the Tewodros et al. method (MAE 1.76). The result shows that ComplexCTTP performs better in terms of prediction accuracy. Coefficients of determination for both the methods
  11. 11. Conclusions ● We proposed video complexity classification, with respect to the video segment’s spatial and temporal information 11 ● We introduced a fast approach to measure SI and TI ● The developed ANN model is able to predict the video transcoding time with low mean absolute error.
  12. 12. Future work ● Experiments on new emerging codecs ● Using the predicted transcoding time for the actual scheduling of video transcoding tasks ● Intelligently selecting and analyzing the content complexity of a few segments of a video to make prediction about the transcoding time of the entire video 12
  13. 13. Thank you! 13 Anatoliy Zabrovskiy