The 131st WG 11 (MPEG) meeting was held online, 29 June – 3 July 2020
Table of Contents
WG11 (MPEG) Announces VVC – the Versatile Video Coding Standard
Point Cloud Compression – WG11 (MPEG) promotes a Video-based Point Cloud Compression Technology to the FDIS stage
MPEG-H 3D Audio – WG11 (MPEG) promotes Baseline Profile for 3D Audio to final stage
Call for Proposals on Technologies for MPEG-21 Contracts to Smart Contracts Conversion
WG11 (MPEG) issues a Call for Proposals on extension and improvements to ISO/IEC 23092 standard series
Widening support for storage and delivery of MPEG-5 EVC
Multi-Image Application Format adds support of HDR
Carriage of Geometry-based Point Cloud Data progresses to Committee Draft
MPEG Immersive Video (MIV) progresses to Committee Draft
Neural Network Compression for Multimedia Applications – WG11 (MPEG) progresses to Committee Draft
WG11 (MPEG) issues Committee Draft of Conformance and Reference Software for Essential Video Coding (EVC)
A Distributed Delivery Architecture for User Generated Content Live Streaming... – Alpen-Adria-Universität
Live User Generated Content (UGC) has become very popular in today’s video streaming applications, in particular with gaming and e-sport. However, streaming UGC presents unique challenges for video delivery. When dealing with the technical complexity of managing hundreds or thousands of concurrent streams that are geographically distributed, UGC systems are forced to make difficult trade-offs between video quality and latency. To bridge this gap, this paper presents a fully distributed architecture for UGC delivery over the Internet, termed QuaLA (joint Quality-Latency Architecture). The proposed architecture aims to jointly optimize video quality and latency for a better user experience and fairness. By using the proximal Jacobi alternating direction method of multipliers (ProxJ-ADMM) technique, QuaLA proposes a fully distributed mechanism to achieve an optimal solution. We demonstrate the effectiveness of the proposed architecture through real-world experiments using the CloudLAB testbed. Experimental results show that QuaLA outperforms conventional client-driven solutions, achieving high quality with an improvement of more than 57% while preserving a good level of fairness and respecting a given target latency among all clients.
With the recent surge in Internet multimedia traffic, media players, specifically DASH media players, have been enhanced and improved at an incredible rate. DASH media players adapt a media stream to network fluctuations by continuously monitoring the network and making decisions in near real-time. The performance of the algorithms in charge of making such decisions has often been difficult to evaluate and assess objectively.
CAdViSE provides a Cloud-based Adaptive Video Streaming Evaluation framework for the automated testing of adaptive media players. In this talk, I will introduce the CAdViSE framework and its application, and present the benefits and advantages that it can bring to every web-based media player development pipeline. To demonstrate the power of CAdViSE in evaluating Adaptive Bitrate (ABR) algorithms, I will exhibit its capabilities when combined with objective Quality of Experience (QoE) models. For this talk, my team at Bitmovin/ATHENA has selected the ITU-T P.1203 (mode 1) model in order to execute experiments, calculate the Mean Opinion Score (MOS), and better understand the behavior of a set of well-known ABR algorithms in a real-life setting. The talk will show how we tested and deployed our framework in a cloud infrastructure using a modular architecture. This method greatly increases the number of concurrent experiments and the number of media players that can be evaluated and compared at the same time, thus enabling maximum potential scalability. In my team’s most recent experiments, we used Amazon Web Services (AWS) for demonstration purposes. Another feature of CAdViSE that will be discussed here is the ability to shape the test network with an arbitrary number of network profiles. To do so, we used a fluctuation network profile and a real LTE network trace based on the recorded internet usage of a bicycle commuter in Belgium.
CAdViSE produces comprehensive logs for each media streaming experimental session. These logs can then be used for different purposes, such as objective evaluation, or stitching media segments back together to conduct subjective evaluations afterwards. In addition, startup delays, stall events, and other media streaming defects can be reproduced exactly as they happened during the experimental streaming sessions.
Machine Learning Based Video Coding Enhancements for HTTP Adaptive Streaming – Alpen-Adria-Universität
Video traffic comprises the majority of today’s Internet traffic, and HTTP Adaptive Streaming (HAS) is the preferred method to deliver video content over the Internet. The increasing demand for video and the improvements in the video display conditions over the years caused an increase in video coding complexity. This increased complexity brought the need for more efficient video streaming and coding solutions. The latest standard video codecs can reduce the size of the videos by using more efficient tools with higher time complexities. The plans for integrating machine learning into upcoming video codecs raised the interest in applied machine learning for video coding. In this doctoral study, we aim to propose applied machine learning methods to video coding, focusing on HTTP adaptive streaming. We present four primary research questions to target different challenges in video coding for HTTP adaptive streaming.
A Channel Allocation Algorithm for Cognitive Radio Users Based on Channel Sta... – Alpen-Adria-Universität
Cognitive radio networks can manage the radio spectrum efficiently by utilizing the spectrum holes in licensed frequency bands. A significant improvement in spectrum use can be achieved by giving secondary users access to these spectrum holes. Predicting spectrum holes can save a significant share of the energy that is otherwise consumed to detect them, because the secondary users then need to select only the channels that are predicted to be idle. However, collisions can occur either between a primary user and secondary users or among the secondary users themselves. This paper introduces a centralized channel allocation algorithm for a scenario with multiple secondary users to control both primary and secondary collisions. The proposed allocation algorithm, which uses a channel status predictor, provides good performance with fairness among the secondary users while keeping their interference with the primary user minimal. The simulation results show that the probability of a wrong prediction of an idle channel state in a multi-channel system is less than 0.9%. In addition, the channel state prediction saves up to 73% of the sensing energy, and the utilization of the spectrum can be improved by more than 77%.
In this contribution, we present selected novel approaches and results of our research work in the ATHENA Christian Doppler Laboratory (Adaptive Streaming over HTTP and Emerging Networked Multimedia Services), a major research project at our department jointly funded by public sources and industry. By putting this work also into the context of related ongoing research activities, we aim at working out where HTTP Adaptive Streaming is currently heading.
On Optimizing Resource Utilization in AVC-based Real-time Video Streaming – Alpen-Adria-Universität
Real-time video streaming traffic and related applications have witnessed significant growth in recent years. However, this has been accompanied by some challenging issues, predominantly resource utilization. IP multicasting, as a solution to this problem, suffers from many problems of its own. Scalable video coding has not gained wide adoption in the industry, due to reduced compression efficiency and additional computational complexity. The emerging software-defined networking (SDN) and network function virtualization (NFV) paradigms enable researchers to cope with IP multicasting issues in novel ways. In this paper, by leveraging the SDN and NFV concepts, we introduce a cost-aware approach to provide advanced video coding (AVC)-based real-time video streaming services in the network. In this study, we use two types of virtualized network functions (VNFs): virtual reverse proxy (VRP) and virtual transcoder (VTF) functions. At the edge of the network, VRPs are responsible for collecting clients’ requests and sending them to an SDN controller. Then, executing a mixed-integer linear program (MILP) determines an optimal multicast tree from an appropriate set of video source servers to the optimal group of transcoders. The desired video is sent over the multicast tree. The VTFs transcode the received video segments and stream them to the requesting VRPs over unicast paths. To mitigate the time complexity of the proposed MILP model, we propose a heuristic algorithm that determines a near-optimal solution in a reasonable amount of time. Using the MiniNet emulator, we evaluate the proposed approach and show that it achieves better performance in terms of cost and resource utilization in comparison with traditional multicast and unicast approaches.
Vignesh V Menon and Hadi Amirpour gave a talk on ‘Video Complexity Analyzer for Streaming Applications’ at the Video Quality Experts Group (VQEG) meeting on December 14, 2021. Our research activities on video complexity analysis were presented in the talk.
Scalable High Efficiency Video Coding based HTTP Adaptive Streaming over QUIC... – Alpen-Adria-Universität
HTTP/2 has been explored widely for video streaming, but still suffers from Head-of-Line blocking and three-way handshake delay due to TCP. Meanwhile, QUIC running on top of UDP can tackle these issues. In addition, although many adaptive bitrate (ABR) algorithms have been proposed for scalable and non-scalable video streaming, the literature lacks an algorithm designed for both types of video streaming approaches. In this paper, we investigate the impact of QUIC and HTTP/2 on the performance of ABR algorithms in terms of different metrics. Moreover, we propose an efficient approach for utilizing scalable video coding formats for adaptive video streaming that combines a traditional video streaming approach (based on non-scalable video coding formats) and a retransmission technique. The experimental results show that QUIC benefits significantly from our proposed method in the context of packet loss and retransmission. Compared to HTTP/2, it improves the average video quality and also provides a smoother adaptation behavior. Finally, we demonstrate that our proposed method, originally designed for non-scalable video codecs, also works efficiently for scalable videos such as Scalable High Efficiency Video Coding (SHVC).
FaME-ML: Fast Multirate Encoding for HTTP Adaptive Streaming Using Machine Le... – Alpen-Adria-Universität
HTTP Adaptive Streaming (HAS) is the most common approach for delivering video content over the Internet. The requirement to encode the same content at different quality levels (i.e., representations) in HAS is a challenging problem for content providers. Fast multirate encoding approaches try to accelerate this process by reusing information from previously encoded representations. In this paper, we propose to use convolutional neural networks (CNNs) to speed up the encoding of multiple representations with a specific focus on parallel encoding. In parallel encoding, the overall time-complexity is limited to the maximum time-complexity of one of the representations that are encoded in parallel. Therefore, instead of reducing the time-complexity for all representations, the highest time-complexities are reduced. Experimental results show that FaME-ML achieves significant time-complexity savings in parallel encoding scenarios (41% on average) with a slight increase in bitrate and quality degradation compared to the HEVC reference software.
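The reasoning about parallel encoding can be illustrated with a small sketch. It assumes hypothetical per-representation encoding times (the labels and numbers below are invented for illustration and are not taken from the paper) and shows that, when all representations are encoded in parallel, only speeding up the slowest one shortens the overall time.

```python
def parallel_encoding_time(times):
    """Wall-clock time when every representation is encoded in parallel:
    the job finishes when the slowest representation finishes."""
    return max(times.values())


def serial_encoding_time(times):
    """Wall-clock time when representations are encoded one after another."""
    return sum(times.values())


# Hypothetical encoding times in seconds (illustrative values only).
times = {"2160p": 100.0, "1080p": 60.0, "720p": 35.0, "360p": 15.0}

# Speeding up an already fast representation leaves the parallel time unchanged.
faster_low = {**times, "360p": 5.0}

# Speeding up the slowest representation shortens the parallel time.
faster_top = {**times, "2160p": 70.0}

for label, t in [("baseline", times), ("faster 360p", faster_low), ("faster 2160p", faster_top)]:
    print(label, parallel_encoding_time(t), serial_encoding_time(t))
```

This is why an approach aimed at parallel encoding targets the representations with the highest time-complexity rather than all of them.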
MiPSO: Multi-Period Per-Scene Optimization For HTTP Adaptive Streaming – Alpen-Adria-Universität
Video delivery over the Internet has become more and more established in recent years due to the widespread use of Dynamic Adaptive Streaming over HTTP (DASH). The current DASH specification defines a hierarchical data model for Media Presentation Descriptions (MPDs) in terms of periods, adaptation sets, representations, and segments. Although multi-period MPDs are widely used in live streaming scenarios, they are not fully utilized in Video-on-Demand (VoD) HTTP adaptive streaming (HAS) scenarios. In this paper, we introduce MiPSO, a framework for Multi-Period per-Scene optimization, to examine multiple periods in VoD HAS scenarios. MiPSO provides different encoded representations of a video at either (i) maximum possible quality or (ii) minimum possible bitrate, beneficial to both service providers and subscribers. In each period, the proposed framework adjusts the video representations (resolution-bitrate pairs) by taking into account the complexities of the video content, with the aim of achieving streams at either higher qualities or lower bitrates. The experimental evaluation with a test video data set shows that MiPSO reduces the average bitrate of streams with the same visual quality by approximately 10% or increases the visual quality of streams by at least 1 dB in terms of Peak Signal-to-Noise Ratio (PSNR) at the same bitrate compared to conventional approaches.
CSDN: CDN-Aware QoE Optimization in SDN-Assisted HTTP Adaptive Video Streaming – Alpen-Adria-Universität
With the increasing demand for video streaming applications, HTTP Adaptive Streaming (HAS) technology has become the dominant video delivery technique over the Internet. Current HAS solutions only consider either client- or server-side optimization, which causes many problems in achieving high-quality video, leading to sub-optimal user experience and network resource utilization. Recent studies have revealed that network-assisted HAS techniques, by providing a comprehensive view of the network, can lead to more significant gains in HAS system performance. In this paper, we leverage the capability of Software-Defined Networking (SDN), Network Function Virtualization (NFV), and edge computing to introduce a CDN-Aware QoE Optimization in SDN-Assisted Adaptive Video Streaming framework called CSDN. We employ virtualized edge entities to collect various information items (e.g., user-, client-, CDN- and network-level information) in a time-slotted method. These components then run an optimization model with a new server/segment selection approach in a time-slotted fashion to serve the clients’ requests by selecting optimal cache servers (in terms of fetch and transcoding times). In case of a cache miss, a client’s request is served (i) by an optimal replacement quality (only better quality levels with minimum deviation) from a cache server, (ii) by a quality transcoded from an optimal replacement quality at the edge, or (iii) by the originally requested quality level from the origin server. By means of comprehensive experiments conducted on a real-world large-scale testbed, we demonstrate that CSDN outperforms the state-of-the-art in terms of playback bitrate, the number of quality switches, the number of stalls, and bandwidth usage by at least 7.5%, 19%, 19%, and 63%, respectively.
WISH: User-centric Bitrate Adaptation for HTTP Adaptive Streaming on Mobile D... – Minh Nguyen
Recently, mobile devices have become paramount in online video streaming. Adaptive bitrate (ABR) algorithms of players responsible for selecting the quality of the videos face critical challenges in providing a high Quality of Experience (QoE) for end users. One open issue is how to ensure the optimal experience for heterogeneous devices in the context of extreme variation of mobile broadband networks. Additionally, end users may have different priorities on video quality and data usage (i.e., the amount of data downloaded to the devices through the mobile networks). A generic mechanism for players that enables specification of various policies to meet end users’ needs is still missing. In this paper, we propose a weighted sum model, namely WISH, that yields high QoE of the video and allows end users to express their preferences among different parameters (i.e., data usage, stall events, and video quality) of video streaming. WISH has been implemented into ExoPlayer, a popular player used in many mobile applications. The experimental results show that WISH improves the QoE by up to 17.6% while saving 36.4% of data usage compared to state-of-the-art ABR algorithms and provides dynamic adaptation to end users’ requirements.
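A weighted sum model of this kind might look as follows. This is only a sketch: the three cost terms, their normalization, and the names `Weights` and `choose_bitrate` are assumptions made for illustration, not the cost functions defined by WISH.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    """User preferences; larger weight = the user cares more about that cost."""
    data_usage: float
    stall: float
    quality: float


def choose_bitrate(bitrates_kbps, throughput_kbps, buffer_s, segment_s, w):
    """Pick the bitrate with the lowest weighted sum of three illustrative costs."""
    top = max(bitrates_kbps)

    def cost(bitrate):
        # Data cost: share of the maximum bitrate that would be downloaded.
        data_cost = bitrate / top
        # Stall cost: time by which the download would outlast the buffer,
        # expressed in units of one segment duration.
        download_s = bitrate * segment_s / throughput_kbps
        stall_cost = max(0.0, download_s - buffer_s) / segment_s
        # Quality cost: distance from the best available bitrate.
        quality_cost = 1.0 - bitrate / top
        return (w.data_usage * data_cost
                + w.stall * stall_cost
                + w.quality * quality_cost)

    return min(bitrates_kbps, key=cost)


ladder = [500, 1000, 2000, 4000]  # hypothetical bitrate ladder in kbps
balanced = Weights(data_usage=0.1, stall=0.6, quality=0.3)
data_saver = Weights(data_usage=0.7, stall=0.2, quality=0.1)

print(choose_bitrate(ladder, 3000, 4.0, 4.0, balanced))
print(choose_bitrate(ladder, 3000, 4.0, 4.0, data_saver))
```

With the same network conditions, changing only the weights changes the selected bitrate, which is the kind of user-specified policy the abstract describes.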
High Efficiency Video Coding (HEVC) improves the encoding efficiency by utilizing sophisticated tools such as flexible Coding Tree Unit (CTU) partitioning. The Coding Unit (CU) can be split recursively into four equally sized CUs ranging from 64×64 to 8×8 pixels. At each depth level (or CU size), intra prediction via exhaustive mode search is exploited in HEVC to improve the encoding efficiency, which results in a very high encoding time complexity. This paper proposes an Intra CU Depth Prediction (INCEPT) algorithm, which limits Rate-Distortion Optimization (RDO) for each CTU in HEVC by utilizing the spatial correlation with the neighboring CTUs, which is computed using a DCT energy-based feature. Thus, INCEPT reduces the number of candidate CU sizes required to be considered for each CTU in HEVC intra coding. Experimental results show that the INCEPT algorithm achieves a better trade-off between the encoding efficiency and encoding time saving (i.e., BDR/∆T) than the benchmark algorithms. While BDR/∆T is 12.35% and 9.03% for the benchmark algorithms, it is 5.49% for the proposed algorithm. As a result, INCEPT achieves a 23.34% reduction in encoding time on average while incurring only a 1.67% increase in bit rate compared to the original coding in the x265 HEVC open-source encoder.
Where to Encode: A Performance Analysis of Intel x86 and Arm-based Amazon EC2... – Alpen-Adria-Universität
Video streaming has become an integral part of the Internet. To efficiently utilise the limited network bandwidth, it is essential to encode the video content. However, encoding is a computationally intensive task, involving high-performance resources provided by private infrastructures or public clouds. Public clouds, such as Amazon EC2, provide a large portfolio of services and instances optimized for specific purposes and budgets. The majority of Amazon’s instances use x86 processors, such as Intel Xeon or AMD EPYC. However, following the recent trends in computer architecture, Amazon introduced Arm-based instances that promise up to 40% better cost-performance ratio than comparable x86 instances for specific workloads. We evaluate in this paper the video encoding performance of x86 and Arm instances of four instance families using the latest FFmpeg version and two video codecs. We examine the impact of the encoding parameters, such as different presets and bitrates, on the time and cost for encoding. Our experiments reveal that Arm instances show high time and cost saving potential of up to 33.63% for specific bitrates and presets, especially for the x264 codec. However, the x86 instances are more general and achieve low encoding times, regardless of the codec.
ComplexCTTP: Complexity Class Based Transcoding Time Prediction for Video Seq... – Alpen-Adria-Universität
HTTP Adaptive Streaming of video content is becoming an integral part of the Internet and accounts for the majority of today’s traffic. Although Internet bandwidth is constantly increasing, video compression technology plays an important role and the major challenge is to select and set up multiple video codecs, each with hundreds of transcoding parameters. Additionally, the transcoding speed depends directly on the selected transcoding parameters and the infrastructure used. Predicting transcoding time for multiple transcoding parameters with different codecs and processing units is a challenging task, as it depends on many factors. This paper provides a novel and considerably fast method for transcoding time prediction using video content classification and neural network prediction. Our artificial neural network (ANN) model predicts the transcoding times of video segments for state-of-the-art video codecs based on transcoding parameters and content complexity. We evaluated our method for two video codecs/implementations (AVC/x264 and HEVC/x265) as part of large-scale HTTP Adaptive Streaming services. The ANN model of our method is able to predict the transcoding time by minimizing the mean absolute error (MAE) to 1.37 and 2.67 for x264 and x265 codecs, respectively. For x264, this is an improvement of 22% compared to the state of the art.
EPIQ'21: Days of Future Past: An Optimization-based Adaptive Bitrate Algorith... – Minh Nguyen
HTTP Adaptive Streaming (HAS) has become a predominant technique for delivering videos over the Internet. Due to its adaptive behavior according to changing network conditions, it may result in video quality variations that negatively impact the Quality of Experience (QoE) of the user. In this paper, we propose Days of Future Past, an optimization-based Adaptive Bitrate (ABR) algorithm over HTTP/3. Days of Future Past takes advantage of an optimization model and HTTP/3 features, including (i) stream multiplexing and (ii) request cancellation. We design a Mixed Integer Linear Programming (MILP) model that determines the optimal video qualities of both the next segment to be requested and the segments currently located in the buffer. If better qualities for buffered segments are found, the client will send corresponding HTTP GET requests to retrieve them. Multiple segments (i.e., retransmitted segments) might be downloaded simultaneously, using the stream multiplexing feature of QUIC, to upgrade some buffered but not yet played segments and avoid quality decreases. HTTP/3’s request cancellation will be used in case retransmitted segments would arrive at the client after their playout time. The experimental results show that our proposed method is able to improve the QoE by up to 33.9%.
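The cancellation rule can be sketched as a simple deadline check. The function name, its parameters, and the throughput-based arrival estimate are assumptions for illustration; the abstract states only that a retransmission is cancelled when it would arrive after its playout time.

```python
def should_cancel(bytes_remaining, throughput_bps, time_to_playout_s):
    """Return True if an in-flight segment upgrade cannot finish before the
    buffered segment it replaces starts playing.

    The arrival time is estimated as remaining bits divided by the current
    throughput estimate, which is an assumption of this sketch.
    """
    if throughput_bps <= 0:
        # No usable throughput estimate: the upgrade cannot complete in time.
        return True
    eta_s = bytes_remaining * 8 / throughput_bps
    return eta_s > time_to_playout_s


# 1 MB left at 4 Mbit/s takes 2 s, which fits a 3 s deadline.
print(should_cancel(1_000_000, 4_000_000, 3.0))
# The same 1 MB at 2 Mbit/s takes 4 s, which misses the deadline.
print(should_cancel(1_000_000, 2_000_000, 3.0))
```

In a client, a `True` result would be followed by cancelling the corresponding HTTP/3 request so that the freed bandwidth goes to the remaining streams.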
HTTP adaptive streaming (HAS) with chunked transfer encoding can be used to reduce latency without sacrificing the coding efficiency. While this allows a media segment to be generated and delivered at the same time, it also causes grossly inaccurate bandwidth measurements, leading to incorrect bitrate selections. To overcome this effect, we design a novel Adaptive bitrate scheme for Chunked Transfer Encoding (ACTE) that leverages the unique nature of chunk downloads. It uses a sliding window to accurately measure the available bandwidth and an online linear adaptive filter to predict the available bandwidth into the future. Results show that ACTE achieves 96% measurement accuracy, which translates to a 64% reduction in stalls and a 27% increase in video quality.
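The two building blocks can be sketched roughly as below. The abstract does not say which adaptive filter is used, so a normalized least-mean-squares (NLMS) filter stands in here; the class names, window size, filter order, and step size are all illustrative assumptions.

```python
from collections import deque


class SlidingWindowEstimator:
    """Average throughput over the most recent chunk downloads."""

    def __init__(self, size):
        self.samples = deque(maxlen=size)  # oldest sample drops out automatically

    def add(self, chunk_bytes, download_s):
        self.samples.append(chunk_bytes * 8 / download_s)  # bits per second

    def estimate(self):
        if not self.samples:
            return 0.0
        return sum(self.samples) / len(self.samples)


class NlmsPredictor:
    """Online linear predictor: the next value is a weighted sum of the last
    `order` measurements, with weights adapted by the NLMS rule."""

    def __init__(self, order=3, step=0.5, eps=1e-9):
        self.w = [1.0 / order] * order
        self.history = deque(maxlen=order)
        self.step = step
        self.eps = eps

    def predict(self):
        if len(self.history) < self.history.maxlen:
            # Not enough history yet: fall back to the last measurement.
            return self.history[-1] if self.history else 0.0
        return sum(w * x for w, x in zip(self.w, self.history))

    def update(self, measured):
        if len(self.history) == self.history.maxlen:
            x = list(self.history)
            error = measured - sum(w * xi for w, xi in zip(self.w, x))
            norm = sum(xi * xi for xi in x) + self.eps
            self.w = [w + self.step * error * xi / norm
                      for w, xi in zip(self.w, x)]
        self.history.append(measured)


estimator = SlidingWindowEstimator(size=2)
predictor = NlmsPredictor(order=3)
# Hypothetical (chunk size in bytes, download time in seconds) pairs.
for chunk_bytes, download_s in [(1000, 0.001), (1000, 0.002), (1000, 0.004)]:
    estimator.add(chunk_bytes, download_s)
    predictor.update(estimator.estimate() / 1e6)  # feed Mbit/s to the filter
print(estimator.estimate(), predictor.predict())
```

A player would feed each new window estimate into the predictor and base its next bitrate choice on the predicted value rather than on a single segment-level measurement.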
Light field imaging enables some post-processing capabilities like refocusing, changing view perspective, and depth estimation. As light field images are represented by multiple views, they contain a huge amount of data that makes compression inevitable. Although there are some proposals to efficiently compress light field images, their main focus is on encoding efficiency. However, some important functionalities such as viewpoint and quality scalabilities, random access, and uniform quality distribution have not been addressed adequately. In this paper, an efficient light field image compression method based on a deep neural network is proposed, which classifies multiple views into various layers. In each layer, the target view is synthesized from the available views of previously encoded/decoded layers using a deep neural network. This synthesized view is then used as a virtual reference for the target view inter-coding. In this way, random access to an arbitrary view is provided. Moreover, uniform quality distribution among multiple views is addressed. At higher bitrates, where random access to an arbitrary view is more crucial, the required bitrate to access the requested view is minimized.
ES-HAS: An Edge- and SDN-Assisted Framework for HTTP Adaptive Video Streaming (Alpen-Adria-Universität)
Recently, HTTP Adaptive Streaming (HAS) has become the dominant video delivery technology over the Internet. In HAS, clients have full control over the media streaming and adaptation processes. Lack of coordination among the clients and lack of awareness of the network conditions may lead to sub-optimal user experience and resource utilization in a pure client-based HAS adaptation scheme. Software-Defined Networking (SDN) has recently been considered to enhance the video streaming process. In this paper, we leverage the capability of SDN and Network Function Virtualization (NFV) to introduce an edge- and SDN-assisted video streaming framework called ES-HAS. We employ virtualized edge components to collect HAS clients’ requests and retrieve networking information in a time-slotted manner. In each time slot, these components then run an optimization model to efficiently serve clients’ requests by selecting an optimal cache server (with the shortest fetch time). In case of a cache miss, a client’s request is served (i) by an optimal replacement quality (only better quality levels with minimum deviation) from a cache server, or (ii) by the originally requested quality level from the origin server. This approach is validated through experiments on a large-scale testbed, and the performance of our framework is compared to pure client-based strategies and the SABR system [11]. Although SABR and ES-HAS show (almost) identical performance in the number of quality switches, ES-HAS outperforms SABR in terms of playback bitrate and the number of stalls by at least 70% and 40%, respectively.
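The cache-miss rule described above (a better quality with minimum deviation from a cache server, otherwise the requested quality from the origin) can be sketched as follows. This is a simplified illustration, not the ES-HAS optimization model: it ignores fetch times and server selection, and the function names and integer quality levels are hypothetical.

```python
def pick_replacement(requested, cached_levels):
    """Return the cached quality level closest above `requested`, or None."""
    better = [level for level in cached_levels if level > requested]
    return min(better) if better else None


def serve(requested, cached_levels):
    """Decide where a request is served from and at which quality level."""
    if requested in cached_levels:
        return ("cache", requested)       # cache hit
    replacement = pick_replacement(requested, cached_levels)
    if replacement is not None:
        return ("cache", replacement)     # case (i): better quality, minimum deviation
    return ("origin", requested)          # case (ii): fall back to the origin server


print(serve(3, {1, 2, 5, 6}))
print(serve(6, {1, 2}))
```

In the first call, level 3 is missing but levels 5 and 6 are cached, so level 5 is the replacement; in the second call no better level is cached, so the request goes to the origin.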
Video traffic on the Internet is constantly growing; networked multimedia applications consume a predominant share of the available Internet bandwidth. A major technical breakthrough and enabler in multimedia systems research and of industrial networked multimedia services certainly was the HTTP Adaptive Streaming (HAS) technique. This resulted in the standardization of MPEG Dynamic Adaptive Streaming over HTTP (MPEG-DASH) which, together with HTTP Live Streaming (HLS), is widely used for multimedia delivery in today’s networks. Existing challenges in multimedia systems research deal with the trade-off between (i) the ever-increasing content complexity, (ii) various requirements with respect to time (most importantly, latency), and (iii) quality of experience (QoE). Optimizing towards one aspect usually negatively impacts at least one of the other two aspects if not both.
This situation sets the stage for our research work in the ATHENA Christian Doppler (CD) Laboratory (Adaptive Streaming over HTTP and Emerging Networked Multimedia Services; https://athena.itec.aau.at/), jointly funded by public sources and industry.
In this talk, we will present selected novel approaches and research results of the first year of the ATHENA CD Lab’s operation. We will highlight HAS-related research on: (i) multimedia content provisioning (machine learning for video encoding); (ii) multimedia content delivery (support of edge processing and virtualized network functions for video networking); (iii) multimedia content consumption and end-to-end aspects (player-triggered segment retransmissions to improve video playout quality); and (iv) novel QoE investigations (adaptive point cloud streaming). We will also put the work into the context of the international multimedia systems research.
Quality Optimization of Live Streaming Services over HTTP with Reinforcement ... (Alpen-Adria-Universität)
Recent years have seen tremendous growth in HTTP adaptive live video traffic over the Internet. In the presence of highly dynamic network conditions and diverse request patterns, existing simple, hand-crafted heuristic approaches for serving client requests at the network edge might incur a large overhead and a significant increase in time complexity. Therefore, these approaches might fail to deliver acceptable Quality of Experience (QoE) to end users. To bridge this gap, we propose ROPL, a learning-based client request management solution at the edge that leverages recent breakthroughs in deep reinforcement learning to serve requests of concurrent users joining various HTTP-based live video channels. ROPL is able to react quickly to any changes in the environment, making accurate decisions to serve clients’ requests, which results in satisfactory user QoE. We validate the efficiency of ROPL through trace-driven simulations and a real-world setup. Experimental results from real-world scenarios confirm that ROPL outperforms existing heuristic-based approaches in terms of QoE by a factor of up to 3.7×.
CAdViSE: Cloud based Adaptive Video Streaming Evaluation Framework for the Au... (Alpen-Adria-Universität)
Attempting to cope with fluctuations of network conditions in terms of available bandwidth, latency and packet loss, and to deliver the highest quality of video (and audio) content to users, research on adaptive video streaming has attracted intense efforts from the research community and huge investments from technology giants. How successful these efforts and investments are is a question that needs precise measurements of the results of those technological advancements. HTTP-based Adaptive Streaming (HAS) algorithms, which seek to improve video streaming over the Internet, introduce video bitrate adaptivity in a way that is scalable and efficient.
However, the way each HAS implementation takes into account the wide spectrum of variables and configuration options brings a high complexity to the task of measuring the results and visualizing the statistics of the performance and quality of experience.
In this paper, we introduce CAdViSE, our Cloud-based Adaptive Video Streaming Evaluation framework for the automated testing of adaptive media players. The paper aims to demonstrate a test environment which can be instantiated in a cloud infrastructure, examines multiple media players with different network attributes at defined points of the experiment time, and finally concludes the evaluation with visualized statistics and insights into the results.
Vignesh V Menon and Hadi Amirpour gave a talk on ‘Video Complexity Analyzer for Streaming Applications’ at the Video Quality Experts Group (VQEG) meeting on December 14, 2021. Our research activities on video complexity analysis were presented in the talk.
Scalable High Efficiency Video Coding based HTTP Adaptive Streaming over QUIC... (Alpen-Adria-Universität)
HTTP/2 has been explored widely for video streaming, but still suffers from Head-of-Line blocking and three-way handshake delay due to TCP. Meanwhile, QUIC running on top of UDP can tackle these issues. In addition, although many adaptive bitrate (ABR) algorithms have been proposed for scalable and non-scalable video streaming, the literature lacks an algorithm designed for both types of video streaming approaches. In this paper, we investigate the impact of QUIC and HTTP/2 on the performance of ABR algorithms in terms of different metrics. Moreover, we propose an efficient approach for utilizing scalable video coding formats for adaptive video streaming that combines a traditional video streaming approach (based on non-scalable video coding formats) and a retransmission technique. The experimental results show that QUIC benefits significantly from our proposed method in the context of packet loss and retransmission. Compared to HTTP/2, it improves the average video quality and also provides a smoother adaptation behavior. Finally, we demonstrate that our proposed method, originally designed for non-scalable video codecs, also works efficiently for scalable videos such as Scalable High Efficiency Video Coding (SHVC).
FaME-ML: Fast Multirate Encoding for HTTP Adaptive Streaming Using Machine Le... (Alpen-Adria-Universität)
HTTP Adaptive Streaming (HAS) is the most common approach for delivering video content over the Internet. The requirement to encode the same content at different quality levels (i.e., representations) in HAS is a challenging problem for content providers. Fast multirate encoding approaches try to accelerate this process by reusing information from previously encoded representations. In this paper, we propose to use convolutional neural networks (CNNs) to speed up the encoding of multiple representations with a specific focus on parallel encoding. In parallel encoding, the overall time-complexity is limited to the maximum time-complexity of one of the representations that are encoded in parallel. Therefore, instead of reducing the time-complexity for all representations, the highest time-complexities are reduced. Experimental results show that FaME-ML achieves significant time-complexity savings in parallel encoding scenarios (41% on average) with a slight increase in bitrate and quality degradation compared to the HEVC reference software.
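The reasoning about parallel encoding can be made concrete with a small sketch. When all representations are encoded at the same time, the overall duration is the maximum of the individual encoding times, so only speeding up the slowest representation shortens the job. The representation names and timings below are hypothetical and are not taken from the paper.

```python
def parallel_encoding_time(times_s):
    """Overall wall-clock time when all representations are encoded in parallel."""
    return max(times_s.values())


# Hypothetical per-representation encoding times in seconds.
times_s = {"2160p": 100, "1080p": 40, "720p": 20}
baseline = parallel_encoding_time(times_s)

# Speeding up a representation that is not the slowest changes nothing.
faster_mid = dict(times_s, **{"1080p": 20})
# Speeding up the slowest representation shortens the whole job.
faster_top = dict(times_s, **{"2160p": 60})

print(baseline, parallel_encoding_time(faster_mid), parallel_encoding_time(faster_top))
```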
MiPSO: Multi-Period Per-Scene Optimization For HTTP Adaptive Streaming (Alpen-Adria-Universität)
Video delivery over the Internet has become more and more established in recent years due to the widespread use of Dynamic Adaptive Streaming over HTTP (DASH). The current DASH specification defines a hierarchical data model for Media Presentation Descriptions (MPDs) in terms of periods, adaptation sets, representations, and segments. Although multi-period MPDs are widely used in live streaming scenarios, they are not fully utilized in Video-on-Demand (VoD) HTTP adaptive streaming (HAS) scenarios. In this paper, we introduce MiPSO, a framework for Multi-Period per-Scene Optimization, to examine multiple periods in VoD HAS scenarios. MiPSO provides different encoded representations of a video at either (i) maximum possible quality or (ii) minimum possible bitrate, beneficial to both service providers and subscribers. In each period, the proposed framework adjusts the video representations (resolution-bitrate pairs) by taking into account the complexities of the video content, with the aim of achieving streams at either higher qualities or lower bitrates. The experimental evaluation with a test video data set shows that, compared to conventional approaches, MiPSO reduces the average bitrate of streams with the same visual quality by approximately 10% or increases the visual quality of streams by at least 1 dB in terms of Peak Signal-to-Noise Ratio (PSNR) at the same bitrate.
CSDN: CDN-Aware QoE Optimization in SDN-Assisted HTTP Adaptive Video Streaming (Alpen-Adria-Universität)
With the increasing demand for video streaming applications, HTTP Adaptive Streaming (HAS) technology has become the dominant video delivery technique over the Internet. Current HAS solutions only consider either client- or server-side optimization, which causes many problems in achieving high-quality video, leading to sub-optimal user experience and network resource utilization. Recent studies have revealed that network-assisted HAS techniques, by providing a comprehensive view of the network, can lead to more significant gains in HAS system performance. In this paper, we leverage the capability of Software-Defined Networking (SDN), Network Function Virtualization (NFV), and edge computing to introduce a CDN-Aware QoE Optimization in SDN-Assisted Adaptive Video Streaming framework called CSDN. We employ virtualized edge entities to collect various information items (e.g., user-, client-, CDN- and network-level information) in a time-slotted manner. In each time slot, these components then run an optimization model with a new server/segment selection approach to serve the clients’ requests by selecting optimal cache servers (in terms of fetch and transcoding times). In case of a cache miss, a client’s request is served (i) by an optimal replacement quality (only better quality levels with minimum deviation) from a cache server, (ii) by a quality transcoded from an optimal replacement quality at the edge, or (iii) by the originally requested quality level from the origin server. By means of comprehensive experiments conducted on a real-world large-scale testbed, we demonstrate that CSDN outperforms the state-of-the-art in terms of playback bitrate, the number of quality switches, the number of stalls, and bandwidth usage by at least 7.5%, 19%, 19%, and 63%, respectively.
WISH: User-centric Bitrate Adaptation for HTTP Adaptive Streaming on Mobile D... (Minh Nguyen)
Recently, mobile devices have become paramount in online video streaming. Adaptive bitrate (ABR) algorithms of players responsible for selecting the quality of the videos face critical challenges in providing a high Quality of Experience (QoE) for end users. One open issue is how to ensure the optimal experience for heterogeneous devices in the context of extreme variation of mobile broadband networks. Additionally, end users may have different priorities on video quality and data usage (i.e., the amount of data downloaded to the devices through the mobile networks). A generic mechanism for players that enables specification of various policies to meet end users’ needs is still missing. In this paper, we propose a weighted sum model, namely WISH, that yields high QoE of the video and allows end users to express their preferences among different parameters (i.e., data usage, stall events, and video quality) of video streaming. WISH has been implemented into ExoPlayer, a popular player used in many mobile applications. The experimental results show that WISH improves the QoE by up to 17.6% while saving 36.4% of data usage compared to state-of-the-art ABR algorithms and provides dynamic adaptation to end users’ requirements.
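A weighted-sum selection of the kind described above can be sketched as follows. This is not the WISH cost function: the three penalty terms, their normalization, and the function name are assumptions made for illustration. Only the idea of weighting data usage, stall risk, and video quality according to user preference comes from the abstract.

```python
def choose_bitrate(bitrates_kbps, throughput_kbps, buffer_s, segment_s, weights):
    """Pick the bitrate with the lowest weighted-sum cost.

    `weights` is (w_data, w_stall, w_quality); all three penalty terms are
    hypothetical and normalized to be roughly comparable.
    """
    w_data, w_stall, w_quality = weights
    top = max(bitrates_kbps)
    best, best_cost = None, float("inf")
    for bitrate in sorted(bitrates_kbps):
        download_s = bitrate * segment_s / throughput_kbps
        data_cost = bitrate / top                                  # more data downloaded
        stall_cost = max(0.0, download_s - buffer_s) / segment_s   # download outlasts buffer
        quality_cost = 1.0 - bitrate / top                         # lower quality
        cost = w_data * data_cost + w_stall * stall_cost + w_quality * quality_cost
        if cost < best_cost:
            best, best_cost = bitrate, cost
    return best


ladder = [1000, 2000, 4000]
# A user who prioritizes quality versus one who prioritizes data saving.
print(choose_bitrate(ladder, 3000, 4.0, 4.0, (0.1, 0.5, 0.4)))
print(choose_bitrate(ladder, 3000, 4.0, 4.0, (0.8, 0.1, 0.1)))
```

Changing only the weight tuple changes the selected bitrate, which is the mechanism that lets end users express their preferences.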
High Efficiency Video Coding (HEVC) improves the encoding efficiency by utilizing sophisticated tools such as flexible Coding Tree Unit (CTU) partitioning. The Coding Unit (CU) can be split recursively into four equally sized CUs ranging from 64×64 to 8×8 pixels. At each depth level (or CU size), HEVC performs intra prediction via exhaustive mode search to improve the encoding efficiency, which results in a very high encoding time complexity. This paper proposes an Intra CU Depth Prediction (INCEPT) algorithm, which limits Rate-Distortion Optimization (RDO) for each CTU in HEVC by utilizing the spatial correlation with the neighboring CTUs, which is computed using a DCT energy-based feature. Thus, INCEPT reduces the number of candidate CU sizes that need to be considered for each CTU in HEVC intra coding. Experimental results show that the INCEPT algorithm achieves a better trade-off between the encoding efficiency and encoding time saving (i.e., BDR/∆T) than the benchmark algorithms. While BDR/∆T is 12.35% and 9.03% for the benchmark algorithms, it is 5.49% for the proposed algorithm. As a result, INCEPT achieves a 23.34% reduction in encoding time on average while incurring only a 1.67% increase in bit rate compared to the original coding in the x265 HEVC open-source encoder.
Where to Encode: A Performance Analysis of Intel x86 and Arm-based Amazon EC2... (Alpen-Adria-Universität)
Video streaming has become an integral part of the Internet. To efficiently utilise the limited network bandwidth, it is essential to encode the video content. However, encoding is a computationally intensive task, involving high-performance resources provided by private infrastructures or public clouds. Public clouds, such as Amazon EC2, provide a large portfolio of services and instances optimized for specific purposes and budgets. The majority of Amazon’s instances use x86 processors, such as Intel Xeon or AMD EPYC. However, following the recent trends in computer architecture, Amazon introduced Arm-based instances that promise up to 40% better cost-performance ratio than comparable x86 instances for specific workloads. In this paper, we evaluate the video encoding performance of x86 and Arm instances of four instance families using the latest FFmpeg version and two video codecs. We examine the impact of the encoding parameters, such as different presets and bitrates, on the time and cost for encoding. Our experiments reveal that Arm instances show high time and cost saving potential of up to 33.63% for specific bitrates and presets, especially for the x264 codec. However, the x86 instances are more general and achieve low encoding times, regardless of the codec.
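The time-versus-cost comparison behind these figures reduces to simple arithmetic, sketched below. The hourly prices and encoding times are hypothetical placeholders, not measurements from the paper or actual Amazon EC2 prices.

```python
def encoding_cost_usd(encoding_seconds, price_per_hour_usd):
    """Cost of one encoding job on an instance billed by the hour (pro rata)."""
    return encoding_seconds / 3600 * price_per_hour_usd


def relative_saving(baseline, candidate):
    """Fraction saved by `candidate` relative to `baseline` (time or cost)."""
    return (baseline - candidate) / baseline


# Hypothetical example: the Arm instance is slower but cheaper per hour.
x86_cost = encoding_cost_usd(encoding_seconds=1800, price_per_hour_usd=2.0)
arm_cost = encoding_cost_usd(encoding_seconds=2160, price_per_hour_usd=1.25)
print(x86_cost, arm_cost, relative_saving(x86_cost, arm_cost))
```

A slower instance can still be the cheaper choice, which is why time savings and cost savings have to be reported separately.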
ComplexCTTP: Complexity Class Based Transcoding Time Prediction for Video Seq... (Alpen-Adria-Universität)
HTTP Adaptive Streaming of video content is becoming an integral part of the Internet and accounts for the majority of today’s traffic. Although Internet bandwidth is constantly increasing, video compression technology plays an important role and the major challenge is to select and set up multiple video codecs, each with hundreds of transcoding parameters. Additionally, the transcoding speed depends directly on the selected transcoding parameters and the infrastructure used. Predicting transcoding time for multiple transcoding parameters with different codecs and processing units is a challenging task, as it depends on many factors. This paper provides a novel and considerably fast method for transcoding time prediction using video content classification and neural network prediction. Our artificial neural network (ANN) model predicts the transcoding times of video segments for state-of-the-art video codecs based on transcoding parameters and content complexity. We evaluated our method for two video codecs/implementations (AVC/x264 and HEVC/x265) as part of large-scale HTTP Adaptive Streaming services. The ANN model of our method is able to predict the transcoding time by minimizing the mean absolute error (MAE) to 1.37 and 2.67 for x264 and x265 codecs, respectively. For x264, this is an improvement of 22% compared to the state of the art.
This paper gives an overview of the latest video coding standard, High Efficiency Video Coding (HEVC). It also presents a performance comparison of the two latest video coding standards, H.264/MPEG-AVC and H.265/MPEG-HEVC. According to the experimental results, which were obtained for a whole test set of video sequences using similar encoding configurations, H.265/MPEG-HEVC provides significant average bit-rate savings of around 40%.
Keywords: CABAC, CAVLC, H.264/AVC, HEVC, PSNR, SBAC.
H2B2VS (HEVC hybrid broadcast broadband video services) – Building innovative... (Raoul Monnier)
Broadcast and broadband networks continue to be separate worlds in the video consumption business. Some initiatives such as HbbTV have built a bridge between both worlds, but their application is largely limited to providing links over the broadcast channel to content providers’ applications such as Catch-up TV services. In practice, the user is using either one network or the other.
H2B2VS is a Celtic-Plus project aiming at exploiting the potential of real hybrid networks by implementing efficient synchronization mechanisms and using new video coding standards such as High Efficiency Video Coding (HEVC). The goal is to develop successful hybrid network solutions that enable value-added services with optimum bandwidth usage in each network and with clear commercial applications. An example of the potential of this approach is the transmission of Ultra-HD TV by sending the main content over the broadcast channel and the required complementary information over the broadband network. This technology can also be used to improve the lives of people with disabilities: deaf people receive through the broadband network a sign language translation of a programme sent over the broadcast channel; the TV set then displays this translation in an inset window.
One of the most important contributions of the project is developing and testing synchronization methods between two different networks that offer unequal qualities of service with significant differences in delay and jitter.
In this paper, the main technological project contributions are described, including SHVC, the scalable extension of HEVC and a special focus on the synchronization solution adopted by MPEG and DVB. The paper also presents some of the implemented practical use cases, such as the sign language translation described above, and their performance results so as to evaluate the commercial application of this type of solution.
Requiring only half the bitrate of its predecessor, the new standard – HEVC or H.265 – will significantly reduce the need for bandwidth and expensive, limited spectrum. HEVC (H.265) will enable the launch of new video services and in particular ultra HD television (UHDTV).
State-of-the-art video compression techniques – HEVC/H.265 – can reduce the size of raw video by a factor of about 100 without any noticeable reduction in visual quality. With estimates indicating that compressed real-time video accounts for more than 50 percent of current network traffic, a figure set to rise to 90 percent within a few years, HEVC/H.265 will be a welcome relief for network operators.
New services, devices and changing viewing patterns are among the factors contributing to the growth in video traffic as people watch more and more traditional TV and video-streaming services on their mobile devices.
Ericsson has been heavily involved in the standardization of HEVC since it began in 2010, and this Ericsson Review article highlights some of the contributions that have led to the compression efficiency offered by HEVC.
The latest video compression standard, H.264 (also known as MPEG-4 Part 10/AVC for Advanced Video
Coding), is expected to become the video standard of choice in the coming years.
H.264 is an open, licensed standard that supports the most efficient video compression techniques available
today. Without compromising image quality, an H.264 encoder can reduce the size of a digital video file by
more than 80% compared with the Motion JPEG format and as much as 50% more than with the MPEG-4
Part 2 standard. This means that much less network bandwidth and storage space are required for a video
file. Or seen another way, much higher video quality can be achieved for a given bit rate.
VEED: Video Encoding Energy and CO2 Emissions Dataset for AWS EC2 instances | Alpen-Adria-Universität
Video streaming constitutes 65% of global internet traffic, prompting an investigation into its energy consumption and CO2 emissions. Video encoding, a computationally intensive part of streaming, has moved to cloud computing for its scalability and flexibility. However, cloud data centers’ energy consumption, especially for video encoding, poses environmental challenges. This paper presents VEED, a FAIR Video Encoding Energy and CO2 Emissions Dataset for Amazon Web Services (AWS) EC2 instances. The dataset also contains the duration, CPU utilization, and cost of the encoding. To prepare this dataset, we introduce a model and conduct a benchmark to estimate the energy and CO2 emissions of different Amazon EC2 instances during the encoding of 500 video segments with various complexities and resolutions using Advanced Video Coding (AVC) and High-Efficiency Video Coding (HEVC). VEED and its analysis can provide valuable insights for video researchers and engineers to model energy consumption, manage energy resources, and distribute workloads, contributing to the sustainability of cloud-based video encoding and making it cost-effective. VEED is available on GitHub.
Addressing climate change requires a global decrease in greenhouse gas (GHG) emissions. In today’s digital landscape, video streaming significantly influences internet traffic, driven by the widespread use of mobile devices and the rising popularity of streaming platforms. This trend emphasizes the importance of evaluating energy consumption and the development of sustainable and eco-friendly video streaming solutions with a low Carbon Dioxide (CO2) footprint. We developed a specialized tool, released as an open-source library called GREEM, addressing this pressing concern. This tool measures video encoding and decoding energy consumption and facilitates benchmark tests. It monitors the computational impact on hardware resources and offers various analysis cases. GREEM is helpful for developers, researchers, service providers, and policy makers interested in minimizing the energy consumption of video encoding and streaming.
Optimal Quality and Efficiency in Adaptive Live Streaming with JND-Aware Low ... | Alpen-Adria-Universität
In HTTP adaptive live streaming applications, video segments are encoded at a fixed set of bitrate-resolution pairs known as a bitrate ladder. Live encoders use the fastest available encoding configuration, referred to as a preset, to ensure the minimum possible latency in video encoding. However, an optimized preset and an optimized number of CPU threads for each encoding instance may result in (i) increased quality and (ii) efficient CPU utilization while encoding. For low-latency live encoders, the encoding speed is expected to be greater than or equal to the video framerate. In this light, this paper introduces a Just Noticeable Difference (JND)-Aware Low latency Encoding Scheme (JALE), which uses random forest-based models to jointly determine the optimized encoder preset and thread count for each representation, based on video complexity features, the target encoding speed, the total number of available CPU threads, and the target encoder. Experimental results show that, on average, JALE yields a quality improvement of 1.32 dB PSNR and 5.38 VMAF points at the same bitrate, compared to the fastest preset encoding of the HTTP Live Streaming (HLS) bitrate ladder using the x265 HEVC open-source encoder with eight CPU threads used for each representation. These enhancements are achieved while maintaining the desired encoding speed. Furthermore, on average, JALE results in an overall storage reduction of 72.70%, a reduction in the total number of CPU threads used by 63.83%, and a 37.87% reduction in the overall encoding time, considering a JND of six VMAF points.
In the context of rising environmental concerns, this paper introduces VEEP, an architecture designed to predict energy consumption and CO2 emissions in cloud-based video encoding. VEEP combines video analysis with machine learning (ML)-based energy prediction and real-time carbon intensity, enabling precise estimations of CPU energy usage and CO2 emissions during the encoding process. It is trained on the Video Complexity Dataset (VCD) and encoding results from various AWS EC2 instances. VEEP achieves high accuracy, indicated by an R²-score of 0.96, a mean absolute error (MAE) of 2.41 × 10⁻⁵, and a mean squared error (MSE) of 1.67 × 10⁻⁹. An important finding is the potential to reduce emissions by up to 375 times when comparing cloud instances and their locations. These results highlight the importance of considering environmental factors in cloud computing.
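The link between predicted energy and emissions can be illustrated with the standard relation: emissions equal energy consumed multiplied by the carbon intensity of the electricity grid. The sketch below is not VEEP's model. The function name, the energy figure, and both carbon-intensity values are hypothetical and chosen only to show why the location of a cloud instance matters.

```python
def co2_grams(energy_kwh, carbon_intensity_g_per_kwh):
    """Emissions in grams of CO2 for a given energy use and grid carbon intensity."""
    return energy_kwh * carbon_intensity_g_per_kwh

# Hypothetical encoding job consuming 0.5 kWh, run in two hypothetical regions.
energy_kwh = 0.5
high_carbon_region = co2_grams(energy_kwh, 400.0)  # assumed 400 gCO2/kWh
low_carbon_region = co2_grams(energy_kwh, 25.0)    # assumed 25 gCO2/kWh

# Same job, same energy: the ratio depends only on the grid intensity.
ratio = high_carbon_region / low_carbon_region
print(high_carbon_region, low_carbon_region, ratio)
```

With identical energy use, the emission ratio between two regions equals the ratio of their carbon intensities. Differences in instance energy efficiency multiply on top of that.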
In today’s dynamic streaming landscape, where viewers access content on various devices and encounter fluctuating network conditions, optimizing video delivery for each unique scenario is imperative. Video content complexity analysis, content-adaptive video coding, and multi-encoding methods are fundamental for the success of adaptive video streaming, as they serve crucial roles in delivering high-quality video experiences to a diverse audience. Video content complexity analysis allows us to comprehend the video content’s intricacies, such as motion, texture, and detail, providing valuable insights to enhance encoding decisions. By understanding the content’s characteristics, we can efficiently allocate bandwidth and encoding resources, thereby improving compression efficiency without compromising quality. Content-adaptive video coding techniques built upon this analysis involve dynamically adjusting encoding parameters based on the content complexity. This adaptability ensures that the video stream remains visually appealing and artifacts are minimized, even under challenging network conditions. Multi-encoding methods further bolster adaptive streaming by offering faster encoding of multiple representations of the same video at different bitrates. This versatility reduces computational overhead and enables efficient resource allocation on the server side. Collectively, these technologies empower adaptive video streaming to deliver optimal visual quality and uninterrupted viewing experiences, catering to viewers’ diverse needs and preferences across a wide range of devices and network conditions. Embracing video content complexity analysis, content-adaptive video coding, and multi-encoding methods is essential to meet modern video streaming platforms’ evolving demands and create immersive experiences that captivate and engage audiences. In this light, this dissertation proposes contributions categorized into four classes:
Empowerment of Atypical Viewers via Low-Effort Personalized Modeling of Video... | Alpen-Adria-Universität
Quality of Experience (QoE) and QoE models are of increasing importance to networked systems. The traditional QoE modeling for video streaming applications builds a one-size-fits-all QoE model that underserves atypical viewers who perceive QoE differently. To address the problem of atypical viewers, this paper proposes iQoE (individualized QoE), a method that employs explicit, expressible, and actionable feedback from a viewer to construct a personalized QoE model for this viewer. The iterative iQoE design exercises active learning and combines a novel sampler with a modeler. The chief emphasis of our paper is on making iQoE sample-efficient and accurate.
By leveraging the Microworkers crowdsourcing platform, we conduct studies with 120 subjects who provide 14,400 individual scores. According to the subjective studies, a session of about 22 minutes empowers a viewer to construct a personalized QoE model that, compared to the best of the 10 baseline models, delivers an average accuracy improvement of at least 42% for all viewers and at least 85% for the atypical viewers. The large-scale simulations based on a new technique of synthetic profiling expand the evaluation scope by exploring iQoE design choices, parameter sensitivity, and generalizability.
Optimizing Video Streaming for Sustainability and Quality: The Role of Prese... | Alpen-Adria-Universität
HTTP Adaptive Streaming (HAS) methods divide a video into smaller segments, encoded at multiple pre-defined bitrates to construct a bitrate ladder. Bitrate ladders are usually optimized per title over several dimensions, such as bitrate, resolution, and framerate. This paper adds a new dimension to the bitrate ladder by considering the energy consumption of the encoding process. Video encoders often have multiple pre-defined presets to balance the trade-off between encoding time, energy consumption, and compression efficiency. Faster presets disable certain coding tools defined by the codec to reduce the encoding time at the cost of reduced compression efficiency. Firstly, this paper evaluates the energy consumption and compression efficiency of different x265 presets for 500 video sequences. Secondly, optimized presets are selected for various representations in a bitrate ladder based on the results to guarantee a minimal drop in video quality while saving energy. Finally, a new per-title model, which optimizes the trade-off between compression efficiency and energy consumption, is proposed. The experimental results show that decreasing the VMAF score by 0.15 and 0.39 while choosing an optimized preset results in encoding energy savings of 70% and 83%, respectively.
Energy-Efficient Multi-Codec Bitrate-Ladder Estimation for Adaptive Video Str... | Alpen-Adria-Universität
With the emergence of multiple modern video codecs, streaming service providers are forced to encode, store, and transmit bitrate ladders of multiple codecs separately, consequently suffering from additional energy costs for encoding, storage, and transmission.
To tackle this issue, we introduce an online energy-efficient Multi-Codec Bitrate ladder Estimation scheme (MCBE) for adaptive video streaming applications. In MCBE, quality representations within the bitrate ladder of new-generation codecs (e.g., HEVC, AV1) that lie below the predicted rate-distortion curve of the AVC codec are removed. Moreover, perceptual redundancy between representations of the bitrate ladders of the considered codecs is also minimized based on a Just Noticeable Difference (JND) threshold. To this end, random forest-based models predict the VMAF of bitrate ladder representations of each codec. In a live streaming session where all clients support the decoding of AVC, HEVC, and AV1, MCBE achieves impressive results, reducing cumulative encoding energy by 56.45%, storage energy usage by 94.99%, and transmission energy usage by 77.61% (considering a JND of six VMAF points). These energy reductions are in comparison to a baseline bitrate ladder encoding based on current industry practice.
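The idea of removing perceptually redundant representations with a JND threshold can be sketched in a few lines. This is a simplified single-codec illustration, not the MCBE algorithm itself. The function name, the greedy selection rule, and the example ladder values are assumptions made for the example.

```python
def prune_by_jnd(ladder, jnd=6.0):
    """Keep a representation only if its predicted VMAF exceeds the
    previously kept one by at least `jnd` points.

    `ladder` is a list of (bitrate_kbps, predicted_vmaf) pairs.
    """
    kept = []
    for bitrate, vmaf in sorted(ladder):  # ascending bitrate
        if not kept or vmaf - kept[-1][1] >= jnd:
            kept.append((bitrate, vmaf))
    return kept

# Hypothetical ladder: (bitrate in kbps, predicted VMAF).
ladder = [(145, 30), (300, 34), (600, 45), (900, 49), (1600, 60), (2400, 64)]
print(prune_by_jnd(ladder, jnd=6.0))
# [(145, 30), (600, 45), (1600, 60)]
```

Representations that differ from their kept neighbor by less than one JND are assumed to look the same to viewers, so dropping them saves encoding, storage, and transmission effort without a visible quality step being lost.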
Machine Learning Based Resource Utilization Prediction in the Computing Conti... | Alpen-Adria-Universität
This paper presents UtilML, a novel approach for tackling resource utilization prediction challenges in the computing continuum. UtilML leverages Long Short-Term Memory (LSTM) neural networks, a machine learning technique, to forecast resource utilization accurately. The effectiveness of UtilML is demonstrated through its evaluation on data extracted from a real GPU cluster in a computing continuum infrastructure comprising more than 1800 computing devices. To assess the performance of UtilML, we compared it with two related approaches that utilize a Baseline-LSTM model. Furthermore, we analyzed the LSTM results against User-Predicted values provided by GPU cluster owners for task deployment with estimated allocation values. The results indicate that UtilML outperformed user predictions by 2% to 27% for CPU utilization prediction. For memory prediction, UtilML variants excelled, showing improvements of 17% to 20% compared to user predictions.
The exponential growth of computer game streaming has led to the development of Quality of Experience (QoE) metrics to evaluate user satisfaction and enjoyment during online gameplay and live streaming. Adaptive Bitrate (ABR) streaming is a recent technology that has been suggested to improve QoE. This method enhances the streaming experience, upholds visual quality, minimizes stall events, and boosts player retention. It achieves this by estimating network bottlenecks and selecting appropriate versions of the content that best match the available bandwidth rather than adjusting encoding parameters. To investigate the correlation between quality switching and stall events, a subjective test was conducted separately and comparatively with 71 participants. For more detailed and in-depth research, video games were analyzed with the Video Complexity Analyzer (VCA) tool and divided into three categories of different genres, camera view, and temporal complexity heatmap from the two sets of normal and action scenes. This study seeks to shed light on three unresolved issues pertinent to QoE in game streaming: (i) the user preferences towards quality switching and stall events across varied scenes and games, (ii) the user inclinations towards either a single, prolonged stall event or multiple, shorter stall events, and (iii) the impact of conspicuous quality switching on the user’s QoE. Results from the study provided valuable insights, both qualitatively and quantitatively. The study found a marked preference among users for quality switching over stall events across all types of game streaming, irrespective of the scene’s intensity. Furthermore, it was observed that multiple short-stall events were generally favored over a single long-stall event in streaming first-person shooting games. 
Interestingly, approximately half of the participants remained oblivious to quality switching during their game viewing sessions, and among those who noticed a change in quality, the alteration did not significantly impact their perceived QoE.
Network-Assisted Delivery of Adaptive Video Streaming Services through CDN, S... | Alpen-Adria-Universität
Multimedia applications, mainly video streaming services, are currently the dominant source of network load worldwide. In recent Video-on-Demand (VoD) and live video streaming services, traditional streaming delivery techniques have been replaced by adaptive solutions based on the HTTP protocol. Current trends toward high-resolution (e.g., 8K) and/or low-latency VoD and live video streaming pose new challenges to end-to-end (E2E) bandwidth demand and have stringent delay requirements. To meet these demands, video providers typically rely on Content Delivery Networks (CDNs) to ensure that they provide scalable video streaming services. To support future streaming scenarios involving millions of users, it is necessary to increase the CDNs’ efficiency. It is widely agreed that these requirements may be satisfied by adopting emerging networking techniques to present Network-Assisted Video Streaming (NAVS) methods. Motivated by this, this thesis goes one step beyond traditional pure client-based HAS algorithms by incorporating (an) in-network component(s) with a broader view of the network to present completely transparent NAVS solutions for HAS clients.
In recent years, video streaming traffic has become the dominant service over mobile networks. The two main reasons for the growth of video streaming traffic are the improved capabilities of mobile devices and the emergence of HTTP Adaptive Streaming (HAS). Hence, there is a demand for new technologies to cope with the increasing traffic load while improving clients’ Quality of Experience (QoE). The network plays a crucial role in the video streaming process. One of the key technologies on the network side is Multi-access Edge Computing (MEC), which has several key characteristics: computing power, storage, proximity to the clients and access to network and player metrics. Thus, it is possible to deploy mechanisms at the MEC node that assist video streaming.
This thesis investigates how MEC capabilities can be leveraged to support video streaming delivery, specifically to improve the QoE, reduce latency or increase storage and bandwidth savings.
In the last decades, video streaming has developed significantly. Among current technologies, HTTP Adaptive Streaming (HAS) is considered the de-facto approach in multimedia transmission over the internet. In HAS, the video is split into temporal segments with the same duration (e.g., 4s), each of which is then encoded into different quality versions and stored at servers. The end user sends requests to the server to retrieve segments with specific quality versions determined by an Adaptive Bitrate (ABR) algorithm in order to adapt to throughput fluctuations. Though the majority of HAS-based media services function well even under throughput restrictions and variations, there are still significant challenges for multimedia systems, especially the tradeoff among the increasing content complexity, various time-related requirements, and Quality of Experience (QoE). Content complexity encompasses the increased demands for data, such as high-resolution videos and high frame rates, as well as novel content formats, such as virtual reality (VR) and augmented reality (AR). Time-related requirements include – but are not limited to – start-up delay and end-to-end latency. QoE can be defined as the level of satisfaction or frustration experienced by the user of an application or service. Optimizing for one aspect usually negatively impacts at least one of the other two aspects. This thesis tackles critical open research questions in the context of HAS that significantly impact the QoE at the client side.
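The role of the ABR algorithm described above can be illustrated with the simplest family of such algorithms, a throughput-based rule. The sketch below is a generic example rather than a method from the thesis. The function name, the safety factor, and the ladder values are assumptions.

```python
def select_bitrate(ladder_kbps, throughput_kbps, safety=0.8):
    """Pick the highest bitrate that fits within a safety margin of the
    measured throughput; fall back to the lowest bitrate otherwise."""
    budget = throughput_kbps * safety
    candidates = [b for b in ladder_kbps if b <= budget]
    return max(candidates) if candidates else min(ladder_kbps)

# Hypothetical bitrate ladder (kbps) and throughput measurements (kbps).
ladder = [500, 1000, 2500, 5000]
for measured in (300, 1500, 4000, 9000):
    print(measured, "->", select_bitrate(ladder, measured))
```

A client would call such a function before each segment request, using the throughput observed while downloading the previous segments. Practical ABR algorithms usually also take the playback buffer level into account to avoid stalls.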
VE-Match: Video Encoding Matching-based Model for Cloud and Edge Computing In... | Alpen-Adria-Universität
The considerable surge in energy consumption within data centers can be attributed to the exponential rise in demand for complex computing workflows and storage resources. Video streaming applications are both compute and storage-intensive and account for the majority of today’s internet services. In this work, we designed a video encoding application consisting of codec, bitrate, and resolution set for encoding a video segment. Then, we propose VE-Match, a matching-based method to schedule video encoding applications on both Cloud and Edge resources to optimize costs and energy consumption. Evaluation results on a real computing testbed federated between Amazon Web Services (AWS) EC2 Cloud instances and the Alpen-Adria University (AAU) Edge server reveal that VE-Match reduces costs by 17%-78% in the cost-optimized scenarios compared to the energy-optimized scenarios and the tradeoff between cost and energy. Moreover, VE-Match reduces the video encoding energy consumption by 38%-45% and gCO2 emissions by up to 80% in the energy-optimized scenarios compared to the cost-optimized scenarios and the tradeoff between cost and energy.
Energy Consumption in Video Streaming: Components, Measurements, and Strategies | Alpen-Adria-Universität
The rapid growth of video streaming usage is a significant source of energy consumption, driven by improved internet connections and service offerings, the quick development of video entertainment, the deployment of Ultra High-Definition, Virtual and Augmented Reality, as well as an increasing number of video surveillance and IoT applications. To address this challenge, it is essential to understand the various components involved in energy consumption during video streaming, ranging from video encoding to decoding and displaying the video on the end user’s screen. Then, it is critical to measure energy consumption for each component accurately and conduct an in-depth analysis to develop energy-efficient strategies that optimize video streaming [1, 2, 3]. These components are classified into three categories [4]: (i) data centers, which include encoding, packaging, and storage on cloud data centers; (ii) networks, which include core network and access networks; and (iii) end-user devices which involve decoding, players, hardware, etc.
In addition to identifying the primary components of video streaming that affect energy consumption, it is important to conduct a comprehensive analysis of the entire video streaming chain. It is also essential to balance energy optimization and service quality to ensure that energy-efficient strategies are implemented without sacrificing the quality of video streaming services.
This talk aims to provide insights into the components of video streaming that contribute to energy consumption and highlight the challenges associated with measuring their energy usage. I will also introduce the tools that can be used for energy measurements for those components and the possible and associated strategies that lie within energy efficiency. By accurately measuring energy consumption, digital media companies can effectively monitor and control their energy usage, ultimately leading to cost savings and improved sustainability.
Exploring the Energy Consumption of Video Streaming: Components, Challenges, ... | Alpen-Adria-Universität
The rapid growth of video streaming usage is a significant source of energy consumption, driven by improved internet connections and service offerings, the quick development of video entertainment, the deployment of Ultra High-Definition, Virtual and Augmented Reality, as well as an increasing number of video surveillance and IoT applications. However, it is essential to note that these advancements come at the cost of energy consumption. To address this challenge, it is essential to understand the various components involved in energy consumption during video streaming, ranging from video encoding to decoding and displaying the video on the end user’s screen. Then, it is critical to accurately measure energy consumption for each component and conduct an in-depth analysis to develop energy-efficient strategies that optimize video streaming. I categorize these components into three categories: (i) data centers, (ii) networks, and (iii) end-user devices.
In this talk, my objective is to provide insights into the components of video streaming that contribute to energy consumption and highlight the challenges associated with measuring their energy usage. I will also introduce the tools that can be used for energy measurements for those components and the possible and associated strategies that lie within energy efficiency. By accurately measuring energy consumption, digital media companies can effectively monitor and control their energy usage, ultimately leading to cost savings and improved sustainability.
Video Coding Enhancements for HTTP Adaptive Streaming Using Machine Learning | Alpen-Adria-Universität
Video is evolving into a crucial tool as daily lives are increasingly centered around visual communication. The demand for better video content is constantly rising, from entertainment to business meetings. The delivery of video content to users is of utmost significance. HTTP adaptive streaming, in which the video content adjusts to the changing network circumstances, has become the de-facto method for delivering internet video.
As video technology continues to advance, it presents a number of challenges, one of which is the large amount of data required to describe a video accurately. To address this issue, it is necessary to have a powerful video encoding tool. Historically, these efforts have relied on hand-crafted tools and heuristics. However, with the recent advances in machine learning, there has been increasing exploration into using these techniques to enhance video coding performance.
This thesis proposes eight contributions that enhance video coding performance for HTTP adaptive streaming using machine learning.
Optimizing QoE and Latency of Live Video Streaming Using Edge Computing a... | Alpen-Adria-Universität
Nowadays, HTTP Adaptive Streaming (HAS) has become the de-facto standard for delivering video over the Internet. More users have started generating and delivering high-quality live streams (usually 4K resolution) through popular online streaming platforms, resulting in a rise in live streaming traffic. Typically, the video contents are generated by streamers and watched by many audiences, geographically distributed in various locations far away from the streamers. The resource limitation in the network (e.g., bandwidth) is a challenging issue for network and video providers to meet the users’ requested quality. This dissertation leverages edge computing capabilities and in-network intelligence to design, implement, and evaluate approaches to optimize Quality of Experience (QoE) and end-to-end (E2E) latency of live HAS. In addition, improving transcoding performance and optimizing the cost of running live HAS services and the network’s backhaul utilization are considered. Motivated by the mentioned issue, the dissertation proposes five contributions in two classes: optimizing resource utilization and light-weight transcoding.
SARENA: SFC-Enabled Architecture for Adaptive Video Streaming Applications | Alpen-Adria-Universität
5G and 6G networks are expected to support various novel emerging adaptive video streaming services (e.g., live, VoD, immersive media, and online gaming) with versatile Quality of Experience (QoE) requirements such as high bitrate, low latency, and sufficient reliability. It is widely agreed that these requirements can be satisfied by adopting emerging networking paradigms like Software-Defined Networking (SDN), Network Function Virtualization (NFV), and edge computing. Previous studies have leveraged these paradigms to present network-assisted video streaming frameworks, but mostly in isolation without devising chains of Virtualized Network Functions (VNFs) that consider the QoE requirements of various types of Multimedia Services (MS). To bridge the aforementioned gaps, we first introduce a set of multimedia VNFs at the edge of an SDN-enabled network and form diverse Service Function Chains (SFCs) based on the QoE requirements of different MS services. We then propose SARENA, an SFC-enabled ArchitectuRe for adaptive VidEo StreamiNg Applications. Next, we formulate the problem as a central scheduling optimization model executed at the SDN controller. We also present a lightweight heuristic solution consisting of two phases that run on the SDN controller and edge servers to alleviate the time complexity of the optimization model in large-scale scenarios. Finally, we design a large-scale cloud-based testbed, including 250 HTTP Adaptive Streaming (HAS) players requesting two popular MS applications (i.e., live and VoD), conduct various experiments, and compare its effectiveness with baseline systems. Experimental results illustrate that SARENA outperforms baseline schemes in terms of users’ QoE by at least 39.6%, latency by 29.3%, and network utilization by 30% in both MS services.
DevOps and Testing slides at DASA Connect | Kari Kakkonen
Slides by Rik Marselis and me from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We ended with a lovely workshop in which the participants tried to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
Search and Society: Reimagining Information Access for Radical Futures | Bhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies need to be explicitly articulated, and we need to develop theories of change in the context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024 | Tobias Schneck
As AI technology is pushing into IT, I was wondering, as an “infrastructure container kubernetes guy”, how this fancy AI technology gets managed from an infrastructure operations view. Is it possible to apply our lovely cloud native principles as well? What benefits could the two technologies bring to each other?
Let me take these questions and provide you with a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premise strategy we may need to apply it to our own infrastructure and get it to work from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies that could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into which approaches I have already got working for real.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
JMeter webinar - integration with InfluxDB and Grafana
Press Release of 131st WG11 (MPEG) Meeting
ISO/IEC JTC 1/SC 29/WG 11 N19387
ISO/IEC JTC 1/SC 29/WG 11
Coding of moving pictures and audio
Convenorship: Japan (JISC)
Document type: Press Release
Title: Press Release of 131st WG11 Meeting
Status: Approved
Date of document: 2020-04-24
Source: Convenor
Expected action: INFO
No. of pages: 10
Email of convenor: ostermann@tnt.uni-hannover.de
Committee URL: https://isotc.iso.org/livelink/livelink/open/jtc1sc29wg11
INTERNATIONAL ORGANISATION FOR STANDARDISATION
ORGANISATION INTERNATIONALE DE NORMALISATION
ISO/IEC JTC 1/SC 29/WG 11
CODING OF MOVING PICTURES AND AUDIO
ISO/IEC JTC 1/SC 29/WG 11 N19387
Online Meeting – July 2020
Source: Convenor of ISO/IEC JTC 1/SC 29/WG 11 (MPEG)
Status: Approved by WG11
Subject: WG 11 (MPEG) Press Release
Date: 3 July 2020
WG11 (MPEG) Announces VVC – the Versatile Video Coding Standard
The 131st WG 11 (MPEG) meeting was held online, 29 June – 3 July 2020
Table of Contents:
WG11 (MPEG) ANNOUNCES VVC – THE VERSATILE VIDEO CODING STANDARD
POINT CLOUD COMPRESSION – WG11 (MPEG) PROMOTES A VIDEO-BASED POINT CLOUD COMPRESSION TECHNOLOGY TO THE FDIS STAGE
MPEG-H 3D AUDIO – WG11 (MPEG) PROMOTES BASELINE PROFILE FOR 3D AUDIO TO FINAL STAGE
CALL FOR PROPOSALS ON TECHNOLOGIES FOR MPEG-21 CONTRACTS TO SMART CONTRACTS CONVERSION
WG11 (MPEG) ISSUES A CALL FOR PROPOSALS ON EXTENSION AND IMPROVEMENTS TO ISO/IEC 23092 STANDARD SERIES
WIDENING SUPPORT FOR STORAGE AND DELIVERY OF MPEG-5 EVC
MULTI-IMAGE APPLICATION FORMAT ADDS SUPPORT OF HDR
CARRIAGE OF GEOMETRY-BASED POINT CLOUD DATA PROGRESSES TO COMMITTEE DRAFT
MPEG IMMERSIVE VIDEO (MIV) PROGRESSES TO COMMITTEE DRAFT
NEURAL NETWORK COMPRESSION FOR MULTIMEDIA APPLICATIONS – WG11 (MPEG) PROGRESSES TO COMMITTEE DRAFT
WG11 (MPEG) ISSUES COMMITTEE DRAFT OF CONFORMANCE AND REFERENCE SOFTWARE FOR ESSENTIAL VIDEO CODING (EVC)
WEBINAR: WHAT’S NEW IN MPEG?
HOW TO CONTACT WG 11 (MPEG) AND FURTHER INFORMATION
WG11 (MPEG) Announces VVC – the Versatile Video Coding Standard
WG11 (MPEG) is pleased to announce the completion of the new Versatile Video Coding
(VVC) standard at its 131st meeting. The document has been progressed to its final approval
ballot as ISO/IEC 23090-3 and will also be known as H.266 in the ITU-T.
VVC is the latest in a series of very successful standards for video coding that have been jointly
developed with ITU-T, and it is the direct successor to the well-known and widely used HEVC
(Rec. ITU-T H.265 | ISO/IEC 23008-2) and AVC (Rec. ITU-T H.264 | ISO/IEC 14496-10)
standards. VVC provides a major benefit in compression over HEVC. Plans are underway to
conduct a verification test with formal subjective testing to confirm that VVC achieves an
estimated 50% bit rate reduction versus HEVC for equal subjective video quality. Test results
have already demonstrated that VVC typically provides about a 40% bit rate reduction for
4K/UHD video sequences in tests using objective metrics. Application areas especially
targeted for the use of VVC include ultra-high definition 4K and 8K video, video with a high
dynamic range and wide colour gamut, and video for immersive media applications such as
360° omnidirectional video. Conventional standard-definition and high-definition video
content are also supported with similar gains in compression. In addition to improving coding
efficiency, VVC also provides highly flexible syntax supporting such use cases as subpicture
bitstream extraction, bitstream merging, temporal sublayering and layered coding scalability.
The VVC standard includes the specification of six profiles to serve the needs of industry in a
wide variety of applications. These include the “Main 10” profile that supports 8- and 10-bit
4:2:0 video, the “Main 10 4:4:4” profile with 4:4:4 and 4:2:2 format support, corresponding
“Multilayer Main 10” and “Multilayer Main 10 4:4:4” profiles with support for layered coding,
and the “Main 10 Still Picture” and “Main 10 4:4:4 Still Picture” profiles for still image coding
employing the same coding tools as in the corresponding video profiles.
MPEG also announces completion of ISO/IEC 23002-7 “Versatile supplemental enhancement
information for coded video bitstreams” (VSEI), developed jointly with ITU-T as Rec. ITU-T
H.274. The new VSEI standard specifies the syntax and semantics of video usability
information (VUI) parameters and supplemental enhancement information (SEI) messages
for use with coded video bitstreams. VSEI is especially intended for use with VVC, although it
is drafted to be generic and flexible so that it may also be used with other types of coded
video bitstreams. Once specified in VSEI, different video coding standards and systems-
environment specifications can re-use the same SEI messages without the need for defining
special-purpose data customized to the specific usage context.
Point Cloud Compression – WG11 (MPEG) promotes a Video-based Point Cloud
Compression Technology to the FDIS stage
At its 131st meeting, WG11 (MPEG) promoted its Video-based Point Cloud Compression (V-PCC)
standard to Final Draft International Standard (FDIS) stage. V-PCC addresses lossless and
lossy coding of 3D point clouds with associated attributes such as colors and reflectance. Point
clouds are typically represented by extremely large amounts of data, which is a significant
barrier for mass market applications. However, the relative ease to capture and render spatial
information as point clouds compared to other volumetric video representations makes point
clouds increasingly popular to present immersive volumetric data. With the current V-PCC
encoder implementation providing a compression in the range of 100:1 to 300:1, a dynamic
point cloud of one million points could be encoded at 8 Mbit/s with good perceptual quality.
Real-time decoding and rendering of V-PCC bitstreams has also been demonstrated on
current mobile hardware.
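The figures quoted above can be sanity-checked with a little arithmetic. The sketch below is only a back-of-the-envelope illustration: the frame rate (30 frames per second) and the per-point precision (10 bits per coordinate, 8 bits per colour component) are assumptions chosen for the example and are not stated in the press release.

```python
# Back-of-the-envelope check of the V-PCC figures quoted in the text.
# ASSUMPTIONS (not from the press release): 30 frames per second,
# 10 bits per x/y/z coordinate and 8 bits per R/G/B colour component.

POINTS_PER_FRAME = 1_000_000       # from the text: "one million points"
ENCODED_BITRATE_BPS = 8_000_000    # from the text: 8 Mbit/s

FRAMES_PER_SECOND = 30             # assumption
BITS_PER_POINT = 3 * 10 + 3 * 8    # assumption: geometry + colour = 54 bits


def raw_bitrate_bps(points, bits_per_point, fps):
    """Uncompressed bit rate of a dynamic point cloud in bit/s."""
    return points * bits_per_point * fps


def compression_ratio(raw_bps, encoded_bps):
    """Ratio of uncompressed to compressed bit rate."""
    return raw_bps / encoded_bps


raw = raw_bitrate_bps(POINTS_PER_FRAME, BITS_PER_POINT, FRAMES_PER_SECOND)
ratio = compression_ratio(raw, ENCODED_BITRATE_BPS)

print(f"raw bit rate: {raw / 1e9:.2f} Gbit/s")
print(f"compression ratio: {ratio:.1f}:1")
```

Under these assumptions the uncompressed stream is about 1.62 Gbit/s, so 8 Mbit/s corresponds to roughly 200:1, which sits inside the quoted 100:1 to 300:1 range. Other frame rates or attribute precisions would shift the ratio accordingly.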
The V-PCC standard leverages video compression technologies and the video eco-system in
general (hardware acceleration, transmission services and infrastructure), while enabling new
kinds of applications. The V-PCC standard contains several profiles that leverage existing AVC
and HEVC implementations, which may make them suitable to run on existing and emerging
platforms. The standard is also extensible to upcoming video specifications such as Versatile
Video Coding (VVC) and Essential Video Coding (EVC).
The V-PCC standard is based on Visual Volumetric Video-based Coding (V3C), which is
expected to be re-used by other MPEG-I volumetric codecs under development. MPEG is also
developing a standard for carriage of V-PCC and V3C data (ISO/IEC 23090-10) which has been
promoted to DIS status at the 130th MPEG meeting.
By providing high-level immersiveness at currently available bandwidths, the V-PCC standard
is expected to enable several types of applications and services such as six Degrees of
Freedom (6 DoF) immersive media, virtual reality (VR) / augmented reality (AR), immersive
real-time communication and cultural heritage.
MPEG-H 3D Audio – WG11 (MPEG) promotes Baseline Profile for 3D Audio to final stage
At its 131st meeting, WG11 (MPEG) announces the completion of the new ISO/IEC 23008-
3:2019, Amendment 2, "3D Audio Baseline profile, Corrections and Improvements," which
has been promoted to Final Draft Amendment (FDAM) status. This amendment introduces a
new profile called Baseline profile addressing industry demands. Tailored for broadcast,
streaming, and high-quality immersive music delivery use cases, the 3D Audio Baseline profile
supports channel and object signals and is a subset of the existing Low Complexity profile.
The 3D Audio Baseline profile can be signaled in a backwards compatible fashion, enabling
interoperability with existing devices implementing the 3D Audio Low Complexity profile. In
addition to its advanced loudness and Dynamic Range Control (DRC), interactivity and
accessibility features, the Baseline profile enables the usage of up to 24 audio objects in Level
3 for high quality immersive music delivery.
At the same time, MPEG initiates New Editions at Committee Draft (CD) status for MPEG-H
3D Audio Reference Software and Conformance which incorporate the 3D Audio Baseline
profile functionality.
In addition to finalizing the Amendment, WG11 made available the “MPEG-H 3D Audio
Baseline Profile Verification Test Report”. This reports on the results of five subjective
listening tests assessing the performance of the 3D Audio Baseline profile. Covering a wide
range of bit rates and immersive audio use cases, the tests were conducted in nine different
test sites with a total of 341 listeners.
Analysis of the test data resulted in the following conclusions:
Test 1 measured performance for the “Ultra-HD Broadcast” use case, in which highly
immersive audio material was coded at 768 kb/s and presented using 22.2 or 7.1+4H
channel loudspeaker layouts. The test showed that at the bit rate of 768 kb/s, the 3D
Audio Baseline Profile easily achieves “ITU-R High-Quality Emission” quality, as
needed in broadcast applications.
Test 2 measured performance for the “HD Broadcast” or “A/V Streaming” use case, in
which immersive audio material was coded at three bit rates: 512 kb/s, 384 kb/s and
256 kb/s and presented using 7.1+4H or 5.1+2H channel loudspeaker layouts. The test
showed that for all bit rates, the 3D Audio Baseline Profile achieved a quality of
“Excellent” on the MUSHRA subjective quality scale.
Test 3 measured performance for the “High Efficiency Broadcast” use case, in which
audio material was coded at three bit rates, with specific bit rates depending on the
number of channels in the material. Bitrates ranged from 256 kb/s (5.1+2H) to 48 kb/s
(stereo). The test showed that for all bit rates, the 3D Audio Baseline Profile achieved
a quality of “Excellent” on the MUSHRA subjective quality scale.
Test 4 measured performance for the “Mobile” use case, in which immersive audio
material was coded at 384 kb/s, and presented via headphones. The 3D Audio FD
binaural renderer was used to render a virtual, immersive audio sound stage for the
headphone presentation. The test showed that at 384 kb/s, the 3D Audio Baseline
Profile with binaural rendering achieved a quality of “Excellent” on the MUSHRA
subjective quality scale.
Test 5 measured performance for the "High Quality Immersive Music Delivery" use
case in which object based immersive music is delivered to the receiver with up to 24
objects at high per object bit rates. This test used 11.1 (as 7.1+4H) as presentation
format, with material coded at a rate of 1536 kb/s. The test showed that at that bit
rate, the 3D Audio Baseline Profile easily achieves "ITU-R High-Quality Emission"
quality, as needed in high quality music delivery applications.
Call for Proposals on Technologies for MPEG-21 Contracts to Smart Contracts Conversion
In the last few years, WG11 (MPEG) has developed a number of standardized ontologies
catering to the needs of the music and media industry with respect to codification of
Intellectual Property Rights (IPR) information toward the fair trade of music and media. MPEG
IPR ontologies and contract expression languages have been developed under the MPEG-21
Multimedia Framework (ISO/IEC 21000) family of standards. MPEG IPR ontologies and
contracts can be used by music and media value chain stakeholders to share and exchange in
an interoperable way all metadata and contractual information. However, a challenge has
been identified: how can MPEG IPR ontologies and contracts be converted to smart
contracts that can be executed on existing blockchain environments, thereby enriching
those environments with the inference and reasoning capabilities inherently associated with
ontologies? Addressing this challenge in a standard way for several smart contract
languages would also ensure that MPEG IPR ontologies and contracts prevail as the
interlingua for transferring verified contractual data from one blockchain to another.
At its 131st meeting, MPEG issued a Call for Proposals (CfP) on technologies for MPEG-21 IPR
contracts to smart contracts conversion. All parties that believe they have relevant
technologies are invited to submit proposals for consideration by MPEG. These parties do not
necessarily have to be MPEG members. The review of the submissions is planned in the
context of the 132nd MPEG meeting. Please contact Jörn Ostermann
(ostermann@tnt.uni-hannover.de) for details on attending this meeting if you are not an MPEG delegate.
WG11 (MPEG) issues a Call for Proposals on extension and improvements to
ISO/IEC 23092 standard series
The current MPEG-G standard series (ISO/IEC 23092) is the first generation of MPEG
standards that address the representation, compression, and transport of genome
sequencing data, supporting with a single unified approach data from the output of
sequencing machines up to secondary and tertiary analysis. New technology for compressing
and indexing a wide variety of annotation data is currently in an advanced standardization
phase.
In line with the traditional MPEG practice of investigating and applying whenever possible
improvements to the performance and functionality of its standards, at its 131st meeting,
MPEG has issued a Call for Proposals (CfP) addressing two specific objectives: (i) to increase
the speed performance of massively parallel codec implementations and (ii) to enable
advanced queries and search capabilities on the compressed data.
Answers to the CfP are expected to be evaluated prior to the 132nd MPEG meeting. The
best-performing technologies are expected to be introduced in a new high-performance profile of
the current ISO/IEC 23092 standard series.
Widening support for storage and delivery of MPEG-5 EVC
At its 131st meeting, WG11 (MPEG) widened the support for storage and delivery of MPEG-5
Essential Video Coding (EVC; ISO/IEC 23094-1).
1. One of the oldest but most popular MPEG standards for content delivery, MPEG-2
Systems (ISO/IEC 13818-1), is adding support for EVC. WG11 (MPEG) promoted the 3rd
amendment to the 2019 edition of the MPEG-2 Systems standard to the Committee
Draft of Amendment stage, the first milestone of the ISO standard development
process. It is entitled Carriage of EVC in MPEG-2 TS and update of the MPEG-H 3D
Audio descriptor and defines all of the necessary descriptors and the T-STD
model extension to carry MPEG-5 EVC elementary streams.
2. Recognizing that the use of video coding standards for still picture applications is rapidly
growing in the market, WG11 (MPEG) promoted the 3rd amendment to the Image File
Format to the Committee Draft of Amendment stage, the first milestone of the ISO
standard development process. It is entitled Support for VVC, EVC, slideshows and
other improvements and adds support for the most advanced video coding
standard, Versatile Video Coding (VVC), as well as EVC, to provide a complete list of choices
to markets whose requirements vary widely.
It is currently expected that both standards will reach their final milestone by mid-2021.
Multi-Image Application Format adds support of HDR
Less than two years after reaching the last milestone of its standard development,
the Multi-Image Application Format (MIAF; ISO/IEC 23000-22) has become the default format
for the storage of still pictures on smartphones. However, it lacks support for one
of the killer features for image quality enhancement, i.e., High Dynamic Range (HDR). To
quickly answer this market need, WG11 (MPEG) has promoted the 2nd Amendment to the
Multi-Image Application Format, MIAF HEVC Advanced HDR profile and other clarifications,
to its first milestone in the ISO standard development process. This amendment adds support for
the PQ (Perceptual Quantizer) and HLG (Hybrid Log Gamma) color transfer characteristics and
P3 mastering display color volume properties with D65 white point for HEVC encoded still
pictures to support widely used HDR technologies. It is currently expected that the standard
will reach its final milestone by mid-2021.
Carriage of Geometry-based Point Cloud Data progresses to Committee Draft
At its 131st meeting, WG11 (MPEG) promoted the carriage of Geometry-based point cloud
data (ISO/IEC 23090-18) to the Committee Draft stage, the first milestone of the ISO standard
development process. This standard is the second standard introducing the support of
volumetric media in the industry-famous ISO base media file format (ISOBMFF) family of
standards, after the standard on the carriage of video-based point cloud data (ISO/IEC
23090-10). This standard (i.e., ISO/IEC 23090-18) supports the carriage of point cloud data within
multiple file format tracks in order to support individual access to each attribute comprising
a single point cloud. Additionally, it also allows the carriage of point cloud data in one file
format track for simple applications. Because point cloud data can cover a large
geographical area and the data can be massive in some applications, the standard
supports 3D region-based partial access to the data stored in the file, so that an application
can efficiently access only the portion of data it needs to process. It is currently expected
that the standard will reach its final milestone by mid-2021.
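The idea behind 3D region-based partial access can be illustrated with a simple bounding-box lookup. The sketch below is purely conceptual: the `Box3D` class, the `regions_to_fetch` helper and the in-memory region index are hypothetical names invented for this example and do not reflect the actual file format syntax of ISO/IEC 23090-18.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box3D:
    """Axis-aligned box given by its minimum and maximum corners."""
    min_xyz: tuple
    max_xyz: tuple

    def intersects(self, other):
        # Two boxes overlap only if their extents overlap on every axis.
        return all(
            self.min_xyz[i] <= other.max_xyz[i]
            and other.min_xyz[i] <= self.max_xyz[i]
            for i in range(3)
        )


def regions_to_fetch(region_index, query):
    """Return the ids of stored regions that overlap the query box."""
    return [rid for rid, box in region_index.items() if box.intersects(query)]


# Hypothetical index: region id -> bounding box of the points it holds.
region_index = {
    "r0": Box3D((0, 0, 0), (100, 100, 50)),
    "r1": Box3D((100, 0, 0), (200, 100, 50)),
    "r2": Box3D((0, 100, 0), (100, 200, 50)),
}

# Only the part of the scene the application currently needs.
viewport = Box3D((150, 10, 0), (180, 40, 20))
print(regions_to_fetch(region_index, viewport))
```

With such an index, an application reads only the regions that overlap its area of interest instead of the whole point cloud, which is the benefit the text describes.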
MPEG Immersive Video (MIV) progresses to Committee Draft
At the 131st MPEG meeting, it was decided to output the committee draft of ISO/IEC 23090-
12 MPEG Immersive Video. The name was changed from “Immersive Video” to “MPEG
Immersive Video” (MIV), to clearly differentiate from other uses of the term “Immersive
Video” in general parlance. MIV supports compression of immersive video content, in which
a real or virtual 3D scene is captured by multiple real or virtual cameras. The use of this
standard enables storage and distribution of immersive video content over existing and
future networks, for playback with 6 degrees of freedom of view position and orientation.
Neural Network Compression for Multimedia Applications – WG11 (MPEG) progresses to
Committee Draft
Artificial neural networks have been adopted for a broad range of tasks in multimedia analysis
and processing, such as visual and acoustic classification, extraction of multimedia descriptors
or image and video coding. The trained neural networks for these applications contain a large
number of parameters (i.e., weights), resulting in a considerable size. Thus, transferring them
to a number of clients using them in applications (e.g., mobile phones, smart cameras)
requires compressed representation of neural networks.
WG11 (MPEG) has completed the CD of the specification at its 131st meeting. Considering the
fact that the compression of neural networks is likely to have a hardware dependent and
hardware independent component, the standard is designed as a toolbox of compression
technologies. The specification contains different parameter sparsification, parameter
reduction (e.g., matrix decomposition), parameter quantization, and entropy coding
methods, that can be assembled to encoding pipelines combining one or more (in the case of
sparsification/reduction) methods from each group. The results show that trained neural
networks for many common multimedia problems such as image or audio classification or
image compression can be compressed to 10% of their original size with no or very small
performance loss, and even significantly more at small performance loss. The specification is
independent of a particular neural network exchange format, and interoperability with
common formats is described in the annexes.
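The toolbox idea described above (sparsification, then quantization, then entropy coding) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the methods specified in the standard: magnitude pruning, uniform scalar quantization and a Shannon-entropy size estimate are stand-ins chosen for simplicity, and the random Gaussian "layer", the threshold and the step size are arbitrary.

```python
import math
import random
from collections import Counter


def sparsify(weights, threshold):
    """Magnitude pruning: set weights with small absolute value to zero."""
    return [0.0 if abs(w) < threshold else w for w in weights]


def quantize(weights, step):
    """Uniform quantization: map each weight to an integer level."""
    return [round(w / step) for w in weights]


def entropy_bits(symbols):
    """Shannon entropy in bits/symbol, a bound for an ideal entropy coder."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


random.seed(0)
weights = [random.gauss(0.0, 0.1) for _ in range(10_000)]  # toy "layer"

# Pipeline: one method from each group, applied in sequence.
levels = quantize(sparsify(weights, threshold=0.1), step=0.02)
bits_per_weight = entropy_bits(levels)

print(f"zeroed weights: {levels.count(0) / len(levels):.0%}")
print(f"estimated size: {bits_per_weight / 32:.1%} of 32-bit floats")
```

The estimate ignores side information and any loss in model accuracy, so it only shows why chaining the three stages shrinks the representation; it says nothing about the performance figures reported for the standard.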
WG11 (MPEG) issues Committee Draft of Conformance and Reference Software for
Essential Video Coding (EVC)
At its 131st meeting, WG11 (MPEG) promoted the specification of the Conformance and
Reference Software for Essential Video Coding (ISO/IEC 23094-4) to Committee Draft (CD)
level. The Essential Video Coding (EVC) standard (ISO/IEC 23094-1) provides an improved
compression capability over existing video coding standards with timely publication of
licensing terms. The issued specification of the Conformance and Reference Software for
Essential Video Coding includes conformance bitstreams as well as a reference software for
the generation of those conformance bitstreams. This important standard will greatly help
industry achieve effective interoperability between products using EVC and provide valuable
information to ease the development of such products. The final specification is expected to
be available in early 2021.
Webinar: What’s new in MPEG?
MPEG cordially invites you to its first webinar: What's new in MPEG? A brief update on the
results of its 131st MPEG meeting featuring:
Welcome and Introduction: Jörn Ostermann, Acting Convenor of WG11 (MPEG)
Versatile Video Coding (VVC): Jens-Rainer Ohm and Gary Sullivan, JVET Chairs
MPEG 3D Audio: Schuyler Quackenbush, MPEG Audio Chair
Video-based Point Cloud Compression (V-PCC): Marius Preda, MPEG 3DG Chair
MPEG Immersive Video (MIV): Bart Kroon, MPEG Video BoG Chair
Carriage of Versatile Video Coding (VVC) and Essential Video Coding (EVC): Young-Kwon Lim,
MPEG Systems Chair
MPEG Roadmap: Jörn Ostermann, Acting Convenor of WG11 (MPEG)
When: Tuesday, July 21, 2020, 10:00 UTC and 21:00 UTC (to accommodate different time
zones)
How: Please register here https://bit.ly/mpeg131. Q&A via sli.do
(https://app.sli.do/event/xpzpkhlm; event # 54597) starting from July 21, 2020.
How to contact WG 11 (MPEG) and Further Information
Journalists that wish to receive WG 11 (MPEG) Press Releases by email should contact Dr.
Christian Timmerer at christian.timmerer@itec.uni-klu.ac.at or
christian.timmerer@bitmovin.com or subscribe via
https://lists.aau.at/mailman/listinfo/mpeg-pr. For timely updates follow us on Twitter
(https://twitter.com/mpeggroup).
Future WG 11 (MPEG) meetings are planned as follows:
No. 132, Online, 12 – 16 October 2020
No. 133, Cape Town, ZA, 11 – 15 January 2021
No. 134, Geneva, CH, 26 – 30 April 2021
No. 135, Prague, CZ, 12 – 16 July 2021
For further information about WG 11 (MPEG), please contact:
Prof. Dr.-Ing. Jörn Ostermann (Convenor of WG 11 (MPEG), Germany)
Leibniz Universität Hannover
Appelstr. 9A
30167 Hannover, Germany
Tel: ++49 511 762 5316
Fax: ++49 511 762 5333
ostermann@tnt.uni-hannover.de
or
Priv.-Doz. Dr. Christian Timmerer
Alpen-Adria-Universität Klagenfurt | Bitmovin Inc.
9020 Klagenfurt am Wörthersee, Austria, Europe
Tel: +43 463 2700 3621
Email: christian.timmerer@itec.aau.at | christian.timmerer@bitmovin.com