Until recently, QoE (Quality of Experience) experiments had to be conducted in academic laboratories; however, with the advent of ubiquitous Internet access, it is now possible to ask an Internet crowd to conduct experiments on their personal computers. Since such a crowd can be quite large, crowdsourcing enables researchers to conduct experiments with a more diverse set of participants at a lower economic cost than would be possible under laboratory conditions. However, because participants carry out experiments without supervision, they may give erroneous feedback perfunctorily, carelessly, or dishonestly, even if they receive a reward for each experiment.
In this paper, we propose a crowdsourceable framework to quantify the QoE of multimedia content. The advantages of our framework over traditional MOS ratings are: 1) it enables crowdsourcing because it supports systematic verification of participants’ inputs; 2) the rating procedure is simpler than that of MOS, so there is less burden on participants; and 3) it derives interval-scale scores that enable subsequent quantitative analysis and QoE provisioning. We conducted four case studies, which demonstrated that, with our framework, researchers can outsource their QoE evaluation experiments to an Internet crowd without risking the quality of the results, while at the same time obtaining a higher level of participant diversity at a lower monetary cost.
On QoE Metrics and QoE Fairness for Network & Traffic Management (Tobias Hoßfeld)
*** QoE Metrics ***
While Quality of Experience (QoE) has advanced very significantly as a field in recent years, the methods used for analyzing it have not always kept pace. When QoE is studied, measured or estimated, practically all the literature deals with Mean Opinion Scores (MOS). The MOS provides a simple scalar value for QoE, but it has several limitations, some of which are made clear in its name: for many applications, just having a mean value is not sufficient. For service and content providers in particular, it is more interesting to have an idea of how the scores are distributed, so as to ensure that a certain portion of the user population is experiencing satisfactory levels of quality, thus reducing churn. We put forward the limitations of MOS, present other statistical tools that provide a much more comprehensive view of how quality is perceived by the users, and illustrate it all by analyzing the results of several subjective studies with these tools.
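The point about a bare mean hiding the distribution can be made concrete. The sketch below is illustrative (not taken from the talk): two hypothetical rating sets share the same MOS, yet a distribution-aware metric such as the share of users rating "good or better" tells very different stories.

```python
import statistics

def mos(ratings):
    # Mean Opinion Score: the sample mean of 1-5 ACR ratings.
    return sum(ratings) / len(ratings)

def good_or_better(ratings, threshold=4):
    # Share of users rating the quality "good" (4) or better: one simple
    # distribution-aware metric a provider may prefer over the bare mean.
    return sum(r >= threshold for r in ratings) / len(ratings)

# Two hypothetical rating sets with identical MOS but very different user experiences.
a = [3, 3, 3, 3, 3, 3]
b = [1, 1, 3, 3, 5, 5]
print(mos(a), mos(b))                            # 3.0 3.0
print(good_or_better(a), good_or_better(b))      # 0.0 vs. one third satisfied
print(statistics.stdev(a), statistics.stdev(b))  # 0.0 vs. a wide spread
```

A provider looking only at the MOS would treat the two populations as equivalent, even though in the second one a third of the users rate the quality as bad.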
*** QoE Fairness ***
User-centric service and application management focuses on the Quality of Experience (QoE) as perceived by the end user. The goal is to maximize QoE while ensuring fairness among users, e.g., for resource allocation and scheduling in shared systems. Although the literature suggests that QoE fairness should consequently be considered, there is currently no accepted definition of QoE fairness. The contribution of this paper is the definition of a generic QoE fairness index F that has desirable key properties, together with the rationale behind it. Using examples and a measurement study involving multiple users downloading web content over a bottleneck link, we differentiate the proposed index from QoS fairness and the widely used Jain’s fairness index. Based on the results, we argue that neither QoS fairness nor Jain’s fairness index meets all of the desirable QoE-relevant properties, whereas F meets them all. Consequently, the proposed index F may be used to compare QoE fairness across systems and applications, thus serving as a benchmark for QoE management mechanisms and system optimization.
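For reference, Jain’s index is (Σx)²/(n·Σx²), and the QoE fairness index of Hoßfeld et al. is commonly stated as F = 1 − 2σ/(H − L), where σ is the standard deviation of per-user QoE on the bounded rating scale [L, H]. A minimal sketch (population standard deviation assumed; the exact variance estimator is a detail the abstract does not fix):

```python
import statistics

def jain_index(xs):
    # Jain's fairness index: (sum x)^2 / (n * sum x^2); lies in (0, 1].
    n = len(xs)
    return sum(xs) ** 2 / (n * sum(x * x for x in xs))

def qoe_fairness_F(qoe, low=1.0, high=5.0):
    # QoE fairness index F = 1 - 2*sigma / (H - L): sigma is the standard
    # deviation of per-user QoE on the bounded scale [L, H]. F = 1 means all
    # users get identical QoE; F -> 0 as the spread approaches its maximum.
    sigma = statistics.pstdev(qoe)
    return 1 - 2 * sigma / (high - low)

split = [1.0, 5.0]            # two users at opposite ends of a 1-5 scale
print(jain_index(split))      # ~0.69: Jain's index still looks moderately "fair"
print(qoe_fairness_F(split))  # 0.0: maximally unfair in QoE terms
```

The example shows why the distinction matters: for two users at the opposite ends of the rating scale, Jain’s index (designed for unbounded QoS quantities) still reports a moderate value, while F correctly reports maximal unfairness on the bounded QoE scale.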
Quality of Experience (QoE) has received much attention over the past years and is defined as the degree of delight or annoyance of the user of an application or service. For various stakeholders, QoE has become a prominent issue in delivering services and applications. A significant amount of research has been devoted to understanding, measuring, and modeling QoE for a variety of media services. In this talk, emerging challenges and concepts are discussed for managing QoE for networked media services. The following topics are addressed: multi-factor QoE modeling, QoE metrics and QoE fairness for QoE management, as well as recent efforts in web QoE monitoring. Finally, QoE++ is postulated: the evolution from the QoE ego-system towards the QoE eco-system.
Correlating objective factors with video (IJCNCJournal)
To succeed in providing services, the quality of a service should meet users’ satisfaction. This motivates studying the relationship between service quality and the quality actually perceived by users, commonly referred to as the quality of experience (QoE). However, most existing QoE studies that focus on video-on-demand or IPTV services analyze only the influence of network behaviors on video quality. This paper focuses on P2P video streaming services, which are becoming a significant portion of Internet traffic, and examines how users’ perception changes as objective factors, as well as network behaviors, are adjusted. We propose to use the mean opinion score and peak signal-to-noise ratio methods as QoE evaluations to consider the effects of the chunk loss ratio, the group-of-pictures size, and the chunk size. The experimental results provide a convincing reference for building a complete relationship between objective factors and QoE. We believe that this assessment will contribute to the study of a new service quality evaluation mechanism based on users’ satisfaction in the future.
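Of the two evaluation methods named above, PSNR is the objective one and is straightforward to compute. The sketch below shows the standard definition, simplified to flat lists of 8-bit luma values rather than real video frames:

```python
import math

def psnr(ref, deg, max_val=255):
    # Standard peak signal-to-noise ratio in dB between a reference and a
    # degraded signal, here simplified to flat lists of 8-bit luma values.
    mse = sum((r - d) ** 2 for r, d in zip(ref, deg)) / len(ref)
    if mse == 0:
        return float("inf")  # identical signals: PSNR is unbounded
    return 10 * math.log10(max_val ** 2 / mse)

print(psnr([100, 100, 100, 100], [110, 90, 110, 90]))  # MSE = 100 -> ~28.1 dB
```

In a study like the one above, this value would be computed per decoded frame against the undistorted source, then aggregated over the sequence and correlated with the subjective MOS ratings.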
OneClick: A Framework for Measuring Network Quality of Experience (Academia Sinica)
As the service requirements of network applications shift from high throughput to high media quality, interactivity, and responsiveness, the definition of QoE (Quality of Experience) has become multidimensional. Although it may not be difficult to measure individual dimensions of the QoE, how to capture users’ overall perceptions when they are using network applications remains an open question.
In this paper, we propose a framework called OneClick to capture users’ perceptions when they are using network applications. The framework only requires a subject to click a dedicated key whenever he/she feels dissatisfied with the quality of the application in use. OneClick is particularly effective because it is intuitive, lightweight, efficient, time-aware, and application-independent. We use two objective quality assessment methods, PESQ and VQM, to validate OneClick’s ability to evaluate the quality of audio and video clips. To demonstrate the proposed framework’s efficiency and effectiveness in assessing user experiences, we implement it on two applications, one for instant messaging applications, and the other for firstperson shooter games. A Flash implementation of the proposed framework is also presented.
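The abstract does not give OneClick’s aggregation details, but the raw measurement it collects, timestamps of dissatisfaction clicks, naturally reduces to a per-window click rate that can be aligned with the impairment timeline. A hypothetical sketch (the function name and windowing are assumptions, not the paper’s implementation):

```python
def click_rate(click_times, duration, window=5.0):
    """Reduce a subject's dissatisfaction-click timestamps (in seconds from
    the start of the session) to a clicks-per-second rate per time window."""
    n_windows = int(duration // window) + (1 if duration % window else 0)
    counts = [0] * n_windows
    for t in click_times:
        if 0 <= t < duration:
            counts[int(t // window)] += 1
    return [c / window for c in counts]

# Three clicks in a 15-second session: a burst early on, then one more later.
print(click_rate([1.0, 2.0, 12.0], duration=15.0))  # [0.4, 0.0, 0.2]
```

The resulting time series is what makes the approach "time-aware": windows with elevated click rates can be matched against the network conditions that were in effect at those moments.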
Consistent high quality is the hallmark of a reliable localization company. Irrespective of deadlines and volumes, translations should be done professionally and thoroughly: they should be relevant, complete, and coherent. Language quality assurance is therefore one of the key processes for ensuring the high quality of deliverables.
The material of the presentation is the content of new ASTM (www.astm.org) work item WK46397, which is currently in the process of discussion by ASTM working group F43 (Committee on language services and products).
Tutorial given at the LAK13 conference, Leuven, April 9th, 2013. The presentation is informed by WP2 of the LinkedUp-project.eu, which develops an Evaluation Framework for Open Web Data (Linked Data) Applications for Education purposes.
Over the last 15 years, Quality of Experience (QoE) has evolved from a buzzword into a holistic, mature scientific concept that captures the entire experience a person has with a multimedia communication service (e.g., online video, web browsing, telephony). This talk provides an introduction to the concept of QoE and its operationalization in subjective experiments. To this end, we first review the origins of QoE as well as the most useful definitions and frameworks that map the main QoE constituents and use cases. In the second part, we go about operationalizing QoE, with a focus on how to design and conduct subjective QoE experiments that provide valid and reliable results.
Games on demand, a.k.a. cloud gaming, refers to a new way of delivering computer games to users, where computationally complex games are executed and rendered on powerful cloud servers rather than on local computing devices. In this talk, I will give an overview of the challenges in developing cloud gaming systems, what we have done, and what remains to be done. I will start with GamingAnywhere, an open-source cloud gaming system, followed by a number of studies based on the system. Finally, I will conclude the talk with open issues in providing a highly real-time, high-definition audio/visual multimedia experience (e.g., in the form of gaming and virtual reality).
Detecting In-Situ Identity Fraud on Social Network Services: A Case Study on ... (Academia Sinica)
In this paper, we propose a continuous authentication approach to detect in-situ identity fraud incidents, which occur when attackers use the same devices and IP addresses as the victims. Using Facebook as a case study, we show that it is possible to detect such incidents by analyzing SNS users’ browsing behavior. Our experimental results demonstrate that the approach can achieve reasonable accuracy given a few minutes of observation time.
Cloud Gaming Onward: Research Opportunities and Outlook (Academia Sinica)
Cloud gaming has become increasingly popular in academia and industry, as evidenced by the large numbers of related research papers and startup companies. Some public cloud gaming services have attracted hundreds of thousands of subscribers, demonstrating the initial success of cloud gaming services. Pushing cloud gaming services forward, however, faces various challenges, which open up many research opportunities. In this paper, we share our views on future cloud gaming research and point out several research problems spanning a wide spectrum of directions, including distributed systems, video codecs, virtualization, human-computer interaction, quality of experience, resource allocation, and dynamic adaptation. Solving these research problems will allow service providers to offer high-quality cloud gaming services while remaining profitable, which in turn will result in an even more successful cloud gaming ecosystem. In addition, we believe there will be many more novel ideas for capitalizing on the abundant and elastic cloud resources for a better gaming experience, and we will see these ideas and the associated challenges in the years to come.
Quantifying User Satisfaction in Mobile Cloud Games (Academia Sinica)
We conduct experiments to quantify user satisfaction in mobile cloud games using a real cloud gaming system built on the open-source GamingAnywhere. We share our experiences in porting the GamingAnywhere client to Android OS and perform extensive experiments on both the mobile and desktop clients. The experimental results reveal several new insights: (1) gamers are more satisfied with the graphics quality on mobile devices, while they are more satisfied with the control quality on desktops; (2) the bitrate, frame rate, and network delay significantly affect the graphics and smoothness quality; and (3) the control quality depends only on the client type (mobile versus desktop). To the best of our knowledge, such user studies have never been done in the literature.
Although online games are an important Internet activity today, players inevitably suffer from lag from time to time due to the Internet’s non-QoS-guaranteed architecture. Here, by lag we refer to the phenomenon in which a game fails to respond to user commands or update the screen in a timely fashion due to long system processing or network delays. Currently, little is known about how game players feel about lag and how they react when encountering lag during game play.
In this paper, we present an Internet survey that is designed to understand the following questions: 1) How do players perceive lag, 2) what do players think of the causes of lag, and 3) how do players react to lag. Our results show that game players often struggle with lag, because they are unable to identify the root cause. Therefore, they have to try any combination of possible solutions found on the Internet, blame game companies, or learn to cope. These findings manifest a strong demand for an automatic diagnostic tool that can identify the root cause of lag for gamers.
Understanding The Performance of Thin-Client Gaming (Academia Sinica)
The thin-client model is considered a perfect fit for online gaming. As modern games normally require tremendous computing and rendering power at the game client, deploying games with this model can transfer the burden of hardware upgrades from players to game operators. As a result, a variety of solutions have been proposed for thin-client gaming today. However, little is known about the performance of such thin-client systems in different scenarios, and there is no systematic means yet to conduct such analysis.
In this paper, we propose a methodology for quantifying the performance of thin clients for gaming, even for thin clients that are closed-source. Taking a classic game, Ms. Pac-Man, and three popular thin clients, LogMeIn, TeamViewer, and UltraVNC, as examples, we perform a demonstration study and determine that 1) display frame rate and frame distortion are both critical to gaming; and 2) different thin-client implementations may have very different levels of robustness against network impairments. Generally, LogMeIn performs best when network conditions are reasonably good, while TeamViewer and UltraVNC are the better choices under certain network conditions.
Quantifying QoS Requirements of Network Services: A Cheat-Proof Framework (Academia Sinica)
Despite all the efforts devoted to improving the QoS of networked multimedia services, the baseline for such improvements has yet to be defined. In other words, although it is well recognized that better network conditions generally yield better service quality, the exact minimum level of network QoS required to ensure satisfactory user experience remains an open question.
In this paper, we propose a general, cheat-proof framework that enables researchers to systematically quantify the minimum QoS needs of real-time networked multimedia services. Our framework has two major features: 1) it measures the level of service quality that users find intolerable via intuitive responses, and therefore reduces the burden on experiment participants; and 2) it is cheat-proof because it supports systematic verification of the participants' inputs. Via a pilot study involving 38 participants, we verify the efficacy of our framework by showing that even inexperienced participants can easily produce consistent judgments. In addition, through cross-application and cross-service comparative analysis, we demonstrate the usefulness of the derived QoS thresholds. Such knowledge will serve as an important reference in the evaluation of competing applications, application recommendation, network planning, and resource arbitration.
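The abstract does not spell out the verification procedure, but one standard consistency check for paired-comparison data is counting circular triads (A preferred to B, B to C, yet C to A): a participant answering at random produces many of them, while an honest participant produces few. The sketch below is an illustration of that idea, not the paper’s actual verification algorithm:

```python
from itertools import combinations

def circular_triads(prefs):
    """Count circular triads in one participant's paired-comparison answers.
    prefs maps an ordered pair (i, j) to True if i was preferred over j.
    A high count suggests random or careless input, flagging the participant."""
    items = sorted({x for pair in prefs for x in pair})

    def beats(a, b):
        if (a, b) in prefs:
            return prefs[(a, b)]
        return not prefs[(b, a)]

    count = 0
    for a, b, c in combinations(items, 3):
        wins = [beats(a, b) + beats(a, c),
                beats(b, a) + beats(b, c),
                beats(c, a) + beats(c, b)]
        # In a transitive triad one item wins both of its comparisons;
        # in a circular triad every item wins exactly one.
        if max(wins) == 1:
            count += 1
    return count

consistent = {("A", "B"): True, ("B", "C"): True, ("A", "C"): True}
circular = {("A", "B"): True, ("B", "C"): True, ("A", "C"): False}
print(circular_triads(consistent))  # 0
print(circular_triads(circular))    # 1
```

Comparing a participant’s triad count against the expected count under random answering gives a simple statistical test for discarding unreliable inputs.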
Online Game QoE Evaluation using Paired Comparisons (Academia Sinica)
To ensure a satisfactory gaming experience for players, there is a strong need for a technique that can measure a game's quality systematically, efficiently, and reliably. In this paper, we propose to use paired comparisons and probabilistic choice models to quantify online games' QoE under various network situations. The advantages of our methodology over traditional MOS ratings are: 1) the rating procedure is simpler, so there is less burden on experiment participants; 2) it derives ratio-scale scores; and 3) it enables systematic verification of participants' inputs.
As a demonstration, we apply our methodology to evaluate three popular FPS (first-person shooter) games, namely Alien Arena (Alien), Halo, and Unreal Tournament (UT), and investigate their network robustness. The results indicate that Halo performs best in terms of network robustness against packet delay and loss. However, if we take the degree of the games' sophistication into account, we consider that the robustness of UT against downlink delays should be improved. We also show that our methodology can be a helpful tool for making decisions about design alternatives, such as how dead reckoning algorithms and time synchronization mechanisms should be implemented.
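A common probabilistic choice model for turning paired-comparison outcomes into ratio-scale scores is Bradley-Terry, which models the probability that i is preferred over j as p[i] / (p[i] + p[j]). The sketch below fits such scores with the standard MM (minorization-maximization) update; it is an illustration of the model class named in the abstract, not necessarily the exact fitting procedure used in the paper:

```python
def bradley_terry(wins, items, iters=200):
    """Fit Bradley-Terry preference scores from pairwise win counts using the
    standard MM update. wins[(i, j)] is how often i was preferred over j.
    Returned scores are normalized to mean 1; the modeled probability that
    i beats j is p[i] / (p[i] + p[j])."""
    p = {i: 1.0 for i in items}
    for _ in range(iters):
        new = {}
        for i in items:
            w_i = sum(wins.get((i, j), 0) for j in items if j != i)
            denom = sum((wins.get((i, j), 0) + wins.get((j, i), 0)) / (p[i] + p[j])
                        for j in items if j != i)
            new[i] = w_i / denom if denom else p[i]
        total = sum(new.values())
        p = {i: v * len(items) / total for i, v in new.items()}  # normalize to mean 1
    return p

# A was preferred over B in 3 of 4 comparisons: the fitted ratio p[A]/p[B] -> 3.
scores = bradley_terry({("A", "B"): 3, ("B", "A"): 1}, ["A", "B"])
print(scores)
```

Because the fitted scores are on a ratio scale, statements like "configuration X is perceived twice as good as configuration Y" become meaningful, which is exactly the property the methodology above exploits.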
Cloud gaming is a promising application of the rapidly expanding cloud computing infrastructure. Existing cloud gaming systems, however, are closed-source with proprietary protocols, which raises the bar for setting up testbeds for experiencing cloud games. In this paper, we present a complete cloud gaming system, called GamingAnywhere, which is, to the best of our knowledge, the first open cloud gaming system. In addition to its openness, we design GamingAnywhere for high extensibility, portability, and reconfigurability. We implement GamingAnywhere on Windows, Linux, and OS X, while its client can be readily ported to other OSes, including iOS and Android. We conduct extensive experiments to evaluate the performance of GamingAnywhere and compare it against two well-known cloud gaming systems: OnLive and StreamMyGame. Our experimental results indicate that GamingAnywhere is efficient and provides high responsiveness and video quality. For example, GamingAnywhere yields a per-frame processing delay of 34 ms, which is 3+ and 10+ times shorter than OnLive and StreamMyGame, respectively. Our experiments also reveal that all these performance gains are achieved without the expense of higher network loads; in fact, GamingAnywhere incurs less network traffic. The proposed GamingAnywhere can be employed by researchers, game developers, service providers, and end users for setting up cloud gaming testbeds, which, we believe, will stimulate more research innovations in cloud gaming systems.
GamingAnywhere is now publicly available at http://gaminganywhere.org.
Are All Games Equally Cloud-Gaming-Friendly? An Electromyographic Approach (Academia Sinica)
Cloud gaming makes any computer game playable on a thin client without the hardware-requirement worries of the past. It frees players from the need to constantly upgrade their computers, as they can now play games hosted on remote servers using a broadband Internet connection and a thin client. However, cloud games are intrinsically more susceptible to latency than online games because game graphics are rendered on server nodes, and thin clients do not possess the game state information required by delay compensation techniques.
In this paper, we investigate how the response latency would affect users' experience when playing games on clouds and how the impact of latency on players' experience varies across different games. We show that not all games are equally friendly to cloud gaming. The same degree of latency may have very different impact on a game's quality of experience depending on the game's real-time strictness. We thus develop a model that can predict a game's real-time strictness based on the rate of players' inputs and game screen dynamics. The model can be used to enhance players' experience in cloud gaming and optimize data center operation cost simultaneously.
More Related Content
Similar to A Crowdsourceable QoE Evaluation Framework for Multimedia Content
Consistent high quality is the hallmark of a reliable localization company. Irrespective of deadlines and volumes, translations should be made professionally and thoroughly - they should be relevant, full and coherent. Language quality assurance is therefore one of the key processes to ensure the high quality of deliverables.
The material of the presentation is the content of new ASTM (www.astm.org) work item WK46397, which is currently in the process of discussion by ASTM working group F43 (Committee on language services and products).
Tutorial given at LAK13 conference, Leuven, April, 9th, 2013. The presentation is informed by WP2 of the LinkedUp-project.eu that develops an Evaluation Framework for Open Web Data (Linked Data) Applications for Education purposes.
Over last 15 years, Quality of Experience (QoE) has evolved from a buzzword to a holistic, mature scientific concept that captures the entire experience that a person has with a multimedia communication service (e.g. online video, web browsing, telephony, etc.). This talk provides an introduction to the concept of QoE and its operationalization in subjective experiments. To this end we first review the origins of QoE as well as the most useful definitions and frameworks that map the main QoE constituents and use cases. In the second part we go about operationalizing QoE, with a focus on how to design and conduct subjective QoE experiments that provide valid and reliable results.
Games on demand, a.k.a., cloud gaming, refers to a new way to deliver computer games to users, where computationally complex games are executed and rendered on powerful cloud servers rather than local computing devices. In this talk, I will give an overview of the challenges in developing cloud gaming systems, what we have done, and what remains to do. I will start from GamingAnywhere, an open-source cloud gaming system, followed by a number of studies based on the system. Finally I will conclude the talk with open issues in providing highly real-time and high-definition audio/visual quality multimedia experience (e.g., in the form of gaming and virtual reality).
Detecting In-Situ Identity Fraud on Social Network Services: A Case Study on ...Academia Sinica
In this paper, we propose to use a continuous authentication approach to detect the in-situ identity fraud incidents, which occur when the attackers use the same devices and IP addresses as the victims. Using Facebook as a case study, we show that it is possible to detect such incidents by analyzing SNS users’ browsing behavior. Our experiment results demonstrate that the approach can achieve reasonable accuracy given a few minutes of observation time.
Cloud Gaming Onward: Research Opportunities and OutlookAcademia Sinica
Cloud gaming has become increasingly more popular in the academia and the industry, evident by the large numbers of related research papers and startup companies. Some public cloud gaming services have attracted hundreds of thousands subscribers, demonstrating the initial success of cloud gaming services. Pushing the cloud gaming services forward, however, faces various challenges, which open up many research opportunities. In this paper, we share our views on the future cloud gaming research, and point out several research problems spanning over a wide spectrum of different directions: including distributed systems, video codecs, virtualization, human-computer interaction, quality of experience, resource allocation, and dynamic adaptation. Solving these research problems will allow service providers to offer high-quality cloud gaming services yet remain profitable, which in turn results in even more successful cloud gaming eco-environment. In addition, we believe there will be many more novel ideas to capitalize the abundant and elastic cloud resources for better gaming experience, and we will see these ideas and associated challenges in the years to come.
Quantifying User Satisfaction in Mobile Cloud GamesAcademia Sinica
We conduct real experiments to quantify user satisfaction in mobile cloud games using a real cloud gaming system built on the open-sourced GamingAnywhere. We share our experiences in porting GamingAnywhere client to Android OS and perform extensive experiments on both the mobile and desktop clients. The experiment results reveal several new insights: (1) gamers are more satisfied with the graphics quality on mobile devices, while they are more satisfied with the control quality on desktops, (2) the bitrate, frame rate, and network delay significantly affect the graphics and smoothness quality, and (3) the control quality only depends on the client type (mobile versus desktop). To the best of our knowledge, such user studies have never been done in the literature.
Although online games have been an important Internet activity today, players inevitably suffer from lag from time to time due to the Internet’s non-QoS-guaranteed architecture. Here by lag we refer to the phenomena when a game fails to respond to user commands or update the screen in a timely fashion due to long system processing or network delays. Currently, little is known about how game players feel about lag and how they react when encountering lag during game play.
In this paper, we present an Internet survey that is designed to understand the following questions: 1) How do players perceive lag, 2) what do players think of the causes of lag, and 3) how do players react to lag. Our results show that game players often struggle with lag, because they are unable to identify the root cause. Therefore, they have to try any combination of possible solutions found on the Internet, blame game companies, or learn to cope. These findings manifest a strong demand for an automatic diagnostic tool that can identify the root cause of lag for gamers.
Understanding The Performance of Thin-Client GamingAcademia Sinica
The thin-client model is considered a perfect fit for online gaming. As modern games normally require tremendous computing and rendering power at the game client, deploying games with such models can transfer the burden of hardware upgrades from players to game operators. As a result, there are a variety of solutions proposed for thin-client gaming today. However, little is known about the performance of such thinclient systems in different scenarios, and there is no systematic means yet to conduct such analysis.
In this paper, we propose a methodology for quantifying the performance of thin-clients on gaming, even for thin-clients which are close-sourced. Taking a classic game, Ms. Pac-Man, and three popular thin-clients, LogMeIn, TeamViewer, and UltraVNC, as examples, we perform a demonstration study and determine that 1) display frame rate and frame distortion are both critical to gaming; and 2) different thin-client implementations may have very different levels of robustness against network impairments. Generally, LogMeIn performs best when network conditions are reasonably good, while TeamViewer and UltraVNC are the better choices under certain network conditions.
Quantifying QoS Requirements of Network Services: A Cheat-Proof Framework (Academia Sinica)
Despite all the efforts devoted to improving the QoS of networked multimedia services, the baseline for such improvements has yet to be defined. In other words, although it is well recognized that better network conditions generally yield better service quality, the exact minimum level of network QoS required to ensure satisfactory user experience remains an open question.
In this paper, we propose a general, cheat-proof framework that enables researchers to systematically quantify the minimum QoS needs for real-time networked multimedia services. Our framework has two major features: 1) it measures the quality level of a service that users find intolerable via intuitive responses, and therefore reduces the burden on experiment participants; and 2) it is cheat-proof because it supports systematic verification of the participants' inputs. Via a pilot study involving 38 participants, we verify the efficacy of our framework by showing that even inexperienced participants can easily produce consistent judgments. In addition, by cross-application and cross-service comparative analysis, we demonstrate the usefulness of the derived QoS thresholds. Such knowledge will serve as an important reference in the evaluation of competitive applications, application recommendation, network planning, and resource arbitration.
Online Game QoE Evaluation using Paired Comparisons (Academia Sinica)
To satisfy players' gaming experience, there is a strong need for a technique that can measure a game's quality systematically, efficiently, and reliably. In this paper, we propose to use paired comparisons and probabilistic choice models to quantify online games' QoE under various network situations. The advantages of our methodology over traditional MOS ratings are: 1) the rating procedure is simpler, thus placing less burden on experiment participants; 2) it derives ratio-scale scores; and 3) it enables systematic verification of participants' inputs.
As a demonstration, we apply our methodology to evaluate three popular FPS (first-person-shooter) games, namely, Alien Arena (Alien), Halo, and Unreal Tournament (UT), and investigate their network robustness. The results indicate that Halo performs best in terms of network robustness against packet delay and loss. However, if we take the degree of the games' sophistication into account, we consider that the robustness of UT against downlink delays could also be improved. We also show that our methodology can be a helpful tool for making decisions about design alternatives, such as how dead reckoning algorithms and time synchronization mechanisms should be implemented.
Cloud gaming is a promising application of the rapidly expanding cloud computing infrastructure. Existing cloud gaming systems, however, are closed-source with proprietary protocols, which raises the bar for setting up testbeds for experiencing cloud games. In this paper, we present a complete cloud gaming system, called GamingAnywhere, which is, to the best of our knowledge, the first open cloud gaming system. In addition to its openness, we design GamingAnywhere for high extensibility, portability, and reconfigurability. We implement GamingAnywhere on Windows, Linux, and OS X, while its client can be readily ported to other OSs, including iOS and Android. We conduct extensive experiments to evaluate the performance of GamingAnywhere, and compare it against two well-known cloud gaming systems: OnLive and StreamMyGame. Our experimental results indicate that GamingAnywhere is efficient and provides high responsiveness and video quality. For example, GamingAnywhere yields a per-frame processing delay of 34 ms, which is 3+ and 10+ times shorter than those of OnLive and StreamMyGame, respectively. Our experiments also reveal that all these performance gains are achieved without the expense of higher network loads; in fact, GamingAnywhere incurs less network traffic. The proposed GamingAnywhere can be employed by researchers, game developers, service providers, and end users for setting up cloud gaming testbeds, which, we believe, will stimulate more research innovations on cloud gaming systems.
GamingAnywhere is now publicly available at http://gaminganywhere.org.
Are All Games Equally Cloud-Gaming-Friendly? An Electromyographic Approach (Academia Sinica)
Cloud gaming makes any computer game playable on a thin client without the user worrying about hardware requirements. It frees players from the need to constantly upgrade their computers, as they can now play games hosted on remote servers with a broadband Internet connection and a thin client. However, cloud games are intrinsically more susceptible to latency than online games because game graphics are rendered on server nodes, and thin clients do not possess the game state information required by delay compensation techniques.
In this paper, we investigate how the response latency would affect users' experience when playing games on clouds and how the impact of latency on players' experience varies across different games. We show that not all games are equally friendly to cloud gaming. The same degree of latency may have very different impact on a game's quality of experience depending on the game's real-time strictness. We thus develop a model that can predict a game's real-time strictness based on the rate of players' inputs and game screen dynamics. The model can be used to enhance players' experience in cloud gaming and optimize data center operation cost simultaneously.
Online gaming has now become an extremely competitive business. As so many game titles are released every month, gamers have become harder to please and fickle in their affections. Therefore, it would be beneficial if we could forecast how addictive a game is before publishing it on the market. The capability of game addictiveness forecasting will enable developers to continuously adjust the game design and enable publishers to assess the potential market value in a game's early development stages.
In this paper, we propose to forecast a game's addictiveness based on players' emotions as they explore the game. Based on the account activity traces of 11 commercial online games, we develop a forecasting model that takes electromyographic measures of players as input and outputs the addictiveness index of a game. We hope that with our methodology, the game industry can avoid hopeless investments and target more accurately to provide more entertaining experiences.
Toward an Understanding of the Processing Delay of Peer-to-Peer Relay Nodes (Academia Sinica)
Peer-to-peer relaying is commonly used in real-time applications to cope with NAT and firewall restrictions and to provide better-quality network paths. As relaying is not natively supported by the Internet, it is usually implemented at the application layer. Also, in a modern operating system, the processor is shared, so the receive-process-forward procedure for each relayed packet may take a considerable amount of time if the host is busy handling other tasks. Thus, if we happen to select a loaded relay node, the relaying may introduce significant delays into the packet transmission time and even degrade application performance.
In this work, based on an extensive set of Internet traces, we pursue an understanding of the processing delays incurred at relay nodes and their impact on application performance. Our contribution is threefold: 1) we propose a methodology for measuring the processing delays at any relay node on the Internet; 2) we characterize the workload patterns of a variety of Internet relay nodes; and 3) we show that serious VoIP quality degradation may occur due to relay processing; thus, the processing delays of a relay node must be monitored continuously to prevent application performance from being degraded.
Inferring Speech Activity from Encrypted Skype Traffic (Academia Sinica)
Normally, voice activity detection (VAD) refers to speech processing algorithms for detecting the presence or absence of human speech in segments of audio signals. In this paper, however, we focus on speech detection algorithms that take VoIP traffic instead of audio signals as input. We call this category of algorithms network-level VAD.
Traditional VAD usually plays a fundamental role in speech processing systems because of its ability to delimit speech segments. Network-level VAD, on the other hand, can be quite helpful in network management, which is the motivation for our study. We propose the first real-time network-level VAD algorithm that can extract voice activity from encrypted and non-silence-suppressed Skype traffic. We evaluate the speech detection accuracy of the proposed algorithm with extensive real-life traces. The results show that our scheme achieves reasonably good performance even when a high degree of randomness has been injected into the network traffic.
Game Bot Detection Based on Avatar Trajectory (Academia Sinica)
In recent years, online gaming has become one of the most popular Internet activities, but cheating activity, such as the use of game bots, has increased as a consequence. Generally, the gaming community disagrees with the use of game bots, as bot users obtain unreasonable rewards without corresponding efforts. However, bots are hard to detect because they are designed to simulate human game-playing behavior and they follow game rules exactly. Existing detection approaches either interrupt the players' gaming experience, or they assume game bots are run as standalone clients or assigned a specific goal, such as aim bots in FPS games.
In this paper, we propose a trajectory-based approach to detect game bots. It is a general technique that can be applied to any game in which the avatar's movement is controlled directly by the players. Through real-life data traces, we show that the trajectories of human players and those of game bots are very different. In addition, although game bots may endeavor to simulate players' decisions, certain human behavior patterns are difficult to mimic because they are AI-hard. Taking Quake 2 as a case study, we evaluate our scheme's performance based on real-life traces. The results show that the scheme can achieve a detection accuracy of 95% or higher given a trace of 200 seconds or longer.
A Collusion-Resistant Automation Scheme for Social Moderation Systems (Academia Sinica)
For current Web 2.0 services, manual examination of user-uploaded content is normally required to ensure its legitimacy and appropriateness, which is a substantial burden on service providers. To reduce labor costs and the delays caused by content censoring, social moderation has been proposed as a front-line mechanism, whereby user moderators are encouraged to examine content before system moderation is required. Given the immense amount of new content added to the Web each day, there is a need for automation schemes to facilitate rear system moderation. This kind of mechanism is expected to automatically summarize reports from user moderators and ban misbehaving users or remove inappropriate content whenever possible. However, the accuracy of such schemes may be reduced by collusion attacks, where some users work together to mislead the automatic summarization in order to obtain shared benefits.
In this paper, we propose a collusion-resistant automation scheme for social moderation systems. Because some user moderators may collude and dishonestly claim that a user misbehaves, our scheme detects whether an accusation from a user moderator is fair or malicious based on the structure of mutual accusations of all users in the system. Through simulations we show that collusion attacks are likely to succeed if an intuitive countbased automation scheme is used. The proposed scheme, which is based on the community structure of the user accusation graph, achieves a decent performance in most scenarios.
Tuning Skype's Redundancy Control Algorithm for User Satisfaction (Academia Sinica)
Determining how to transport delay-sensitive voice data has long been a problem in multimedia networking. The difficulty arises because voice and best-effort data are different by nature. It would not be fair to give priority to voice traffic and starve its best-effort counterpart; however, the voice data delivered might not be perceptible if each voice call is limited to the rate of an average TCP flow. To address the problem, we approach it from a user-centric perspective by tuning the voice data rate based on user satisfaction.
Our contribution in this work is threefold. First, we investigate how Skype, the largest and fastest growing VoIP service on the Internet, adapts its voice data rate (i.e., the redundancy ratio) to network conditions. Second, by exploiting implementations of public domain codecs, we discover that Skype’s mechanism is not really geared to user satisfaction. Third, based on a set of systematic experiments that quantify user satisfaction under different levels of packet loss and burstiness, we derive a concise model that allows user-centric redundancy control. The model can be easily incorporated into general VoIP services (not only Skype) to ensure consistent user satisfaction.
A Crowdsourceable QoE Evaluation Framework for Multimedia Content
1. A Crowdsourceable QoE Evaluation Framework for Multimedia Content
Kuan-Ta Chen, Academia Sinica
Chen-Chi Wu, National Taiwan University
Yu-Chun Chang, National Taiwan University
Chin-Laung Lei, National Taiwan University
ACM Multimedia 2009
2. What is QoE?
Quality of Experience = users' satisfaction with a service (e.g., multimedia content)
A Crowdsourceable QoE Evaluation Framework for Multimedia Content / Kuan‐Ta Chen 2
3. Quality of Experience
[Two example images: poor quality (underexposed) vs. good quality (exposure OK)]
4. Challenges
How can the QoE of multimedia content be quantified efficiently and reliably?
[Figure: several content items, each with unknown quality Q = ?]
5. Mean Opinion Score (MOS)
Idea: Single Stimulus Method (SSM) + Absolute Category Rating (ACR)
MOS Quality Impairment
5 Excellent Imperceptible
4 Good Perceptible but not annoying
3 Fair Slightly annoying
2 Poor Annoying
1 Bad Very annoying
[Illustration: a participant rates a single stimulus by voting one of Excellent / Good / Fair / Poor / Bad, e.g., "Fair"]
6. Drawbacks of MOS-based Evaluations
ACR-based:
Concepts of the scales cannot be concretely defined
Dissimilar interpretations of the scale among users
Only an ordinal scale, not an interval scale
Difficult to verify users' scores
Subjective experiments in the laboratory:
Monetary cost (reward, transportation)
Labor cost (supervision)
Physical space/time/hardware constraints
(Callout: our framework solves all these drawbacks)
7. Drawbacks of MOS-based Evaluations
ACR-based drawbacks (scales not concretely defined; dissimilar interpretations among users; ordinal rather than interval scale; unverifiable scores) are addressed by Paired Comparison.
Laboratory drawbacks (monetary cost for rewards and transportation; labor cost of supervision; physical space/time/hardware constraints) are addressed by Crowdsourcing.
9. Talk Progress
Overview
Methodology
Paired Comparison
Crowdsourcing Support
Experiment Design
Case Study & Evaluation
Acoustic QoE
Optical QoE
Conclusion
10. Current Approach: MOS Rating
[Illustration: a participant must choose one of Excellent / Good / Fair / Poor / Bad when voting]
11. Our Proposal: Paired Comparison
[Illustration: two stimuli A and B; the participant is asked "Which one is better?" and votes B]
12. Properties of Paired Comparison
Generalizable across different content types and applications
Simple comparative judgment: a dichotomous decision is easier than a 5-category rating
Interval-scale QoE scores can be inferred
The users' inputs can be verified
13. Choice Frequency Matrix
10 experiments, each containing C(4,2) = 6 paired comparisons

      A    B    C    D
A     0    9   10    9
B     1    0    7    8
C     0    3    0    6
D     1    2    4    0

(Entry [i][j] counts how often item i was chosen over item j.)
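A choice frequency matrix like the one above can be accumulated directly from a log of votes. A minimal Python sketch, where the `(winner, loser)` vote-log format is an assumption for illustration, not the authors' implementation:

```python
def choice_matrix(votes, items):
    """Build a choice frequency matrix from a list of votes.

    Each vote is a (winner, loser) pair; entry C[i][j] counts how
    often item i was chosen over item j.  The vote-log format is
    hypothetical, for illustration only.
    """
    idx = {x: k for k, x in enumerate(items)}
    C = [[0] * len(items) for _ in items]
    for winner, loser in votes:
        C[idx[winner]][idx[loser]] += 1
    return C

# Example: A chosen over B in 9 of 10 comparisons, etc.
votes = [("A", "B")] * 9 + [("B", "A")] * 1 + [("A", "C")] * 10
C = choice_matrix(votes, ["A", "B", "C"])
# C[0][1] == 9, C[1][0] == 1, C[0][2] == 10
```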
14. Inference of QoE Scores
Bradley-Terry-Luce (BTL) model
input: choice frequency matrix
output: an interval-scale score for each content item (based on maximum likelihood estimation)
n content items T1, ..., Tn
Pij: the probability of choosing Ti over Tj
u(Ti): the estimated QoE score of the quality level Ti

Pij = π(Ti) / (π(Ti) + π(Tj)) = e^{u(Ti) − u(Tj)} / (1 + e^{u(Ti) − u(Tj)})

Basic idea: if P12 = P23, then u(T1) − u(T2) = u(T2) − u(T3)
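The maximum-likelihood fit of the BTL model can be computed with the classic minorization-maximization iteration. A pure-Python sketch (not the authors' code), applied to the choice frequency matrix from the earlier slide:

```python
import math

def btl_scores(C, iters=2000):
    """Fit BTL strengths pi_i to a choice frequency matrix C
    (C[i][j] = times item i was preferred over item j) using the
    standard MM update, then return interval-scale scores
    u_i = log(pi_i), shifted so the weakest item scores 0."""
    n = len(C)
    p = [1.0] * n
    for _ in range(iters):
        new = []
        for i in range(n):
            wins = sum(C[i])                      # total wins of item i
            denom = sum((C[i][j] + C[j][i]) / (p[i] + p[j])
                        for j in range(n) if j != i)
            new.append(wins / denom)
        s = sum(new)
        p = [x / s for x in new]                  # renormalize each pass
    u = [math.log(x) for x in p]
    lo = min(u)
    return [x - lo for x in u]

# Choice frequency matrix from slide 13 (items A, B, C, D)
C = [[0, 9, 10, 9],
     [1, 0, 7, 8],
     [0, 3, 0, 6],
     [1, 2, 4, 0]]
u = btl_scores(C)   # item A scores highest, item D is shifted to 0
```

Under the model, Pij = e^{u_i − u_j} / (1 + e^{u_i − u_j}), so equal score differences correspond to equal choice probabilities, which is what makes the scores interval-scale.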
15. Inferred QoE Scores
[Figure: the four items placed on an interval scale at 0, 0.63, 0.91, and 1]
16. Talk Progress
Overview
Methodology
Paired Comparison
Crowdsourcing Support
Experiment Design
Case Study & Evaluation
Acoustic QoE
Optical QoE
Conclusion
17. Crowdsourcing
Crowdsourcing = Crowd + Outsourcing
"soliciting solutions via open calls to large-scale communities"
18. Image Understanding
Reward: 0.04 USD
main theme?
key objects?
unique attributes?
19. Linguistic Annotations
Word similarity (Snow et al. 2008)
USD 0.2 for labeling 30 word pairs
20. More Examples
Document relevance evaluation (Alonso et al. 2008)
Document rating collection (Kittur et al. 2008)
Noun compound paraphrasing (Nakov 2008)
Person name resolution (Su et al. 2007)
21. The Risk
Not every Internet user is trustworthy!
Users may give erroneous feedback perfunctorily, carelessly, or dishonestly
Dishonest users have more incentives to perform tasks
We need an ONLINE algorithm to detect problematic inputs!
22. Verification of Users' Inputs (1)
Transitivity property: if A > B and B > C, then A should be > C
Transitivity Satisfaction Rate (TSR):
TSR = (# of triples that satisfy the transitivity rule) / (# of triples the transitivity rule may apply to)
23. Verification of Users’ Inputs (2)
Detect inconsistent judgments from problematic
users
TSR = 1 perfect consistency
TSR >= 0.8 generally consistent
TSR < 0.8 judgments are inconsistent
TSR‐based reward / punishment
(e.g., only pay a reward if TSR > 0.8)
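The TSR check above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code; the function name and the encoding of judgments as (winner, loser) pairs are my assumptions.

```python
from itertools import permutations

def transitivity_satisfaction_rate(prefs):
    """Compute the TSR from one participant's pairwise preferences.

    prefs: set of (winner, loser) tuples, one per judged pair.
    A directed triple (a, b, c) is applicable when a > b and b > c
    were judged and the pair {a, c} was judged too; it is satisfied
    when the judgment on that pair is a > c.
    """
    items = {x for pair in prefs for x in pair}
    applicable = satisfied = 0
    for a, b, c in permutations(items, 3):
        if (a, b) in prefs and (b, c) in prefs and \
           ((a, c) in prefs or (c, a) in prefs):
            applicable += 1
            if (a, c) in prefs:
                satisfied += 1
    # No applicable triples: trivially consistent
    return satisfied / applicable if applicable else 1.0

# A consistent participant (A > B, B > C, A > C) gets TSR = 1.0;
# a cyclic one (A > B, B > C, C > A) gets TSR = 0.0.
```

Since the check needs only the pair judgments already collected, it can run online, immediately after each experiment, to gate the reward.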
24. Experiment Design
For n algorithms (e.g., speech encoding)
1. a source content as the evaluation target
2. apply the n algorithms to generate n content versions w/ different quality
3. ask a user to perform all C(n, 2) = n(n−1)/2 paired comparisons
4. compute TSR after an experiment
reward a user ONLY if his inputs are self‐consistent
(i.e., TSR is higher than a certain threshold)
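The comparison count in step 3 is easy to check with a short sketch; the version names below are illustrative (they match the six MP3 bit rates used in the audio case study).

```python
from itertools import combinations

# Illustrative content versions: the 6 MP3 bit-rate levels
versions = ["32k", "48k", "64k", "80k", "96k", "128k"]

# Step 3: every unordered pair is judged once, so one experiment
# needs C(n, 2) = n*(n-1)/2 comparisons.
pairs = list(combinations(versions, 2))
print(len(pairs))  # 15 comparisons for n = 6
```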
26. Audio QoE Evaluation
Which one is better?
(toggle between the two versions by pressing/releasing the SPACE key)
27. Video QoE evaluation
Which one is better?
(toggle between the two versions by pressing/releasing the SPACE key)
28. Talk Progress
Overview
Methodology
Paired Comparison
Crowdsourcing Support
Experiment Design
Case Study & Evaluation
Acoustic QoE
Optical QoE
Conclusion
29. Audio QoE Evaluation
MP3 compression level
Source clips: one fast‐paced and one slow‐paced song
MP3 CBR format with 6 bit rate levels: 32, 48, 64, 80, 96, and 128 Kbps
127 participants and 3,660 paired comparisons
Effect of packet loss rate on VoIP
Two speech codecs: G722.1 and G728
Packet loss rate: 0%, 4%, and 8%
62 participants and 1,545 paired comparisons
30. Inferred QoE Scores
(Figures: inferred QoE scores for the MP3 compression levels and the VoIP packet loss rates)
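The slides do not show how interval-scale scores are inferred from the paired comparisons. A standard way to do this is the Bradley-Terry model, sketched below with the usual MM fitting update (Hunter 2004); the win-count matrix is illustrative, not data from the talk.

```python
import math

def bradley_terry(win, n_items, iters=200):
    """Fit Bradley-Terry strengths from a pairwise win-count matrix.

    win[i][j] = number of times item i was preferred over item j.
    Returns log-strengths, which live on an interval scale and can
    be compared or differenced in subsequent quantitative analysis.
    """
    p = [1.0] * n_items
    for _ in range(iters):
        new = []
        for i in range(n_items):
            wins_i = sum(win[i][j] for j in range(n_items) if j != i)
            denom = sum((win[i][j] + win[j][i]) / (p[i] + p[j])
                        for j in range(n_items) if j != i)
            new.append(wins_i / denom if denom > 0 else p[i])
        total = sum(new)
        p = [x * n_items / total for x in new]  # keep strengths normalized
    return [math.log(x) for x in p]

# Illustrative 3-item example: item 0 is preferred most often
win = [[0, 10, 11],
       [2, 0, 9],
       [1, 3, 0]]
scores = bradley_terry(win, 3)
```

Aggregating each participant's judgments (after the TSR filter) into such a win matrix yields one QoE score per content version, which is what the figures on this slide plot.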
31. Video QoE Evaluation
Video codec
Source clips: one fast‐paced and one slow‐paced video clip
Three codecs: H.264, WMV3, and XVID
Two bit rates: 400 and 800 Kbps
121 participants and 3,345 paired comparisons
Loss concealment scheme
Source clips: one fast‐paced and one slow‐paced video clip
Two schemes: Frame copy (FC) and FC with frame skip (FCFS)
Packet loss rate: 1%, 5%, and 8%
91 participants and 2,745 paired comparisons
32. Inferred QoE Scores
(Figures: inferred QoE scores for the video codecs and the loss concealment schemes)
33. Participant Source
Laboratory
Recruit part‐time workers at an hourly rate of 8 USD
MTurk
Post experiments on the Mechanical Turk web site
Pay the participant 0.15 USD for each qualified experiment
Community
Seek participants on the website of an Internet community
with 1.5 million members
Pay the participant an amount of virtual currency that was
equivalent to one US cent for each qualified experiment
36. Conclusion
Crowdsourcing is not without limitations
no physical contact with participants
limited control over the experiment environment
limited to media deliverable over the Internet
With paired comparison and user input verification,
less monetary cost
wider participant diversity
shorter experiment cycle
evaluation quality maintained
37. May the crowd force be with you!
Thank You!
Kuan‐Ta Chen
Academia Sinica
ACM Multimedia 2009