Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data

Crowdsourcing is a significant source of data that needs to be analyzed and interpreted. How that analysis and interpretation are carried out influences both the quality of the output and the efficiency of the process. Visualization has proved to be an effective way of dealing with large amounts of data. In this paper we propose a visual analytics model, in the context of the CrowdTruth framework and the CrowdTruth metrics, for optimizing the crowdsourcing process and improving the quality of its data. The requirements for dynamic, scalable and interactive visualizations were gathered from the literature and from interviews with users of the framework.

Published in: Data & Analytics

  1. By Tatiana Cristea. Supervised by Lora Aroyo (VU) and Robert-Jan Sips (IBM).
  2. Visualizations for quality assessment of crowdsourced data. Diagram labels: noisy crowdsourced data, quality data.
  3. Current practices: based on the consensus of workers. CrowdTruth metrics: consider disagreement informative.
  4. Select from the list the objects depicted in the image: Balloon, Flower, Human, Car, Ghost, Person. Answers are shown for Worker 1, Worker 2 and Worker 3. Callout: unclear image (content unit). Can you identify the low quality worker(s)?
  5. Select from the list the objects depicted in the image: Balloon, Flower, Human, Car, Ghost, Person. Answers are shown for Worker 1, Worker 2 and Worker 3. Callout: not clearly separable answers. Can you identify the low quality worker(s)?
  6. Select from the list the objects depicted in the image: Balloon, Flower, Human, Car, Ghost, Person. Answers are shown for Worker 1, Worker 2 and Worker 3. Callout: low quality workers. Can you identify the low quality workers?
  7. Unit: how good is the unit for the specific task? Worker: how well did the worker understand the task? Annotation: are the annotation options clear and separable?
  8. Diagram: Job 1, Job 2, ..., Job N, each linked to its units, workers and annotations.
  9. Visualization approach for quality assessment of crowdsourced data: a) at the aggregate level, b) at a specific level, and c) in the context of their interdependencies.
  10. Requirements, extracted through interviews. Visualization of properties, statistics and metrics of: a single job/unit/worker; a collection of jobs/units/workers. Functional requirements: filtering and sorting; support for detection of outliers; visualization of connected workers, content units and jobs; support for comparative analysis; support for navigation between connected elements, etc.
  11. Demo tour
  12. We evaluated the design with 9 people who had different levels of experience with crowdsourcing tasks.
  13. Useful in: the assessment of quality; deep analysis of the data. But...
  14. The amount of information was a (little) bit overwhelming...
  15. The interactions are great! ... if you know about them.
  16. The time dimension is not always present...
  17. Create user profiles. Decouple the visualization component and provide it as a separate plugin. Add the time dimension to the visualizations.
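
The idea on slides 3 to 7, that disagreement among workers carries information about the worker, the unit and the annotation options, can be illustrated with a small sketch. It scores each worker by the cosine similarity between that worker's answer vector and the summed answers of the other workers on the same unit. This is only loosely modeled on the disagreement-based approach named in the slides and is not necessarily the exact CrowdTruth definition. The function names and the worker selections below are hypothetical, since the slides' actual checkbox selections are not available here.

```python
import math

# Answer options from the example task on slides 4 to 6.
OPTIONS = ["Balloon", "Flower", "Human", "Car", "Ghost", "Person"]


def to_vector(selected):
    """Encode a set of selected options as a 0/1 vector over OPTIONS."""
    return [1 if option in selected else 0 for option in OPTIONS]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def worker_unit_agreement(annotations):
    """Score each worker against the summed answers of all other workers.

    `annotations` maps a worker id to the set of options that worker chose
    for one content unit. A low score means the worker disagrees with the
    rest of the crowd on this unit (hypothetical helper, not a library API).
    """
    vectors = {worker: to_vector(sel) for worker, sel in annotations.items()}
    scores = {}
    for worker, vec in vectors.items():
        others = [v for w, v in vectors.items() if w != worker]
        # Column-wise sum of the other workers' vectors.
        rest = [sum(column) for column in zip(*others)]
        scores[worker] = cosine(vec, rest)
    return scores


# Hypothetical selections for a single unit, invented for illustration.
annotations = {
    "worker_1": {"Balloon", "Ghost"},
    "worker_2": {"Balloon", "Person"},
    "worker_3": {"Car", "Flower"},
}
scores = worker_unit_agreement(annotations)
for worker, score in sorted(scores.items()):
    print(worker, round(score, 3))
```

A low score for one worker on one unit is not conclusive by itself: as slides 4 and 5 suggest, the cause may be an unclear unit or overlapping options rather than a low quality worker, which is why the scores would need to be compared across many units and jobs.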
