The Dynamics of Micro-Task Crowdsourcing

Micro-task crowdsourcing is rapidly gaining popularity among research communities and businesses as a means to leverage human computation in their daily operations. Unlike any other service, a crowdsourcing platform is in fact a marketplace subject to human factors that affect its performance, both in terms of speed and quality. Such factors shape the dynamics of the crowdsourcing market: for example, a known behavior of these markets is that increasing the reward of a set of tasks leads to faster results. However, it is still unclear how the different dimensions interact with each other: reward, task type, market competition, requester reputation, etc.
In this paper, we adopt a data-driven approach to (A) perform a long-term analysis of a popular micro-task crowdsourcing platform and understand the evolution of its main actors (workers, requesters, and the platform); (B) leverage the main findings of our five-year log analysis to propose features for a predictive model that estimates the expected performance of any batch at a specific point in time, showing that the number of tasks left in a batch and how recent the batch is are two key features of the prediction; and (C) conduct an analysis of demand (new tasks posted by requesters) and supply (the number of tasks completed by the workforce), showing how they affect task prices on the marketplace.


  1. The Dynamics of Micro-Task Crowdsourcing: The Case of Amazon MTurk. Djellel Eddine Difallah, Michele Catasta, Gianluca Demartini, Panos Ipeirotis, Philippe Cudré-Mauroux. WWW'15, 20th May 2015, Florence
  2. Background: Crowdsourcing is an effective solution to certain classes of problems
  3. Background: A crowdsourcing platform allows requesters to publish a crowdsourcing request (a batch) composed of multiple tasks (HITs), and to programmatically invoke the crowd with APIs
  4-5. Background: Paid micro-task crowdsourcing scales out but remains highly unpredictable. [Plot: batch throughput, in #HITs/minute, over time]
  6. SLAs are expensive
  7. MTurk is a marketplace for HITs. Direct factors: price, time of day, #workers, #HITs, etc. Other factors: forums, reputation systems (Turkopticon), recommendation systems (Openturk)
  8. A data-driven approach
  10. ...Five years later [2009-2014]: mturk-tracker collected 2.5 million distinct batches with over 130 million HITs
  11. mturk-tracker.com collects metadata about each visible batch (title, description, rewards, required qualifications, HITs available, etc.) and records batch progress every ~20 minutes. Note that the tracker reports data only periodically and does not capture fine-grained information (e.g., real-time variations)
  12. Menu: 1) Notable facts extracted from the data; 2) Large-scale HIT type classification; 3) Analyzing the features affecting batch throughput; 4) Market analysis
  13. 1) Notable Facts Extracted from the Data
  14. Country-specific HITs: US and India?
  15. Country-specific HITs: workers from the US, India, and Canada are the most sought after
  16. Distribution of batch size: a "power law"
  17. Evolution of batch sizes: very large batches start to appear
  18. HIT pricing: is 1 cent per HIT the norm?
  19. HIT pricing: 5 cents is the new 1 cent
  20. Requesters and reward evolution: an increasing number of new and distinct requesters
  21. 2) Large-scale HIT Type Classification
  22. HIT classes: classify HITs into types (Gadiraju et al., 2014): Information Finding (IF), Verification and Validation (VV), Interpretation and Analysis (IA), Content Creation (CC), Surveys (SU), Content Access (CA)
  23. Supervised classification with the crowd: we trained a Support Vector Machine (SVM) model on HIT title, description, keywords, reward, date, allocated time, and batch size. We created labeled data on MTurk for 5,000 uniformly sampled HITs; our labeling HIT used 3 repetitions, and consensus was reached for 89% of the tasks. 10-fold cross-validation yielded a precision of 0.895, a recall of 0.899, and an F-measure of 0.895. We then performed a large-scale classification of all 2.5M HITs
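The classification step above can be sketched roughly as follows, assuming scikit-learn. Only the six class labels come from the deck; the example HIT texts, the TF-IDF plus linear-SVM pipeline, and the toy training set are illustrative assumptions, not the paper's actual features or labeled data.

```python
# Illustrative sketch (assumed scikit-learn pipeline; toy data, not the
# paper's 5,000 labeled HITs). Text fields are vectorized with TF-IDF
# and classified with a linear SVM, evaluated by 10-fold CV as on the slide.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy HITs: title-like text labeled with one of the six classes.
hits = [
    ("Find the official website of this company", "IF"),
    ("Verify that the address matches the photo", "VV"),
    ("Rate the sentiment of this tweet", "IA"),
    ("Write a short product description", "CC"),
    ("Answer a 10-minute demographics survey", "SU"),
    ("Watch this video and click the link", "CA"),
] * 20  # repeated so the 10 cross-validation folds are non-trivial

texts = [text for text, _ in hits]
labels = [label for _, label in hits]

model = make_pipeline(TfidfVectorizer(), LinearSVC())
scores = cross_val_score(model, texts, labels, cv=10)
print(len(scores), round(scores.mean(), 3))
```

In the paper's setting the non-textual fields (reward, batch size, etc.) would be appended as extra features alongside the text vectors.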
  24. Distribution of HIT types: fewer Content Access batches; Content Creation is the most popular
  25. 3) Analyzing the Features Affecting Batch Throughput. [Plot: batch throughput, in #HITs/minute, over time]
  26. Batch throughput prediction: 29 features. HIT features: HITs available, start time, reward, description length, title length, keywords, requester_id, time_allotted, task type, age (minutes), etc. Market features: total HITs available, HITs arrived, rewards arrived, % HITs completed, etc.
  27-28. Batch throughput prediction: predict the throughput of a batch at time T by training a Random Forest regression model on samples taken in the [T-delta, T) time span, using the 29 features (including the type of the batch) and hourly data in the range June-October 2014. We sampled 50 time points for evaluation purposes. We are interested in cases where the prediction works reasonably
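A minimal sketch of this prediction setup, assuming scikit-learn; the three synthetic features below stand in for the paper's 29 HIT and market features, and the generated throughput is driven mainly by batch size and age, matching what the deck reports as the key features.

```python
# Sketch of the throughput-prediction setup (assumptions: scikit-learn,
# synthetic data). Three stand-in features replace the paper's 29;
# throughput is generated so that batch size and age dominate.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 500
hits_available = rng.integers(1, 5000, n).astype(float)  # tasks left in the batch
age_minutes = rng.integers(1, 10000, n).astype(float)    # time since batch creation
reward_cents = rng.integers(1, 50, n).astype(float)
X = np.column_stack([hits_available, age_minutes, reward_cents])

# Synthetic throughput: larger, younger batches complete faster.
y = 0.01 * hits_available - 0.001 * age_minutes + rng.normal(0, 1, n)

# Train on "past" samples, then predict the held-out "current" ones.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:400], y[:400])
preds = model.predict(X[400:])
print(preds.shape, round(model.score(X[400:], y[400:]), 2))
```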
  29. Predicted vs. actual batch throughput (delta = 4 hours): prediction works best for larger batches with large momentum
  30-31. Significant features: which features contribute most when the prediction works reasonably? We proceed by feature ablation, re-running the prediction with one feature removed at a time (1,000 samples). The top features: HITs_Available (the number of tasks in the batch) and Age_Minutes (how long ago the batch was created)
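The ablation procedure can be sketched like this, again assuming scikit-learn and synthetic data in which the first feature matters most by construction; the feature names mirror the slide but the data and model settings are illustrative.

```python
# Sketch of feature ablation: refit with one feature removed at a time
# and rank features by the resulting drop in held-out R^2.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n = 400
features = ["hits_available", "age_minutes", "reward_cents"]
X = rng.normal(size=(n, 3))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0, 0.1, n)  # reward is irrelevant

def r2_without(drop_idx):
    """Held-out R^2 with one feature column removed (None = keep all)."""
    cols = [i for i in range(X.shape[1]) if i != drop_idx]
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    model.fit(X[:300, cols], y[:300])
    return model.score(X[300:, cols], y[300:])

baseline = r2_without(None)
drops = {f: baseline - r2_without(i) for i, f in enumerate(features)}
# The most significant feature is the one whose removal hurts R^2 most.
print(max(drops, key=drops.get))
```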
  32. 4) Market Analysis. Demand: the number of new tasks published on the platform by the requesters. Supply: the workforce that the crowd is providing
  33. Supply elasticity: how does the market react when new tasks arrive on the platform?
  34. Supply elasticity: we regressed the percentage of work done (within 1 hour) against the number of new HITs
  35-36. Supply elasticity: intercept = 2.5, slope = 0.5%; 20% of new work gets completed within an hour
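In miniature, that regression looks like the following NumPy sketch. The data is synthetic, planted with the intercept (2.5) and slope (0.5%) reported on the slide, so the least-squares fit simply recovers those values; the exact units and sample used in the paper are not shown here.

```python
# Sketch of the supply-elasticity regression (NumPy only; synthetic
# data planted with the slide's reported intercept and slope).
import numpy as np

rng = np.random.default_rng(2)
new_hits = rng.uniform(0, 5000, 200)   # new HITs arriving in an hour
pct_done = 2.5 + 0.005 * new_hits + rng.normal(0, 0.5, 200)  # % done in 1h

slope, intercept = np.polyfit(new_hits, pct_done, 1)  # degree-1 OLS fit
print(round(intercept, 1), round(slope, 4))
```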
  37. Demand and supply periodicity. [Plot: demand and supply time series]
  38. Demand and supply periodicity: strong weekly periodicity (7-10 days)
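One standard way to surface such periodicity is the sample autocorrelation function; here is a NumPy-only sketch on a synthetic daily series with a planted 7-day cycle (the tracker's real demand/supply series is not reproduced here).

```python
# Sketch of detecting periodicity via sample autocorrelation (NumPy
# only; synthetic daily demand with a planted 7-day cycle).
import numpy as np

rng = np.random.default_rng(3)
days = np.arange(365)
demand = 100 + 30 * np.sin(2 * np.pi * days / 7) + rng.normal(0, 5, 365)

x = demand - demand.mean()
acf = np.correlate(x, x, mode="full")[len(x) - 1:]  # lags 0..364
acf = acf / acf[0]

# The dominant period is the lag (searched here over 2..14 days) with
# the highest autocorrelation.
lag = 2 + int(np.argmax(acf[2:15]))
print(lag)
```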
  39. Conclusions: long-term data analysis uncovers hidden trends; large-scale HIT classification is feasible; the important features in throughput prediction are HITs available and Age_minutes; supply is elastic (more work available -> more work done); supply and demand are periodic (7-10 days)
  40-41. Is a crowdsourcing marketplace the right paradigm for efficient and predictable crowdsourcing? Q&A: Djellel Difallah, ded@exascale.info
