An explorative approach for Crowdsourcing tasks design


Slides presented at the Third International Workshop on the Theory and Practice of Social Machines (at the WWW 2015 conference in Florence).



  1. AN EXPLORATIVE APPROACH FOR CROWDSOURCING TASKS DESIGN
     Marco Brambilla, Stefano Ceri, Andrea Mauri, Riccardo Volonterio
  2. Introduction
     • OBJECTIVE: selecting the best execution strategy for the specific human computation task
     • ISSUE 1: dealing with crowds introduces many competing objectives and constraints
     • ISSUE 2: very large datasets make the cost of selecting the wrong strategy high
     Dimensions at play: performers, selection, rewarding, cost, object-specific or global, time, quality, convergence criteria
  3. Current approaches
     • Tools that simplify the configuration
       • do not provide support on the PROs and CONs of the alternatives when defining settings
     • Mathematical formulations of the problem
       • limited to a small set of decisions
       • often fall into NP-hard classes
  4. Our approach to strategy selection
     • We propose a domain-independent, explorative design method
     • Rapid prototyping and execution in the small, in order to select the design parameters to be used on big datasets
     • Process: define a representative set of execution strategies; execute them on a small dataset; collect quality measures; decide the strategy to be used with the complete dataset
  5. Conceptual Model
     (figure: the conceptual model of crowdsourcing task design)
  6. Conceptual Model (2)
     • Platform: where the task will be executed
     • Cardinality: the number of objects shown to the performer
     • Reward: e.g., the cost of a HIT on Amazon Mechanical Turk, or game rewards
     • Agreement: e.g., a majority-based decision for each object
     This list can be extended to satisfy specific user needs; a data-structure sketch of these dimensions follows.
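
As an illustration (not part of the slides), the four dimensions above can be captured in a small record type; the field names and types are our own assumptions, not the authors' API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen, so strategies can be used as dict keys
class Strategy:
    """One candidate configuration of the design dimensions.

    Field names and types are illustrative assumptions only.
    """
    platform: str      # e.g. "AMT" (Amazon Mechanical Turk)
    cardinality: int   # number of objects shown per microtask
    agreement: float   # e.g. 2/3 for a "2 workers over 3" majority
    reward: float      # payment per answer, in dollars
```
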
  7. Candidate Strategy
     • Each candidate strategy is thus represented by a set of parameters describing the model instance considered: S = {s1, s2, ..., sn}, where n is the number of considered parameters
     • Example:
       • an execution on Amazon Mechanical Turk
       • 3 objects per HIT
       • "2 workers over 3" agreement
       • $0.01 per answer
       S_example = ["AMT", 3, 2/3, 0.01]
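
Building on the sketch above, a representative set of candidate strategies can be enumerated as the Cartesian product of the values chosen for each dimension; the value grids below are hypothetical:

```python
from itertools import product

# Hypothetical value grids for each design dimension.
platforms = ["AMT"]
cardinalities = [1, 3, 5]
agreements = [2/3, 3/5]
rewards = [0.01, 0.02]

candidates = [Strategy(p, c, a, r)
              for p, c, a, r in product(platforms, cardinalities,
                                        agreements, rewards)]

# The slide's example strategy: AMT, 3 objects per HIT,
# "2 workers over 3" agreement, $0.01 per answer.
s_example = Strategy("AMT", 3, 2/3, 0.01)
```
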
  8. Quality measures
     Strategies need to be evaluated using a set of quality measures:
     • Cohen's kappa coefficient: a statistical measure of inter-annotator agreement for categorical annotation tasks
     • Precision of responses: the percentage of correct responses
     • Execution time: the elapsed time needed to complete the whole task
     • Cost: the total amount of money spent, or the impact on the social network caused by our activity
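
A minimal sketch of how the first two measures could be computed, assuming ground-truth labels are available; cohen_kappa_score comes from scikit-learn, and the labels below are made up for illustration:

```python
from sklearn.metrics import cohen_kappa_score

def precision_of_responses(decisions, truth):
    """Fraction of objects whose final crowd decision matches the truth."""
    return sum(d == t for d, t in zip(decisions, truth)) / len(truth)

# Made-up labels for 5 images from two workers, plus the ground truth.
worker_a = ["actor", "other", "actor", "actor", "other"]
worker_b = ["actor", "other", "other", "actor", "other"]
truth    = ["actor", "other", "actor", "actor", "other"]

print(cohen_kappa_score(worker_a, worker_b))    # inter-annotator agreement
print(precision_of_responses(worker_b, truth))  # 0.8
```
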
  9. Evaluation of the strategies
     • Split the dataset in two (small and large), with |small| << |large|
     • Run all the strategies on the small dataset
     • Collect the quality measures
     • Select the "best" strategy
     A sketch of this loop follows.
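
Putting the pieces together, the explore-in-the-small procedure could look like this; run_strategy is a hypothetical stand-in for an actual crowdsourcing execution, stubbed here with dummy measures so the sketch runs:

```python
import random

def run_strategy(strategy, objects):
    """Hypothetical stand-in for posting the microtasks to the platform
    and collecting answers; returns dummy quality measures here."""
    return {"precision": random.random(), "cost_normalized": random.random()}

def evaluate_in_the_small(candidates, dataset, small_size, score):
    """Run every candidate on a random small sample; return the strategy
    maximizing `score` together with all collected measures."""
    small = random.sample(dataset, small_size)
    results = {s: run_strategy(s, small) for s in candidates}
    best = max(results, key=lambda s: score(results[s]))
    return best, results

# E.g. 90 objects out of 900, as in the experiment on the next slides:
# best, results = evaluate_in_the_small(candidates, list(range(900)), 90,
#                                       lambda m: m["precision"])
```

The winning strategy is then the only one executed on the large dataset.
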
  10. Experiment
      Two main assumptions:
      1. The executions of a strategy on the small and on the large dataset are correlated
      2. The cost of performing all experiments in the small, followed by one (the best) experiment in the large, is affordable
  11. Experiment (2)
      • We designed an image labeling crowdsourcing task in which we ask the crowd to classify pictures related to actors
      • Design dimensions:
        • number of images shown in each microtask
        • agreement level for each picture
        • cost of each AMT HIT
      • Dataset:
        • 900 images related to actors, retrieved from Google Images
        • a random subselection of 90 images as the small dataset
  12. Experiment (3)
      • We then selected 8 different strategies and ran them on both the small and the large dataset (to validate the correlation hypothesis)
  13. Experiment (4)
      • We calculated all quality measures for the strategies
      • The selection of the best strategy depends on the weight given to the measures
      • E.g., in the example we compared the strategies with respect to the trade-off between precision and cost, as in the sketch below
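
One plausible way to encode such a precision/cost trade-off as a scoring function for the selection step above; the linear form and the weights are our own illustrative choice, not the paper's:

```python
def tradeoff_score(measures, w_precision=1.0, w_cost=0.5):
    """Higher is better: reward precision, penalize (normalized) cost.

    Both the linear combination and the default weights are assumptions
    made for illustration only.
    """
    return (w_precision * measures["precision"]
            - w_cost * measures["cost_normalized"])

# Plugged into the selection sketch above:
# best, _ = evaluate_in_the_small(candidates, dataset, 90, tradeoff_score)
```
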
  14. Results
      • First assumption: we calculated the Pearson correlation coefficient between the small- and large-dataset executions, for each design dimension:

        Measure  | Cost  | Precision | Agreement | Duration
        Pearson  | 0.999 | 0.619     | 0.707     | 0.915
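
A sketch of how that check could be run with scipy; the paired per-strategy measurements below are placeholders, not the paper's data:

```python
from scipy.stats import pearsonr

# Placeholder per-strategy costs measured on the small and large datasets
# (8 strategies, as in the experiment).
cost_small = [2.1, 2.8, 3.0, 2.4, 3.3, 2.9, 3.5, 2.5]
cost_large = [16.9, 22.4, 24.1, 19.0, 26.3, 23.0, 28.1, 20.2]

r, p = pearsonr(cost_small, cost_large)
print(f"Pearson r = {r:.3f} (p = {p:.3f})")
```
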
  15. Results (2)
      • Second assumption:
        • cost of executing all 8 strategies on the small dataset: $22.49
        • cost of executing the selected strategy on the large dataset: $16.86
        • total: $39.35
      • The difference between the cost of experiments in the small and in the large grows sharply with the size of the input data
        • Hint: in real scenarios (tens of thousands of objects), the small dataset is at least 2 orders of magnitude smaller than the big one
      • Selecting a strategy at random may instead have yielded worse quality at a higher cost
  16. Conclusion
      • Our method is applicable and can lead to quantifiable advantages in cost and quality
      • The trade-off between the additional cost and the added value is affordable
      Future work:
      • Formalizing the process for selecting the candidate strategies and the "best" one (currently an empirical selection)
      • Iterative tuning: multi-level or separate dimensions
      • Testing on bigger datasets and with more design dimensions
  17. Thanks for your attention. Any questions?
      Stefano Ceri: stefano.ceri@polimi.it
      Marco Brambilla: marco.brambilla@polimi.it
      Andrea Mauri: andrea.mauri@polimi.it
      Riccardo Volonterio: riccardo.volonterio@polimi.it
