Crowd scheduling www2016




  1. Scheduling Human Intelligence Tasks in Multi-Tenant Crowd-Powered Systems
     Djellel Eddine Difallah, University of Fribourg, CH
     Gianluca Demartini, University of Sheffield, UK
     Philippe Cudré-Mauroux, University of Fribourg, CH
  2. Introduction
     • Crowdsourcing relies on a large pool of humans to perform complex tasks (paid workers, volunteers, players, etc.)
     • A crowdsourcing platform (e.g., CrowdFlower, Amazon MTurk) allows requesters to tap into a pool of paid workers in a shared-resource fashion
     • Requesters publish batches of similar tasks to be completed in exchange for a monetary reward
     • Workers can arrive and leave at any point in time and can selectively focus on an arbitrary subset of the tasks
  3. Introduction: Observations
     • Few workers perform many tasks, followed by a long tail of workers performing fewer tasks [Ipeirotis 2010; Franklin et al. 2011]
     • Large jobs are fast at the beginning, then lose their momentum toward the end [Difallah et al. 2014]
     • We suspect that this leads to batches being treated unequally, depending on batch size, freshness, requester, and price [Difallah et al. 2015]
  4. Introduction: Data Analysis
     • Most of the batches present on AMT have 10 HITs or less
     • The overall platform throughput is dominated by larger batches
     [Figure: (a) batch distribution per size and (b) cumulative throughput per batch size, Jan-Apr; size classes: Tiny [0,10], Small [10,100], Medium [100,1000], Large [1000,Inf]]
  5. Motivation: The Case of Multi-Tenant Crowd-Powered Systems (CPS)
     • Definition: a CPS serves multiple customers/users (e.g., a crowd DBMS)
     • The system posts a batch of tasks on the crowdsourcing platform per user query
     • The CPS is in constant competition to attract workers: with itself (multiple tenants) and with other requesters
     • Job starvation is problematic in business applications
  6. Contributions
     • We design a novel crowdsourcing system architecture that allows job scheduling for a CPS on top of a traditional crowdsourcing platform
     • We devise a scheduling algorithm that embodies a set of general design requirements
     • We empirically evaluate our setup on Amazon MTurk, with a real crowd and a set of scheduling algorithms
  7. HIT-Bundle: Definition
     • Scheduling requires that we have control over the serving process of tasks
     • A HIT-Bundle is a batch that contains heterogeneous tasks
     • All tasks generated by the CPS are published through the HIT-Bundle (a sketch of the structure follows below)
     [Diagram: Batch 1, Batch 2, Batch 3, and Batch 4 merged into a single HIT-Bundle]
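A minimal sketch of how a HIT-Bundle could be represented, assuming a simple in-memory structure; the class and method names are illustrative, not from the paper:

```python
from collections import defaultdict

class HITBundle:
    """One heterogeneous batch published on the platform; each pending task
    keeps a reference to the tenant batch it came from (illustrative sketch)."""

    def __init__(self):
        # batch_id -> list of pending tasks from that tenant batch
        self.pending = defaultdict(list)

    def add_batch(self, batch_id, tasks):
        # Merge a tenant's batch of homogeneous tasks into the shared bundle.
        self.pending[batch_id].extend(tasks)

    def next_task(self, batch_id):
        # The scheduler (not the platform) decides which batch serves the next
        # worker request; the platform only ever sees one large HIT-Bundle.
        return self.pending[batch_id].pop(0)
```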
  8. HIT-Bundle: Micro Experiment
     • Comparison of batch execution time using different grouping strategies: distinct batches vs. combined in a HIT-Bundle
     [Figure: #HITs remaining over time (seconds) for batches B6 and B7, run as distinct batches and within a HIT-Bundle]
  9. Proposed CPS Architecture
     [Architecture diagram; main components: multi-tenant crowd-powered system (crowdsourcing app, crowdsourcing decision engine, batch input merger, resource tracker, config file) issuing crowdsourced queries; batch catalog (Batch A $$, Batch B $$$, Batch C $, ...); HIT-Bundle manager (batch merging, HIT-Bundle creation/update); HIT scheduler, progress monitor, HIT manager, and HIT results aggregator interfacing with the crowdsourcing platform API, which serves HITs to human workers via an external HIT page]
  10. Scheduling for the Crowd: Design Guidelines
     • (R1) Runtime Scalability: adopt a runtime scheduler that a) dynamically adapts to the current availability of the crowd, and b) scales to make real-time scheduling decisions as the work demand grows
     • (R2) Fairness: the scheduler must provide steady progress to large requests without blocking or starving the smaller requests
     • (R3) Priority: the scheduler must be sensitive to clients who have higher priority (e.g., those who pay more)
     • (R4) Human Awareness: unlike machines, human performance is affected by many factors, including context switching, training effects, boredom, task difficulty, and interestingness
  11. (Weighted) Fair Scheduler
     • Fair Scheduling (FS) (R1, R2):
       • Keep track of how many tasks per batch are currently assigned (running_tasks)
       • Assign a task from the batch with the minimum running_tasks
     • Weighted Fair Sharing (WFS) variant (R3):
       • Compute a weight based on priority (e.g., price): weight(Bj) = p(Bj) / sum(p(B))
       • Assign a task from the batch with the minimum running_tasks / weight
     • Pros: ensures that all batches receive a proportional share of the available workers
     • Cons: does not satisfy (R4) Human Awareness (a sketch of FS and WFS follows below)
     [Diagram: a HIT-Bundle with 7 tasks running; on get_task(), FS and WFS return tasks from different batches priced $0.10 (w=0.5), $0.05 (w=0.25), $0.05 (w=0.25)]
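The selection rules above can be made concrete with a short sketch; the batch identifiers, prices, and running-task counts are illustrative assumptions chosen to mirror the slide's example:

```python
def fair_pick(batches, running_tasks):
    """FS: serve the batch with the fewest currently assigned tasks (R1, R2)."""
    return min(batches, key=lambda b: running_tasks[b])

def weighted_fair_pick(batches, running_tasks, price):
    """WFS: normalise the running count by a priority weight derived from price (R3):
    weight(Bj) = p(Bj) / sum(p(B))."""
    total = sum(price[b] for b in batches)
    weight = {b: price[b] / total for b in batches}
    return min(batches, key=lambda b: running_tasks[b] / weight[b])

# Three batches priced $0.10, $0.05, $0.05 (weights 0.5, 0.25, 0.25),
# with 7 tasks currently running in total.
price = {"B1": 0.10, "B2": 0.05, "B3": 0.05}
running = {"B1": 3, "B2": 2, "B3": 2}
print(fair_pick(price, running))                  # -> "B2": fewest running tasks (tied with B3)
print(weighted_fair_pick(price, running, price))  # -> "B1": 3/0.5 = 6 beats 2/0.25 = 8
```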
  12. Worker Context Switch: Micro Experiment
     • We run a HIT-Bundle with heterogeneous tasks and compute the average execution time for each HIT
     • RR: Round Robin, the task type changes every time
     • SEQ10 / SEQ25: task types are alternated every 10, respectively 25, tasks
     • The mean task execution time is significantly lower for SEQ25 (p-value = 0.023)
     [Figure: execution time per HIT (seconds) for RR, SEQ10, and SEQ25]
  13. Worker Conscious Fair Scheduling (WCFS)
     • Goal: reduce the context switching introduced by having workers continuously change task types
     • We modify Fair Sharing with Delayed Scheduling [Zaharia et al. 2010]
     • A task will give up its priority up to K times until a worker who just completed a similar task is available again (see the sketch below)
     • Pros: we satisfy all our design requirements; a worker receives longer sequences of similar tasks
     • Cons: K needs to be set
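A minimal sketch of the delayed-scheduling idea behind WCFS, assuming each batch exposes an id and a task_type and that the last task type completed by each worker is tracked; these details are illustrative assumptions, not the paper's exact implementation:

```python
def wcfs_pick(batches, running_tasks, weight, skipped, last_type, worker_id, K):
    """Worker Conscious Fair Scheduling: follow weighted fair sharing, but let the
    head-of-line batch yield up to K times in favour of a batch whose task type
    matches what this worker just completed, reducing context switches (R4)."""
    # Batches ordered by the weighted fair-sharing criterion.
    ordered = sorted(batches, key=lambda b: running_tasks[b.id] / weight[b.id])
    head = ordered[0]
    preferred = last_type.get(worker_id)
    if preferred and head.task_type != preferred and skipped[head.id] < K:
        for b in ordered[1:]:
            if b.task_type == preferred:
                # The head batch gives up its turn (at most K times in a row) so the
                # worker keeps receiving a longer sequence of similar tasks.
                skipped[head.id] += 1
                return b
    skipped[head.id] = 0  # head batch is served (or K is exhausted); reset its skip count
    return head
```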
  14. Experiments: Controlled Setup
     • On Amazon Mechanical Turk (no simulations)
     • HIT-Bundle with 5 different task types
     • We artificially ensure that we have num_workers > 10 before starting an experiment
     • We compare against basic schedulers: First In First Out (FIFO), Round Robin (RR), Shortest Job First (SJF)
  15. Controlled Experiments: Latency
     All experiments are run in parallel. FIFO order: [B1, B2, B3, B4, B5]; SJF order: [B4, B3, B5, B2, B1], based on previous evidence.
     • FIFO finishes jobs one after the other
     • While SJF finishes the shortest jobs first
     • FS and RR offer a balanced workforce
     [Figure: (a) batch latency per batch and (b) overall experiment latency, for FIFO, FS, RR, and SJF]
  16. Experiments: Varying the Control Factors
     The Weighted Fair Scheduler is used.
     • (a) Effect of increasing B2's priority (price raised from $0.02 to $0.05) on batch execution time: B2 executes faster
     • (b) Effect of varying the number of crowd workers (10 vs. 20) involved in the completion of the HIT batches: the load is rebalanced (albeit with different proportions) and all batches execute faster
     [Figure: batch execution times per batch when (a) varying the price and (b) varying the workforce]
  17. Experiments in the Wild: Execution Trace
     [Figure: number of active workers over time (12:20 to 12:50) for FS, the individual batches, and WCFS]
  18. Conclusions
     • Batch starvation in crowdsourcing is problematic for requesters
     • We introduce a new scheduling layer that shares a pool of crowd workers among multiple tenants of a crowd-powered system
     • We perform evaluations in a real setup with real workers
     • We show that a HIT-Bundle increases the overall throughput
     • Our technique, Worker Conscious Fair Sharing, inspired by large-scale data processing frameworks, minimises context switching
     • Toward Service Level Agreement aware scheduling for crowdsourcing platforms
     Code: