Programming with People

Humans can perform many tasks with ease that remain difficult or impossible for computers. Crowdsourcing platforms like Amazon's Mechanical Turk make it possible to harness human-based computational power on an unprecedented scale. However, their utility as a general-purpose computational platform remains limited. The lack of complete automation makes it difficult to orchestrate complex or interrelated tasks. Scheduling human workers to reduce latency costs real money, and jobs must be monitored and rescheduled when workers fail to complete their tasks. Furthermore, it is often difficult to predict the length of time and payment that should be budgeted for a given task. Finally, the results of human-based computations are not necessarily reliable, both because human skills and accuracy vary widely, and because workers have a financial incentive to minimize their effort.

This talk presents AutoMan, the first fully automatic crowdprogramming system. AutoMan integrates human-based computations into a standard programming language as ordinary function calls, which can be intermixed freely with traditional functions. This abstraction allows AutoMan programmers to focus on their programming logic. An AutoMan program specifies a confidence level for the overall computation and a budget. The AutoMan runtime system then transparently manages all details necessary for scheduling, pricing, and quality control. AutoMan automatically schedules human tasks for each computation until it achieves the desired confidence level; monitors, reprices, and restarts human tasks as necessary; and maximizes parallelism across human workers while staying under budget.

AutoMan is available for download at www.automan-lang.org.
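
The quality-control idea in the abstract can be illustrated with the unanimous-agreement bound from the slide transcript, Pr[agree] = k (1/k)^n, where k is the number of answer choices and n the number of workers. The sketch below is plain Scala, not AutoMan's API: `AgreementSketch`, `randomAgreement`, and `workersNeeded` are hypothetical names, and it covers only the unanimous case under the random-guessing assumption. The real runtime evidently does more, since the example execution in the slides returns on 5 of 6 agreeing answers.

```scala
object AgreementSketch {
  /** Probability that n workers guessing uniformly at random among
    * k choices all give the same answer: k * (1/k)^n = k^(1 - n). */
  def randomAgreement(k: Double, n: Int): Double =
    math.pow(k, 1 - n)

  /** Smallest number of unanimous workers for which agreement by pure
    * guessing is less likely than 1 - confidence (assumption: unanimity only). */
  def workersNeeded(k: Double, confidence: Double): Int = {
    require(k >= 2 && confidence > 0 && confidence < 1)
    val alpha = 1.0 - confidence
    Iterator.from(2).find(n => randomAgreement(k, n) < alpha).get
  }

  def main(args: Array[String]): Unit = {
    // Question shapes taken from the slides: yes/no, 3 radio buttons,
    // 5 checkboxes (2^5 combinations), 6-character plate (36^6 strings).
    val cases = List(
      "yes/no" -> 2.0,
      "radio button, k = 3" -> 3.0,
      "5 checkboxes" -> 32.0,
      "6-character plate" -> math.pow(36, 6)
    )
    for ((label, k) <- cases)
      println(s"$label: ${workersNeeded(k, 0.95)} unanimous workers")
  }
}
```

Under these assumptions a yes/no question needs six unanimous workers at 95% confidence, while questions with many possible answers need only two, which matches the slides' remark that more choices mean fewer people.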

Programming with People: Presentation Transcript

  • Dan Barowy, Charlie Curtsinger, Emery Berger, Andrew McGregor. Programming with People: Integrating Human-Based & Digital Computation
  • Computers: really good at some tasks… decoding the human genome
  • designing your next car…
  • blood flow simulation…
  • (science) blood flow simulation…
  • Not so good at other tasks…
  • “is this a giraffe?”
  • isGiraffe(image) (not a real function!)
  • Can we “implement” this function? isGiraffe( )
  • Can we “implement” this function? isGiraffe( ) We could just ask people!
  • isGiraffe( ) Find people via MTurk…
  • http://mturk.com MTurk = Amazon’s “Mechanical Turk”
  • A sample task.
  • Original “Mechanical Turk”: 18th-century chess machine (!)
  • The secret. Amazon’s service also looks like a computer but has people inside.
  • Now we implement this function with MTurk workers… isGiraffe( ) → True
  • Q1: How much is this task worth? isGiraffe( ) → True
  • Q2: How much time should it take? isGiraffe( ) → True
  • Q3: What if a worker gets it wrong? isGiraffe( ) → False
  • Why would a worker get it wrong? False
  • Incompetent or lazy (“Homer”). False
  • Bots (Internet + $ = Scammers). False
  • Both just guessing answers.
  • Or… evil genius: deliberately picks wrong answer. False
  • No need to worry about evil geniuses: we are fine with a random adversary model
  • Random adversary model, i.e., Homer & Bender = OK.
  • Random adversary model, i.e., Homer & Bender = OK. We rule Dr. Evil out. Why?
  • MTurk tracks approval rates and work records: long-term financial incentive
  • Use of financial credentials makes it hard to gin up new accounts, and credentials limit Sybil attacks
  • isGiraffe( ) → ? But how do we know any one person is not a Homer or a Bender?
  • isGiraffe( ) Idea: vote. If both agree on the answer, we’re happy, right?
  • isGiraffe( ) Not so much. Random = 50% chance of agreement. Pr[agree] = 1/2
  • BUT: can dramatically reduce likelihood by increasing # of workers. Pr[agree] = 1/32 < 5%
  • Pr[agree] = k (1/k)^n (unanimous agreement), where k = # choices, n = # workers. (See paper for more details.)
  • ↑ choices ⇒ ↓ Pr[agree]. More choices = fewer people
  • isGiraffe( )( ) AutoMan: DSL in Scala, runs on any JVM.
  • isGiraffe( )( ) AutoMan programmer-specified: total $ for computation
  • isGiraffe( )( ) AutoMan programmer-specified: total $ for computation; confidence level (per function): 95% (p < 0.05)
  • isGiraffe( ) US minimum wage, adaptive doubling. AutoMan ties pay and time; doubles both if no workers.
  • isGiraffe( ) US minimum wage, adaptive doubling. Initially: all tasks 30 seconds. 30s, $0.06 ($7.25 / 120)
  • isGiraffe( ) US minimum wage, adaptive doubling prevents gaming. No one shows up: both doubled. 30s, $0.06 ($7.25 / 120) → 60s, $0.12
  • You might think this is exploitable: just wait for jobs to rise in price.
  • Strategy fails when other people are around & grab money.
  • Math: workers should never wait. Expected earnings after some # of rounds… E[gain] = base * (P_avail)^round * multiplier^round
  • Even odds somebody will take the money… E[gain] = base * (1/2)^round * multiplier^round
  • Doubling increases wage after each round… E[gain] = base * (1/2)^round * 2^round
  • Terms dependent on rounds cancel out. E[gain] = base
  • E[gain] = base: no incentive to wait
  • isGiraffe( ) → True, 95% confidence. AutoMan manages time, $, quality.
  • How many giraffes are in this picture? k = 3 choices! AutoMan handles “radio button” questions
  • How many giraffes are in this picture? k = 3 choices! Risk: Homer & Bender always guess
  • How many giraffes are in this picture? k = 3 choices! E.g., always choose first option.
  • How many giraffes are in this picture? k = 3 choices! To combat this, AutoMan randomizes answers.
  • Which are from Sesame Street? Kermit the Frog, Spongebob Squarepants, Cookie Monster, The Count, Oscar the Grouch. ☐ ☐ ☐ ☐ ☐ 2^5 choices! “Checkbox” questions
  • Which are from Sesame Street? Kermit the Frog, Spongebob Squarepants, Cookie Monster, The Count, Oscar the Grouch. ☑ ☑ ☑ ☑ ☑ 2^5 choices! Same risk: random respondents
  • Which are from Sesame Street? Kermit the Frog, Spongebob Squarepants, Cookie Monster, The Count, Oscar the Grouch. ☑ ☑ ☐ ☑ ☐ 2^5 choices! AutoMan checks each randomly
  • What does this license plate say? 36^d choices! XXXXXX = [A-Z0-9]{6}: 36^6 choices = 2176782336. Last question category: constrained free-text
  • Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawns 3 tasks @ $0.06; 30s work. t1 t2 t3. Example real execution
  • Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawns 3 tasks @ $0.06; 30s work. t1 t2 t3. 1m 50s
  • Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawns 3 tasks @ $0.06; 30s work. t1 t2 t3. 1m 50s, 2m 30s
  • Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawns 3 tasks @ $0.06; 30s work. t1 t2 t3. 1m 50s, 2m 30s, 2m 50s
  • Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawns 3 tasks @ $0.06; 30s work. t1 t2 t3. 1m 50s, 2m 30s, 2m 50s. Inconclusive!
  • Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawns 3 more tasks. t1 t2 t3 t4 t5 t6
  • Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawns 3 more tasks. t1 t2 t3 t4 t5 t6. 7m
  • Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawns 3 more tasks. t1 t2 t3 t4 t5 t6. 18m 50s, 7m
  • Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawns 3 more tasks. t1 t2 t3 t4 t5 t6. 7m, 18m 50s, 51m. Timeout: double pay and time
  • Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawn 1 more task @ $0.12, 60s work. t1 t2 t3 t4 t5 t6 t7
  • Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawn 1 more task @ $0.12, 60s work. t1 t2 t3 t4 t5 t6 t7. 1h 9m 50s; cost = $0.36. AUTOMAN: 5 out of 6 ⇒ 95% confidence; return
  • More sophisticated function: read_plate( )
  • Success rate of real system! 12.2% [Maryland State Highway Administration]
  • Easy under optimal conditions
  • More complex in general
  • “Difficult” set of plates
  • Easier to read than CAPTCHAs!
  • Real task as posted on MTurk
  • Workflow: pictures to strings
  • Actual AutoMan code:
      def is_car(img_url: String) =
        a.RadioButtonQuestion { q =>
          q.budget = 1.00
          q.confidence = 0.95
          q.text = "Is this a car?"
          q.image_url = img_url
          q.options = List(
            a.Option(yes, "Yes"),
            a.Option(no, "No")
          )
        }
  • Actual AutoMan code:
      def get_plate_text(img_url: String) =
        a.FreeTextQuestion { q =>
          q.text = "What does this plate say?"
          q.image_url = img_url
          q.pattern = "XXXXXYYY"
        }
  • Example execution (timeline figure, t1 to t8). “Is this a vehicle?”: post tasks 1–5 at $0.06; w1–w5 all answer yes; unanimous agreement! ⇒ yes. “What does the license plate say?”: post tasks at $0.06; w6: 767JFK, w7: 767JKF; workers disagree! Post more tasks; timeout! Repost at $0.12; w8: 767JKF; remaining task cancelled ⇒ 767JKF.
  • AutoMan evaluation. MediaLab LPR database, “extremely difficult” dataset, 144 plates. Accuracy: 91.6% (> 12.2%!). Average cost: 12.08 cents. Latency: < 2 minutes per image
  • AUTOMAN: Programming with People. www.automan-lang.org. Dan Barowy, Charlie Curtsinger, Emery Berger, Andrew McGregor. read_plate( ):
      def read_plate(url: String) =
        a.FreeTextQuestion { q =>
          q.text = "What does this plate say?"
          q.image_url = url
          q.pattern = "XXXXXYYY"
        }
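
The pricing slides above (30s at $0.06, both doubled when no one shows up, and E[gain] = base * (1/2)^round * 2^round = base) can be checked with a few lines of plain Scala. This is a sketch under the slides' stated assumptions, not AutoMan's implementation: `PricingSketch`, `Offer`, `offer`, and `expectedGain` are hypothetical names, rounds are counted from 0, and rewards are held in whole cents.

```scala
object PricingSketch {
  /** Time limit and reward offered in one round (hypothetical type). */
  final case class Offer(round: Int, seconds: Int, rewardCents: Int)

  /** Offer for a 0-based round: start at 30 s / 6 cents and double both
    * each time the previous round timed out with no taker. */
  def offer(round: Int, baseSeconds: Int = 30, baseCents: Int = 6): Offer =
    Offer(round, baseSeconds << round, baseCents << round)

  /** Expected gain for a worker who waits `round` rounds before accepting:
    * base * pAvail^round * multiplier^round, where pAvail is the chance
    * the task is still available after each round. */
  def expectedGain(baseCents: Double, pAvail: Double,
                   multiplier: Double, round: Int): Double =
    baseCents * math.pow(pAvail, round) * math.pow(multiplier, round)

  def main(args: Array[String]): Unit = {
    // Offers for the first four rounds: 30s/6c, 60s/12c, 120s/24c, 240s/48c.
    (0 to 3).foreach(r => println(offer(r)))
    // With even odds that someone else takes the task and a doubling
    // multiplier, the round-dependent terms cancel.
    (0 to 3).foreach(r => println(expectedGain(6.0, 0.5, 2.0, r)))
  }
}
```

With `pAvail = 0.5` and `multiplier = 2.0` the expected gain stays at the base reward for every round, which is the slides' argument that waiting for the price to rise does not pay.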