Programming with People

Humans can perform many tasks with ease that remain difficult or impossible for computers. Crowdsourcing platforms like Amazon's Mechanical Turk make it possible to harness human-based computational power on an unprecedented scale. However, their utility as a general-purpose computational platform remains limited. The lack of complete automation makes it difficult to orchestrate complex or interrelated tasks. Scheduling human workers to reduce latency costs real money, and jobs must be monitored and rescheduled when workers fail to complete their tasks. Furthermore, it is often difficult to predict the length of time and payment that should be budgeted for a given task. Finally, the results of human-based computations are not necessarily reliable, both because human skills and accuracy vary widely, and because workers have a financial incentive to minimize their effort.

This talk presents AutoMan, the first fully automatic crowdprogramming system. AutoMan integrates human-based computations into a standard programming language as ordinary function calls, which can be intermixed freely with traditional functions. This abstraction allows AutoMan programmers to focus on their programming logic. An AutoMan program specifies a confidence level for the overall computation and a budget. The AutoMan runtime system then transparently manages all details necessary for scheduling, pricing, and quality control. AutoMan automatically schedules human tasks for each computation until it achieves the desired confidence level; monitors, reprices, and restarts human tasks as necessary; and maximizes parallelism across human workers while staying under budget.
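The scheduling loop described above can be illustrated with a minimal sketch. This is not AutoMan's actual implementation: `Worker`, `Outcome`, `run`, and `chanceOfAgreement` are hypothetical names, workers are simulated by plain functions, tasks are issued one at a time rather than in parallel, and the confidence test is a simplifying assumption (a union bound on the chance that random guessers agree). The starting price of 6 cents for 30 seconds and the doubling on timeout follow the figures quoted in the slide transcript.

```scala
object AutoManLoopSketch {
  // Hypothetical stand-in for a crowd worker: given a reward (cents) and a
  // time limit (seconds), returns Some(chosenOption), or None on timeout.
  type Worker = (Int, Int) => Option[Int]

  final case class Outcome(answer: Option[Int], spentCents: Int, answers: Vector[Int])

  // Binomial coefficient as a Double (adequate for the small n used here).
  def choose(n: Int, i: Int): Double =
    (1 to i).foldLeft(1.0)((acc, j) => acc * (n - i + j) / j)

  // Assumed confidence test: upper bound on the chance that at least m of n
  // random guessers pick the same one of k options (union bound over options).
  // For m == n this reduces to k * (1/k)^n, the unanimous-agreement formula.
  def chanceOfAgreement(k: Int, n: Int, m: Int): Double = {
    val p = 1.0 / k
    val tail = (m to n).map { i =>
      choose(n, i) * math.pow(p, i) * math.pow(1 - p, n - i)
    }.sum
    math.min(1.0, k * tail)
  }

  // Ask workers one by one until the leading answer is unlikely to be the
  // product of random guessing, or until workers or budget run out.
  def run(k: Int, confidence: Double, budgetCents: Int, workers: Iterator[Worker]): Outcome = {
    var rewardCents = 6   // starting price quoted on the slides
    var timeSeconds = 30  // starting time limit quoted on the slides
    var spent = 0
    var answers = Vector.empty[Int]
    var result: Option[Int] = None
    while (result.isEmpty && workers.hasNext && spent + rewardCents <= budgetCents) {
      workers.next()(rewardCents, timeSeconds) match {
        case Some(a) =>
          spent += rewardCents
          answers = answers :+ a
          // Find the most popular answer so far and its vote count.
          val (best, votes) =
            answers.groupBy(identity).map { case (opt, v) => (opt, v.size) }.maxBy(_._2)
          if (chanceOfAgreement(k, answers.size, votes) < 1 - confidence) result = Some(best)
        case None =>
          // Nobody completed the task in time: double both pay and time.
          rewardCents *= 2
          timeSeconds *= 2
      }
    }
    Outcome(result, spent, answers)
  }
}
```

With two options, 95% confidence, and simulated workers who all give the same answer, `run` stops after six answers at a cost of 36 cents, because 2 × (1/2)^6 = 1/32 is the first value below 5%.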

AutoMan is available for download at www.automan-lang.org.

Published in: Technology, Career

Programming with People

  1. Dan Barowy, Charlie Curtsinger, Emery Berger, Andrew McGregor. Programming with People: Integrating Human-Based & Digital Computation
  2. Computers: really good at some tasks… decoding human genome
  3. designing your next car…
  4. blood flow simulation…
  5. (science) blood flow simulation…
  6. Not so good at other tasks…
  7. “is this a giraffe?”
  8. isGiraffe(image) (not a real function!)
  9. Can we “implement” this function? isGiraffe( )
  10. Can we “implement” this function? isGiraffe( ) We could just ask people!
  11. isGiraffe( ) Find people via MTurk…
  12. http://mturk.com MTurk = Amazon’s “Mechanical Turk”
  13. A sample task.
  14. Original “Mechanical Turk”: 18th Century chess machine (!)
  15. The secret. Amazon’s service also looks like a computer but has people inside.
  16. Now we implement this function with MTurk workers… isGiraffe( ) → True
  17. Q1: How much is this task worth? isGiraffe( ) → True
  18. Q2: How much time should it take? isGiraffe( ) → True
  19. Q3: What if worker gets it wrong? isGiraffe( ) → False
  20. Why would a worker get it wrong? (False)
  21. Incompetent or lazy (“Homer”) (False)
  22. Bots (Internet + $ = Scammers) (False)
  23. Both just guessing answers.
  24. Or… evil genius: deliberately picks wrong answer. (False)
  25. No need to worry about evil geniuses: we are fine with a random adversary model
  26. random adversary model, i.e., Homer & Bender = OK.
  27. random adversary model, i.e., Homer & Bender = OK. We rule Dr. Evil out. Why?
  28. MTurk tracks approval rates and work records: long-term financial incentive
  29. Use of financial credentials makes it hard to gin up new accounts: credentials limit Sybil attacks
  30. isGiraffe( )? But how do we know any one person is not a Homer or a Bender?
  31. isGiraffe( ) Idea: vote. If both agree on the answer, we’re happy, right?
  32. isGiraffe( ) Not so much. Random = 50% chance of agreement. Pr[agree] = 1/2
  33. BUT: can dramatically reduce likelihood by increasing # of workers. Pr[agree] = 1/32 < 5%
  34. $\Pr[\text{agree}] = k \, (1/k)^n$ (unanimous agreement), k = # choices, n = # workers. (see paper for more details)
  35. ↑ choices, ↓ Pr[agree]. More choices = fewer people
  36. isGiraffe( )( ) AutoMan: DSL in Scala, runs on any JVM.
  37. isGiraffe( )( ) AutoMan programmer-specified: Total $ for computation
  38. isGiraffe( )( ) AutoMan programmer-specified: Total $ for computation; Confidence level (per function), 95% (p < 0.05)
  39. isGiraffe( ) US minimum wage, Adaptive doubling. AutoMan ties pay and time; Doubles both if no workers.
  40. isGiraffe( ) US minimum wage, Adaptive doubling. 30s, $0.06 ($7.25 / 120). Initially: all tasks 30 seconds.
  41. isGiraffe( ) US minimum wage, Adaptive doubling prevents gaming. 30s, $0.06 ($7.25 / 120); 60s, $0.12. No one shows up: both doubled.
  42. You might think this is exploitable: just wait for jobs to rise in price.
  43. Strategy fails when other people are around & grab money.
  44. Math: workers should never wait. Expected earnings after some # of rounds… $E[\text{gain}] = \text{base} \cdot (P_{\text{avail}})^{\text{round}} \cdot \text{multiplier}^{\text{round}}$
  45. Even odds somebody will take the money… $E[\text{gain}] = \text{base} \cdot (1/2)^{\text{round}} \cdot \text{multiplier}^{\text{round}}$
  46. Doubling increases wage after each round… $E[\text{gain}] = \text{base} \cdot (1/2)^{\text{round}} \cdot 2^{\text{round}}$
  47. Terms dependent on rounds cancel out. $E[\text{gain}] = \text{base}$
  48. $E[\text{gain}] = \text{base}$: no incentive to wait
  49. isGiraffe( ) → True, 95% confidence. AutoMan manages time, $, quality.
  50. How many giraffes are in this picture? k = 3 choices! AutoMan handles “radio button” questions
  51. How many giraffes are in this picture? k = 3 choices! Risk: Homer & Bender always guess
  52. How many giraffes are in this picture? k = 3 choices! E.g., always choose first option.
  53. How many giraffes are in this picture? k = 3 choices! To combat this, AutoMan randomizes answers.
  54. “Checkbox” questions. Which are from Sesame Street? Kermit the Frog, Spongebob Squarepants, Cookie Monster, The Count, Oscar the Grouch. ☐ ☐ ☐ ☐ ☐ 2^5 choices!
  55. Which are from Sesame Street? Kermit the Frog, Spongebob Squarepants, Cookie Monster, The Count, Oscar the Grouch. ☑ ☑ ☑ ☑ ☑ 2^5 choices! Same risk: random respondents
  56. Which are from Sesame Street? Kermit the Frog, Spongebob Squarepants, Cookie Monster, The Count, Oscar the Grouch. ☑ ☑ ☐ ☑ ☐ 2^5 choices! AutoMan checks each randomly
  57. Last question category: constrained free-text. What does this license plate say? $36^d$ choices! XXXXXX, i.e. [A-Z0-9]{6}: $36^6 = 2176782336$ choices
  58. Example real execution. Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawns 3 tasks @ $0.06; 30s work. Tasks: t1 t2 t3
  59. Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawns 3 tasks @ $0.06; 30s work. Tasks: t1 t2 t3. Times: 1m 50s
  60. Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawns 3 tasks @ $0.06; 30s work. Tasks: t1 t2 t3. Times: 1m 50s, 2m 30s
  61. Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawns 3 tasks @ $0.06; 30s work. Tasks: t1 t2 t3. Times: 1m 50s, 2m 30s, 2m 50s
  62. Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawns 3 tasks @ $0.06; 30s work. Tasks: t1 t2 t3. Times: 1m 50s, 2m 30s, 2m 50s. Inconclusive!
  63. Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawns 3 more tasks. Tasks: t1 t2 t3 t4 t5 t6
  64. Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawns 3 more tasks. Tasks: t1 t2 t3 t4 t5 t6. Times: 7m
  65. Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawns 3 more tasks. Tasks: t1 t2 t3 t4 t5 t6. Times: 7m, 18m 50s
  66. Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawns 3 more tasks. Tasks: t1 t2 t3 t4 t5 t6. Times: 7m, 18m 50s, 51m. Timeout: double pay and time
  67. Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawn 1 more task @ $0.12, 60s work. Tasks: t1 t2 t3 t4 t5 t6 t7
  68. Which one of these doesn’t belong? [95% conf.] AUTOMAN: spawn 1 more task @ $0.12, 60s work. Tasks: t1 t2 t3 t4 t5 t6 t7. 1h 9m 50s; cost = $0.36. AUTOMAN: 5 out of 6 ⇒ 95% confidence; return
  69. More sophisticated function: read_plate( )
  70. Success rate of real system! 12.2% [Maryland State Highway Administration]
  71. Easy under optimal conditions
  72. More complex in general
  73. “Difficult” set of plates
  74. Easier to read than CAPTCHAs!
  75. Real task as posted on MTurk
  76. Workflow: pictures to strings
  77. Actual AutoMan code:
        def is_car(img_url: String) = a.RadioButtonQuestion { q =>
          q.budget = 1.00
          q.confidence = 0.95
          q.text = "Is this a car?"
          q.image_url = img_url
          q.options = List(a.Option(yes, "Yes"), a.Option(no, "No"))
        }
  78. Actual AutoMan code:
        def get_plate_text(img_url: String) = a.FreeTextQuestion { q =>
          q.text = "What does this plate say?"
          q.image_url = img_url
          q.pattern = "XXXXXYYY"
        }
  79. Example execution (timeline figure, t1 to t8). “Is this a vehicle?”: tasks posted at $0.06; workers w1 to w5 all answer yes; unanimous agreement; result: yes. “What does the license plate say?”: tasks posted at $0.06; workers disagree (w6: 767JFK, w7: 767JKF); more tasks posted; one task times out and is reposted at $0.12; w8: 767JKF; one task cancelled; result: 767JKF.
  80. AutoMan evaluation. MediaLab LPR database, “extremely difficult” dataset, 144 plates. Accuracy: 91.6% (> 12.2%!). Average cost: 12.08 cents. Latency: < 2 minutes per image.
  81. AUTOMAN: Programming with People. www.automan-lang.org. Dan Barowy, Charlie Curtsinger, Emery Berger, Andrew McGregor. read_plate( ):
        def read_plate(url: String) = a.FreeTextQuestion { q =>
          q.text = "What does this plate say?"
          q.image_url = url
          q.pattern = "XXXXXYYY"
        }
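
The arithmetic quoted on the slides can be checked with a few lines of Scala. This is only a sketch for verification: the object and function names (`SlideArithmetic`, `prAgree`, `minWorkers`, `rewardCents`, `expectedGain`, `plateChoices`) are hypothetical and are not part of AutoMan's API. The formulas themselves come from the slides: the unanimous-agreement probability k(1/k)^n, the minimum-wage starting price, the expected gain under adaptive doubling, and the 36^d count of free-text answers.

```scala
object SlideArithmetic {
  // Chance that n random guessers unanimously agree on one of k options:
  // Pr[agree] = k * (1/k)^n.
  def prAgree(k: Int, n: Int): Double = k * math.pow(1.0 / k, n)

  // Smallest number of unanimous workers for which random agreement is
  // less likely than 1 - confidence.
  def minWorkers(k: Int, confidence: Double): Int =
    Iterator.from(1).find(n => prAgree(k, n) < 1 - confidence).get

  // Reward in whole cents for a task lasting `seconds` at an hourly wage
  // given in dollars, rounded to the nearest cent.
  def rewardCents(hourlyWage: Double, seconds: Int): Long =
    math.round(hourlyWage * 100 * seconds / 3600)

  // E[gain] = base * pAvail^round * multiplier^round: what a worker who
  // waits `round` rounds can expect if the task survives each round with
  // probability pAvail and the reward is multiplied each round.
  def expectedGain(base: Double, pAvail: Double, multiplier: Double, round: Int): Double =
    base * math.pow(pAvail, round) * math.pow(multiplier, round)

  // Number of strings matching [A-Z0-9]{d}.
  def plateChoices(d: Int): BigInt = BigInt(36).pow(d)

  def main(args: Array[String]): Unit = {
    println(prAgree(2, 2))                   // two workers, yes/no question
    println(prAgree(2, 6))                   // six workers, yes/no question
    println(minWorkers(2, 0.95))             // workers needed with 2 choices
    println(minWorkers(3, 0.95))             // workers needed with 3 choices
    println(rewardCents(7.25, 30))           // starting price for 30 seconds
    println(expectedGain(0.06, 0.5, 2.0, 5)) // waiting five rounds
    println(plateChoices(6))                 // six-character plates
  }
}
```

With even odds that someone else takes the task and a doubling multiplier, `expectedGain` returns the base reward for every round, which is the slides' point that waiting brings no benefit. `minWorkers` also shows why more choices mean fewer people: the required number of unanimous workers drops as k grows.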
