AWS re:Invent 2016: Leverage the Power of the Crowd To Work with Amazon Mechanical Turk (BDA204)

With Amazon Mechanical Turk (MTurk), you can leverage the power of the crowd for a host of tasks ranging from image moderation and video transcription to data collection and user testing. You simply build a process that submits tasks to the Mechanical Turk marketplace and get results quickly, accurately, and at scale. In this session, Russ from Rainforest QA shares best practices and lessons learned from his experience using MTurk. The session covers the key concepts of MTurk, getting started as a Requester, and using MTurk via the API. You learn how to set and manage Worker incentives, achieve great Worker quality, and how to integrate and scale your crowdsourced application. By the end of this session, you will have a comprehensive understanding of MTurk and know how to get started harnessing the power of the crowd.

  1. 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Russell Smith, co-founder/CTO/CIO, Rainforest QA November 2016 BDA204 Leverage the Power of the Crowd To Work with Amazon Mechanical Turk
  2. 2. What to Expect from the Session • Learn what Mechanical Turk (MTurk) is • Understand the basics • Learn about scaling beyond the basics • How Rainforest leverages MTurk
  3. 3. Who am I? Russell Smith • CTO & Co-Founder of Rainforest QA • Programmer • MTurk Requester for ~5 years • More than 250M questions through MTurk • Follow me on Twitter: @rhs
  4. 4. What is Rainforest? QA-as-a-Service: Fast Crowdsourced Testing for Web and Mobile Apps thanks to Mechanical Turk: • Customers write tests in plain English • Results in ~30 minutes, anytime, 24x7 • Powered by humans
  5. 5. What is Mechanical Turk? • One of the earliest AWS services • Public since 2005 • Conceived in 2001 • 24 x 7, on-demand, programmatic interface for Human Intelligence Tasks (HITs) • “Automate” the un-automatable
  6. 6. What is Mechanical Turk? • Pay (lots of) humans to do (lots of) things. Classic things: • Extract data from receipts • Identify things in photos • Search for data for you (find the phone number of XYZ restaurant) • Transcribe audio • More hip / upcoming things • Data science – build ground truth for machine learning and AI
  7. 7. Basics
  8. 8. Marketplace • Connects Workers and Requesters • Requesters: that’s you! • Web interface where Workers execute your tasks • Searchable list of HITs; Workers pick
  9. 9. Requester interface 1. Select a template 2. Provide info on your task and how much you want to pay. 3. Design the layout of your task 4. Load your variables 5. Publish
  10. 10. Requester interface - The results of your task can be viewed in the Manage tab. - This is also where you can view and manage your Workers.
  11. 11. Worker interface - Workers visit mturk.com to find HITs they want to work on. - Description, reward, and reputation all matter in determining if your work gets done.
  12. 12. Worker interface - Workers can choose to Accept a HIT or Skip to the next one in a set. - Once they’ve accepted the HIT they have until the allotted time has expired to Submit. - Workers can also Return the task if they decide they don’t want to complete it.
  13. 13. Basics - task design
  14. 14. Basics - Task design Design is critical: • Bad tasks = bad reputation + bad results • Unclear tasks = bad reputation + bad results • Good tasks ~= good reputation + good results
  15. 15. Basics - Task design My rules: 1. Have instructions and/or rules 2. Must be clear and easy to understand (note: not necessarily simple) 3. Must protect against mistakes or fraud 4. Have a fair price 5. Include a feedback field
  16. 16. Basics - Task design Ask: • Can the worker get in a groove and churn through tasks? • Can anyone read the instructions and do this right? • Do we need to qualify the workers?
  17. 17. Basics - Task design Pricing iteration 1. Work out a budget per assignment 2. Do a small run 3. Verify quality vs. speed* of results 4. Fix your task, optimize spend**, and go to 2 (repeat forever) * Qualifications, SEO, # of workers ** Payment, repetition, requirements
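The budgeting step above can be sketched in a few lines. This is an illustrative helper, not anything from the talk; the `fee_rate` of 20% is an assumed flat MTurk commission on top of Worker rewards, so check current pricing before relying on it.

```python
def estimate_run_cost(reward_per_assignment, assignments_per_hit,
                      num_hits, fee_rate=0.20):
    """Estimate total spend for a batch of HITs.

    fee_rate is the commission MTurk charges on top of Worker
    rewards (assumed flat 20% here; actual fees vary by task type).
    """
    worker_pay = reward_per_assignment * assignments_per_hit * num_hits
    return round(worker_pay * (1 + fee_rate), 2)

# A small pilot run: 50 HITs, 3 assignments each, $0.10 reward
print(estimate_run_cost(0.10, 3, 50))  # 18.0
```

Running a cheap pilot like this before scaling up is exactly what step 2 ("do a small run") is about: you learn whether the price clears the market before committing the full budget.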
  18. 18. Workers
  19. 19. Workers
  20. 20. Workers
  21. 21. Workers • Motivations • Earn money • Status • Incentives • Leveling up • Pride • Expectations • Traditionally: being treated like an API • Now: being treated like a human • Fairness, transparency
  22. 22. Workers • Lifecycle • Custom Qualifications / Training • Master Workers / Premium Qualifications
  23. 23. Community
  24. 24. Community - Retention is key - Finding the leaders - Worker enablement - Help Workers improve - We do: video tutorials, community forum, clear rules, automated training, re-training - Ask them what they need! - Listen to complaints - Add a comment box to your tasks to collect feedback - NPS
  25. 25. Community - Handling Workers that you don’t want doing your tasks - Rejecting - Qualifications - Blocking - Finding spammers and cheaters - Join the external forums - Your reputation matters
  26. 26. Intermediate
  27. 27. HITs - HITType - HIT - Assignments - Notifications [Slide diagram: one HITType groups many HITs; each HIT contains several Assignments; a “Reviewable” Notification fires once a HIT’s Assignments are all submitted]
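The HITType → HIT → Assignment hierarchy on this slide can be modelled as plain data structures. This is a sketch for orientation only; the class and field names are illustrative, not the MTurk API's own types.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Assignment:
    """One Worker's unit of work on a HIT."""
    worker_id: str
    answer: str = ""
    submitted: bool = False

@dataclass
class HIT:
    """A single task instance, answered by several Workers."""
    hit_id: str
    assignments: List[Assignment] = field(default_factory=list)

    def is_reviewable(self) -> bool:
        # MTurk sends a "Reviewable" notification once every
        # Assignment on the HIT has been submitted (or expired).
        return bool(self.assignments) and all(
            a.submitted for a in self.assignments)

@dataclass
class HITType:
    """Shared properties (title, reward, ...) for a group of HITs."""
    title: str
    reward: float
    hits: List[HIT] = field(default_factory=list)
```

Grouping HITs under a HITType matters in practice: Workers search by HITType, so many HITs under one type show up as a single large batch rather than scattered one-off tasks.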
  28. 28. Useful API operations • CreateHIT: Create new tasks for Workers to do. • GetAccountBalance: Check the funding available for publishing new tasks. • GrantQualification / RevokeQualification: Modify the Qualifications assigned to Workers. • ForceExpireHIT: Immediately remove a HIT from MTurk. • GetAssignment: Get the status and results of an Assignment. • NotifyWorkers: Send a message to your Workers. • GrantBonus: Pay a bonus to a Worker. • Use the Sandbox environment to experiment with creating and responding to HITs without spending money.
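A minimal sketch of calling these operations, using boto3's modern MTurk client (whose snake_case names differ slightly from the 2016-era operation names on the slide, e.g. `create_hit` for CreateHIT). The task title, reward, and defaults below are illustrative assumptions, and the `main` function needs AWS credentials to actually run; the Sandbox endpoint URL is the one AWS documents for free experimentation.

```python
SANDBOX_URL = "https://mturk-requester-sandbox.us-east-1.amazonaws.com"

def hit_kwargs(title, description, reward_usd, question_xml,
               max_assignments=3, lifetime_s=3600, duration_s=600):
    """Assemble keyword arguments for create_hit.

    Reward is passed as a string in USD; the other defaults here
    are illustrative, not recommendations.
    """
    return {
        "Title": title,
        "Description": description,
        "Reward": f"{reward_usd:.2f}",
        "Question": question_xml,
        "MaxAssignments": max_assignments,
        "LifetimeInSeconds": lifetime_s,
        "AssignmentDurationInSeconds": duration_s,
    }

def main():
    # Requires boto3 and configured AWS credentials; pointing the
    # client at the Sandbox means experiments cost nothing.
    import boto3
    mturk = boto3.client("mturk", endpoint_url=SANDBOX_URL)
    print(mturk.get_account_balance()["AvailableBalance"])
    params = hit_kwargs("Categorize an image",
                        "Pick the best label for one photo",
                        0.05, "<HTMLQuestion>...</HTMLQuestion>")
    resp = mturk.create_hit(**params)
    print(resp["HIT"]["HITId"])
```

Keeping parameter assembly separate from the client call, as here, also makes the expensive part easy to stub out in tests.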
  29. 29. Question types • QuestionForm – XML defined questions. • HTMLQuestion – HTML form based questions. • ExternalQuestion – Questions hosted on your own website.
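Of the three question types, HTMLQuestion is often the easiest to start with: you wrap ordinary HTML in a small XML envelope and pass it as the Question parameter of CreateHIT. A sketch of that envelope, using the 2011-11-11 HTMLQuestion schema:

```python
def html_question(html_body, frame_height=600):
    """Wrap raw HTML in the HTMLQuestion XML envelope that
    CreateHIT expects. frame_height is the height in pixels of
    the iframe shown to the Worker."""
    xmlns = ("http://mechanicalturk.amazonaws.com/"
             "AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd")
    return (f'<HTMLQuestion xmlns="{xmlns}">'
            f"<HTMLContent><![CDATA[{html_body}]]></HTMLContent>"
            f"<FrameHeight>{frame_height}</FrameHeight>"
            f"</HTMLQuestion>")

question = html_question(
    "<form><p>Is this image blurry?</p>"
    "<input type='submit' value='Yes'/></form>")
```

An ExternalQuestion works the same way at the envelope level but points at a URL you host, which is the route Requesters like Rainforest take when the task UI outgrows a static form.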
  30. 30. Review Policies - Review Policies can be specified in your CreateHIT call to automatically evaluate Worker submissions. - Assignment-level policies can be used to validate Worker responses against known answers. - HIT-level policies look for consensus amongst Workers on each HIT. • Imagine you want to ask six Workers and require 75% agreement. • If the first six answers are B B C B C B, agreement is only 4/6 (67%), so the policy adds additional Assignments; two more Bs bring it to 6/8 (75%) and the HIT reaches consensus.
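The consensus check a HIT-level policy performs can be sketched in a few lines. This is an illustrative reimplementation of the idea, not MTurk's actual policy engine:

```python
from collections import Counter

def consensus(answers, threshold=0.75):
    """Return the winning answer if at least `threshold` of the
    Workers agree on it, else None (meaning: extend the HIT with
    additional Assignments and check again)."""
    if not answers:
        return None
    answer, votes = Counter(answers).most_common(1)[0]
    return answer if votes / len(answers) >= threshold else None

# The slide's scenario: six answers, two disagree -> no consensus yet
print(consensus(["B", "B", "C", "B", "C", "B"]))            # None
# Two more Workers answer B -> 6/8 = 75% agreement
print(consensus(["B", "B", "C", "B", "C", "B", "B", "B"]))  # B
```

The loop your application runs is then: create the HIT with N Assignments, collect answers, and if `consensus` returns None, extend the HIT with more Assignments until it doesn't.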
  31. 31. How Rainforest QA Uses Mechanical Turk
  32. 32. Write tests, in plain English
  33. 33. Automatically trained testers • Fully automated training • Course + class-based • Automatic re-training • Always expanding • Per-customer training, for special situations
  34. 34. Super fast
  35. 35. Human results
  36. 36. Accurate human results, ML / AI backed
  37. 37. Scaling
  38. 38. Scaling - Rainforest v1 • Initially linked jobs to HITs 1:1 • Balanced a list of HITs against an internal list of jobs • Constantly pulled HITs on and off MTurk as jobs were added, cancelled, or changed
  39. 39. Scaling - Rainforest v2 • Decoupled jobs from HITs • Balanced a list of HITs against an internal list of jobs • Used Qualifications; still constantly pulling HITs on and off MTurk
  40. 40. Scaling - Rainforest v3 • Unbalanced jobs / HITs - no 1:1 ratio, allowing for more SEO and a higher chance of Workers finding us • Stopped using Qualifications
  41. 41. Questions
  42. 42. Thank you!
  43. 43. Remember to complete your evaluations!
