Best Practices for Mechanical Turk

AWS World Wide Public Sector Mechanical Turk workshop session 3: best practices for converting workflows to Mechanical Turk.

  • Welcome to the Crowdsourcing Best Practices portion of today’s workshop.
  • First I want to dispel some biases that tend to surface in enterprises new to crowdsourcing.
  • Everyone envisions third-world workers doing tasks for 10% of normal cost; that's not necessarily true. Task work will cost about the same with a crowd as it will with other options. Where the savings come in is the efficiency of the process: 100% utilization of human capital, no overhead, no fixed fees. These add up to large overall savings. Don't focus on getting the task done cheaply; focus on the process costs. That's where the savings present themselves.

    It's faster? Yes, but it's faster because it's scalable to meet demand. Work is done in parallel, at scalable levels, creating an environment where large volumes of tasks can get done in a shorter time thanks to the immediate availability of workers scaling to need.

    Finally, prospects always say it can't possibly be as accurate as in-house experts, but experience shows that when implemented with automation and best practices, it's actually more accurate. Most internal workflows are measured by sampling, which doesn't uncover the outliers and exceptions and is subject to sampling error. In fact, many customers don't really know the true accuracy of their current workflow. Automated crowdsourcing workflows provide a confidence score on every answer, giving you the metrics you need to measure and improve accuracy to maximum levels.
  • When people ask me how to think about crowdsourcing their workflows, or what should change in their thinking, I always come back to these three things:
    Consider the question: think about how you are disintegrating your work.
    Select the best workers for your work.
    And constantly improve.
  • So let's talk about the question. At Amazon it is our belief that the better you disintegrate the steps in your workflow, the closer you get to a binary question with one right answer, and the easier it is to crowdsource. The more the question requires context or interpretation, the more possibility you've created for error. Asking the right question, or series of questions, is the foundation of a successful crowdsourced implementation. Sometimes what you think is one question might actually be more than one. A minimal sketch of posting a binary question appears below.
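    As an illustration, here is a minimal sketch of posting one binary question as a HIT with boto3's MTurk client. The question text, title, reward, and assignment counts are hypothetical placeholders, not values from the talk.

```python
# Minimal sketch: posting one binary (yes/no) question as an MTurk HIT via boto3.
# The title, reward, and question text are hypothetical examples.
import boto3

mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    # Sandbox endpoint; drop endpoint_url to post to production.
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

# A yes/no question has one right answer, which makes results easy to adjudicate.
question_xml = """<?xml version="1.0" encoding="UTF-8"?>
<HTMLQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd">
  <HTMLContent><![CDATA[
    <!DOCTYPE html>
    <html><body>
      <form action="https://www.mturk.com/mturk/externalSubmit" method="post">
        <input type="hidden" name="assignmentId" value="ASSIGNMENT_ID_NOT_AVAILABLE"/>
        <p>Does this image contain a storefront?</p>
        <label><input type="radio" name="answer" value="yes"/> Yes</label>
        <label><input type="radio" name="answer" value="no"/> No</label>
        <input type="submit"/>
      </form>
    </body></html>
  ]]></HTMLContent>
  <FrameHeight>450</FrameHeight>
</HTMLQuestion>"""

hit = mturk.create_hit(
    Title="Is there a storefront in this image?",
    Description="Answer a single yes/no question about one image.",
    Keywords="image, yes/no, categorization",
    Reward="0.05",                    # US dollars, passed as a string
    MaxAssignments=3,                 # independent judgments for agreement
    AssignmentDurationInSeconds=300,
    LifetimeInSeconds=86400,
    Question=question_xml,
)
print(hit["HIT"]["HITId"])
```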
  • Consider cultural context: is it important to your task, or can you define your task well enough to eliminate it? Also, don't think in terms of skills like programming or accounting; think in terms of skills like recognition for transcribing poor handwriting, or expressiveness for keywording [tell story of me transcribing audio]. Establish the task type so that workers can self-select. Workers don't like to be wrong; they'll avoid tasks they aren't good at. Then, from the pool of workers choosing your tasks, find the better ones. A sketch of screening workers with qualification requirements follows.
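    One common way to narrow the worker pool on Mechanical Turk is a QualificationRequirement on the HIT. This sketch reuses the hypothetical create_hit call above and adds the documented system qualification for assignment approval rate; the 95% threshold is an illustrative assumption, not a recommendation from the talk.

```python
# Sketch: restricting a HIT to workers with a track record.
# The 95% approval-rate threshold is an illustrative assumption.
# "000000000000000000L0" is MTurk's documented system qualification ID
# for Worker_PercentAssignmentsApproved.
approval_rate_requirement = {
    "QualificationTypeId": "000000000000000000L0",
    "Comparator": "GreaterThanOrEqualTo",
    "IntegerValues": [95],
    "ActionsGuarded": "Accept",   # workers below the bar cannot accept the HIT
}

hit = mturk.create_hit(
    Title="Is there a storefront in this image?",
    Description="Answer a single yes/no question about one image.",
    Reward="0.05",
    MaxAssignments=3,
    AssignmentDurationInSeconds=300,
    LifetimeInSeconds=86400,
    Question=question_xml,
    QualificationRequirements=[approval_rate_requirement],
)
```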
  • Finally, establish results goals and key metrics; measure and iterate to improve.
  • What are the common key metrics? You might have additional ones, or different priorities, but these are common across our customers, and they are interrelated.
    Accuracy: what are you getting today, and what do you need? Accuracy comes at a cost, so be realistic. [story about customers often not knowing their true accuracy]
    Throughput: what are the process requirements, and what opportunity does improvement provide? Often the newfound speed of retrieving information opens the door to process improvements not considered in the base ROI. [tell CPG story]
    Cost: think of cost differently, as it impacts the other two. More judgments arrive at greater confidence, at greater overall task cost; higher rewards attract more workers and improve throughput; and so on. Remember, savings come in the efficiencies. In some cases we've seen that the task cost was actually higher than internal sources, but the efficiencies and speed provided significant business impact, negating the extra spent on tasks. A worked cost example follows.
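    To make that interplay concrete, here is a small worked example. The 10,000-item batch, the $0.05 reward, and the 10% requester fee are all hypothetical assumptions for illustration (MTurk's fee schedule has changed over time); the point is that total spend scales with the number of judgments you buy per item.

```python
# Hypothetical cost model: reward, judgments per item, and platform fee.
# The fee rate and all figures are assumptions for illustration only.
def batch_cost(items: int, reward: float, judgments: int, fee_rate: float = 0.10) -> float:
    """Total spend for a batch: items x judgments x reward, plus platform fee."""
    return items * judgments * reward * (1 + fee_rate)

# 10,000 items at $0.05 per judgment:
print(batch_cost(10_000, 0.05, judgments=1))  # 1 judgment each  -> 550.0
print(batch_cost(10_000, 0.05, judgments=3))  # 3 judgments each -> 1650.0
# Tripling judgments triples task spend, but raises confidence in each answer.
```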
  • I put cost third intentionally. While overall it is a key metric in almost all cases, it has many facets; here I'm simply focusing on things you can do to set the reward you pay to its optimal amount.
    Task ergonomics play a huge role in worker efficiency. That impacts throughput, as mentioned, but cost as well. Scrolling large windows, load times for data elements like videos and pictures: all of these cause workers to take extra steps or pause, costing time, and to them, their time is money.
    Finally, there's the sociological aspect of the task. Overall, workers like knowing the purpose, what you're trying to accomplish; that helps them understand how to answer. Workers are also attracted to fun tasks like reading tweets or looking at photos. I'm not saying only do fun tasks; I'm saying consider the boredom factor in pricing your tasks. Typical database cleansing in the marketplace pays a little better than photo moderation due to the boredom factor.
  • Although it can all be attributed to humans making mistakes, isolating and correcting the cause of the most common errors builds greater overall accuracy. Mistakes come in two forms: humans just making an error (human error), and what I've termed systematic error (commonly called ????). Systematic error is typically caused by things like poor instructions, ambiguous data, and unclear questions. By establishing a good sample workforce, like Mechanical Turk Masters, you can begin to test for and reduce systematic error. Look for outliers, meaning large levels of disagreement, and root-cause the specific tasks to see whether an improvement can eliminate them. A sketch of flagging high-disagreement tasks follows.
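    As one way to surface those outliers, the sketch below flags tasks where worker agreement falls under a threshold. The data shape and the 0.5 cutoff are hypothetical assumptions; low-agreement tasks are candidates for a systematic cause such as ambiguous data or unclear instructions.

```python
# Sketch: flag tasks whose independent judgments disagree heavily.
# Data shape and the 0.5 agreement cutoff are illustrative assumptions.
from collections import Counter

def agreement(answers: list[str]) -> float:
    """Fraction of judgments that match the most common answer."""
    most_common = Counter(answers).most_common(1)[0][1]
    return most_common / len(answers)

judgments = {  # task_id -> answers from independent workers
    "task-1": ["yes", "yes", "yes"],
    "task-2": ["yes", "no", "no"],
    "task-3": ["yes", "no", "maybe"],
}

flagged = [t for t, a in judgments.items() if agreement(a) < 0.5]
print(flagged)  # ['task-3'] -- review these for unclear instructions or bad data
```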
  • After solving for systematic error and having a clear picture of what to expect, you can now begin measuring your workers to see if some are better than others. Look for accuracy on known answers, using the known-answer API, and for high levels of agreement with other workers who have high gold-standard scores. Use that data to build a confidence score on each answer, establishing a key system metric to monitor. A sketch of scoring workers against gold-standard items follows.
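    Continuing the hypothetical example, this sketch scores each worker against gold-standard items and uses those scores to weight a per-answer confidence. The data, the weighting scheme, and all names are assumptions for illustration, not a Mechanical Turk API.

```python
# Sketch: per-worker gold-standard accuracy, then accuracy-weighted confidence.
# All data and the weighting scheme are illustrative assumptions.
from collections import defaultdict

gold = {"task-1": "yes", "task-2": "no"}  # items with known answers

# (worker_id, task_id, answer) triples from completed assignments
assignments = [
    ("w1", "task-1", "yes"), ("w1", "task-2", "no"),
    ("w2", "task-1", "yes"), ("w2", "task-2", "yes"),
    ("w1", "task-9", "yes"), ("w2", "task-9", "no"),
]

# Worker accuracy measured on gold items only.
hits, seen = defaultdict(int), defaultdict(int)
for worker, task, answer in assignments:
    if task in gold:
        seen[worker] += 1
        hits[worker] += answer == gold[task]
accuracy = {w: hits[w] / seen[w] for w in seen}  # {'w1': 1.0, 'w2': 0.5}

def confidence(task: str) -> dict[str, float]:
    """Accuracy-weighted vote share for each candidate answer on a task."""
    votes = defaultdict(float)
    for worker, t, answer in assignments:
        if t == task:
            votes[answer] += accuracy.get(worker, 0.5)  # unknown workers get 0.5
    total = sum(votes.values())
    return {a: v / total for a, v in votes.items()}

print(confidence("task-9"))  # {'yes': ~0.67, 'no': ~0.33} -- trust w1 more
```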
  • Response times are impacted by many factors. Initially, you're as new to the workers as they are to you, and how you establish that brand can impact your long-term throughput. Workers are looking for requesters with clearly defined tasks that they know they can do accurately, and that adjudicate fairly and pay quickly. Think in terms of worker efficiencies: you are ultimately paying workers for their time, and doing things that let them be more efficient saves them time, like something as simple as prepopulating a web search you want done. Finally, clarity of task impacts throughput: helping workers understand how you want the question answered and how to handle edge cases gives them greater confidence to answer correctly and avoid mistakes, thereby improving their desire to do the tasks. A sketch of prepopulating a search follows.
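    The prepopulated-search idea is easy to sketch: generate the query URL for the worker instead of asking them to type it. The search engine, field names, and task data below are hypothetical assumptions.

```python
# Sketch: prepopulate the web search a worker would otherwise type by hand.
# Search engine choice and task data are illustrative assumptions.
from urllib.parse import urlencode

def task_html(company: str, city: str) -> str:
    """Task snippet with a one-click, prefilled search for the worker."""
    query = urlencode({"q": f'"{company}" {city} official website'})
    url = f"https://www.google.com/search?{query}"
    return (
        f"<p>Find the official website for <b>{company}</b> ({city}).</p>"
        f'<p><a href="{url}" target="_blank">Open a prefilled search</a></p>'
        '<input type="text" name="website" placeholder="Paste the URL here"/>'
    )

print(task_html("Acme Widgets", "Dayton"))
```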
  • Best Practices for Mechanical Turk

    1. AWS Government, Education, and Nonprofits Symposium, Washington, DC | June 24, 2014 - June 26, 2014. Transformational Impact of Cloud Labor. John Hoskins & Daniel Gray, jhoskins@amazon.com, djgray@amazon.com
    2. Crowdsourcing Best Practices. Amazon Web Services
    3. Crowdsourcing myths
    4. The Myths[ ] • It’s cheaper – It’s actually more efficient • It’s faster – It’s actually more scalable • It’s not accurate – It’s actually more accurate
    5. Crowdsourcing Best Practices
    6. • Consider the question carefully – Workers answer what you ask • Select your workers – Perspective and skills vary • Iterate and Optimize – Adjust for optimal results
    7. The question[ ] You will get an answer to the question that you ask. Focus on asking the right question.
    8. Choosing your workers[ ] Workers are different, from language and cultural differences to varying skills. Test and monitor.
    9. Monitor and Improve[ ] Monitor key metrics; adjust and measure key attributes’ impact on those metrics.
    10. Key Metrics[ ] • Accuracy – Know your current • Throughput – Understand both turnaround and scale requirements • Cost – Measure against a budget, as cost can impact the other two. “Great service, Good food, Friendly staff – you can choose two”
    11. Cost[ ] Cost is impacted most by the efficiency of the other two metrics.
    12. Accuracy[ ] Error has two sources: human and systematic. Isolating human error and solving for systematic error gives a better chance for long-term success.
    13. Accuracy[ ] After solving for systematic error, choosing the best workers and monitoring those workers provides the next step toward high accuracy and lower costs.
    14. Throughput[ ] Many factors impact throughput: Reputation, Ergonomics, Clarity
    15. http://www.mturk.com John Hoskins, Amazon Mechanical Turk, hoskins@amazon.com. Amazon Web Services
    16. Cost[ ] Cost is impacted most by the efficiency of the other two metrics. Optimization of task and workers lowers both the cost of getting it done and of adjudicating a result.
    17. Thank You
