Best Practices for Mechanical Turk

AWS World Wide Public Sector Mechanical Turk workshop session 3: best practices for converting workflows to Mechanical Turk.

  • Welcome to the Crowdsourcing Best Practices portion of today’s workshop.
  • First I want to dispel some biases that tend to surface in enterprises new to crowdsourcing.
  • Everyone envisions third-world workers doing tasks for 10% of normal cost; that's not necessarily true. Task work will cost about the same with a crowd as it will with other options. Where the savings come in is the efficiency of the process: 100% utilization of human capital, no overhead, no fixed fees. These add up to large overall savings. Don't focus on getting the task done cheaply; focus on the process costs. That's where the savings present themselves.

    It's faster? Yes, but it's faster because it's scalable to meet demand. Work is done in parallel, at scalable levels, creating an environment where large volumes of tasks can get done in a shorter time thanks to the immediate availability of workers scaling to need.

    Finally, prospects always say it can't possibly be as accurate as in-house experts, but experience shows that when implemented with automation and best practices, it's actually more accurate. Most internal workflows are measured by sampling, which doesn't uncover the outliers and exceptions and is subject to sampling error. In fact, many customers don't really know the true accuracy of their current workflow. Automated crowdsourcing workflows provide a confidence score on every answer, giving you the metrics you need to measure and improve accuracy to maximum levels.
  • When people ask me how to think about crowdsourcing their workflows, or what should change in their thinking, I always come back to these three things:
    Consider the question: think about how you are disintegrating your work.
    Select the best workers for your work.
    And constantly improve.
  • So let's talk about the question. At Amazon it is our belief that the better you disintegrate the steps in your workflow, the closer you get to a binary question with one right answer, and the easier it is to crowdsource. The more the question requires context or interpretation, the more possibility you've created for error. Asking the right question, or series of questions, is the foundation of a successful crowdsourced implementation. Sometimes what you think is one question might actually be more than one. A minimal sketch of posting a binary question appears below.
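    As an illustration, here is a minimal sketch of posting one binary question as a HIT with boto3's MTurk client. The question text, title, reward, and assignment counts are hypothetical placeholders, not values from the talk.

```python
# Minimal sketch: posting one binary (yes/no) question as an MTurk HIT via boto3.
# The title, reward, and question text are hypothetical examples.
import boto3

mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    # Sandbox endpoint; drop endpoint_url to post to production.
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

# A yes/no question has one right answer, which makes results easy to adjudicate.
question_xml = """<?xml version="1.0" encoding="UTF-8"?>
<HTMLQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd">
  <HTMLContent><![CDATA[
    <!DOCTYPE html>
    <html><body>
      <form action="https://www.mturk.com/mturk/externalSubmit" method="post">
        <input type="hidden" name="assignmentId" value="ASSIGNMENT_ID_NOT_AVAILABLE"/>
        <p>Does this image contain a storefront?</p>
        <label><input type="radio" name="answer" value="yes"/> Yes</label>
        <label><input type="radio" name="answer" value="no"/> No</label>
        <input type="submit"/>
      </form>
    </body></html>
  ]]></HTMLContent>
  <FrameHeight>450</FrameHeight>
</HTMLQuestion>"""

hit = mturk.create_hit(
    Title="Is there a storefront in this image?",
    Description="Answer a single yes/no question about one image.",
    Keywords="image, yes/no, categorization",
    Reward="0.05",                    # US dollars, passed as a string
    MaxAssignments=3,                 # independent judgments for agreement
    AssignmentDurationInSeconds=300,
    LifetimeInSeconds=86400,
    Question=question_xml,
)
print(hit["HIT"]["HITId"])
```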
  • Consider cultural context: is it important to your task, or can you define your task well enough to eliminate it? Also, don't think in terms of skills like programming or accounting; think in terms of skills like recognition for transcribing poor handwriting, or expressiveness for keywording [tell story of me transcribing audio]. Establish the task type so that workers can self-select. Workers don't like to be wrong; they'll avoid tasks they aren't good at. Then, from the pool of workers choosing your tasks, find the better ones. A sketch of screening workers with qualification requirements follows.
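    One common way to narrow the worker pool on Mechanical Turk is a QualificationRequirement on the HIT. This sketch reuses the hypothetical create_hit call above and adds the documented system qualification for assignment approval rate; the 95% threshold is an illustrative assumption, not a recommendation from the talk.

```python
# Sketch: restricting a HIT to workers with a track record.
# The 95% approval-rate threshold is an illustrative assumption.
# "000000000000000000L0" is MTurk's documented system qualification ID
# for Worker_PercentAssignmentsApproved.
approval_rate_requirement = {
    "QualificationTypeId": "000000000000000000L0",
    "Comparator": "GreaterThanOrEqualTo",
    "IntegerValues": [95],
    "ActionsGuarded": "Accept",   # workers below the bar cannot accept the HIT
}

hit = mturk.create_hit(
    Title="Is there a storefront in this image?",
    Description="Answer a single yes/no question about one image.",
    Reward="0.05",
    MaxAssignments=3,
    AssignmentDurationInSeconds=300,
    LifetimeInSeconds=86400,
    Question=question_xml,
    QualificationRequirements=[approval_rate_requirement],
)
```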
  • Finally, establish results goals and key metrics; measure and iterate to improve.
  • What are the common key metrics? You might have additional ones, or different priorities, but these are common across our customers, and they are interrelated.
    Accuracy: what are you getting today, and what do you need? Accuracy comes at a cost, so be realistic. [story about customers often not knowing their true accuracy]
    Throughput: what are the process requirements, and what opportunity does improvement provide? Often the newfound speed of retrieving information opens the door to process improvements not considered in the base ROI. [tell CPG story]
    Cost: think of cost differently, as it impacts the other two. More judgments arrive at greater confidence, at greater overall task cost; higher rewards attract more workers and improve throughput; and so on. Remember, savings come in the efficiencies. In some cases we've seen that the task cost was actually higher than internal sources, but the efficiencies and speed provided significant business impact, negating the extra spent on tasks. A worked cost example follows.
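    To make that interplay concrete, here is a small worked example. The 10,000-item batch, the $0.05 reward, and the 10% requester fee are all hypothetical assumptions for illustration (MTurk's fee schedule has changed over time); the point is that total spend scales with the number of judgments you buy per item.

```python
# Hypothetical cost model: reward, judgments per item, and platform fee.
# The fee rate and all figures are assumptions for illustration only.
def batch_cost(items: int, reward: float, judgments: int, fee_rate: float = 0.10) -> float:
    """Total spend for a batch: items x judgments x reward, plus platform fee."""
    return items * judgments * reward * (1 + fee_rate)

# 10,000 items at $0.05 per judgment:
print(batch_cost(10_000, 0.05, judgments=1))  # 1 judgment each  -> 550.0
print(batch_cost(10_000, 0.05, judgments=3))  # 3 judgments each -> 1650.0
# Tripling judgments triples task spend, but raises confidence in each answer.
```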
  • I put cost third intentionally. While overall it is a key metric in almost all cases, it has many facets; here I'm simply focusing on things you can do to set the reward you pay to its optimal amount.
    Task ergonomics play a huge role in worker efficiency. That impacts throughput, as mentioned, but cost as well. Scrolling large windows, load times for data elements like videos and pictures: all of these cause workers to take extra steps or pause, costing time, and to them, their time is money.
    Finally, there's the sociological aspect of the task. Overall, workers like knowing the purpose, what you're trying to accomplish; that helps them understand how to answer. Workers are also attracted to fun tasks like reading tweets or looking at photos. I'm not saying only do fun tasks; I'm saying consider the boredom factor in pricing your tasks. Typical database cleansing in the marketplace pays a little better than photo moderation due to the boredom factor.
  • Although it can all be attributed to humans making mistakes, isolating and correcting the cause of the most common errors builds greater overall accuracy. Mistakes come in two forms: humans just making an error (human error), and what I've termed systematic error (commonly called ????). Systematic error is typically caused by things like poor instructions, ambiguous data, and unclear questions. By establishing a good sample workforce, like Mechanical Turk Masters, you can begin to test for and reduce systematic error. Look for outliers, meaning large levels of disagreement, and root-cause the specific tasks to see whether an improvement can eliminate them. A sketch of flagging high-disagreement tasks follows.
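    As one way to surface those outliers, the sketch below flags tasks where worker agreement falls under a threshold. The data shape and the 0.5 cutoff are hypothetical assumptions; low-agreement tasks are candidates for a systematic cause such as ambiguous data or unclear instructions.

```python
# Sketch: flag tasks whose independent judgments disagree heavily.
# Data shape and the 0.5 agreement cutoff are illustrative assumptions.
from collections import Counter

def agreement(answers: list[str]) -> float:
    """Fraction of judgments that match the most common answer."""
    most_common = Counter(answers).most_common(1)[0][1]
    return most_common / len(answers)

judgments = {  # task_id -> answers from independent workers
    "task-1": ["yes", "yes", "yes"],
    "task-2": ["yes", "no", "no"],
    "task-3": ["yes", "no", "maybe"],
}

flagged = [t for t, a in judgments.items() if agreement(a) < 0.5]
print(flagged)  # ['task-3'] -- review these for unclear instructions or bad data
```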
  • After solving for systematic error and having a clear picture of what to expect, you can now begin measuring your workers to see if some are better than others. Look for accuracy on known answers, using the known-answer API, and for high levels of agreement with other workers who have high gold-standard scores. Use that data to build a confidence score on each answer, establishing a key system metric to monitor. A sketch of scoring workers against gold-standard items follows.
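    Continuing the hypothetical example, this sketch scores each worker against gold-standard items and uses those scores to weight a per-answer confidence. The data, the weighting scheme, and all names are assumptions for illustration, not a Mechanical Turk API.

```python
# Sketch: per-worker gold-standard accuracy, then accuracy-weighted confidence.
# All data and the weighting scheme are illustrative assumptions.
from collections import defaultdict

gold = {"task-1": "yes", "task-2": "no"}  # items with known answers

# (worker_id, task_id, answer) triples from completed assignments
assignments = [
    ("w1", "task-1", "yes"), ("w1", "task-2", "no"),
    ("w2", "task-1", "yes"), ("w2", "task-2", "yes"),
    ("w1", "task-9", "yes"), ("w2", "task-9", "no"),
]

# Worker accuracy measured on gold items only.
hits, seen = defaultdict(int), defaultdict(int)
for worker, task, answer in assignments:
    if task in gold:
        seen[worker] += 1
        hits[worker] += answer == gold[task]
accuracy = {w: hits[w] / seen[w] for w in seen}  # {'w1': 1.0, 'w2': 0.5}

def confidence(task: str) -> dict[str, float]:
    """Accuracy-weighted vote share for each candidate answer on a task."""
    votes = defaultdict(float)
    for worker, t, answer in assignments:
        if t == task:
            votes[answer] += accuracy.get(worker, 0.5)  # unknown workers get 0.5
    total = sum(votes.values())
    return {a: v / total for a, v in votes.items()}

print(confidence("task-9"))  # {'yes': ~0.67, 'no': ~0.33} -- trust w1 more
```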
  • Response times are impacted by many factors. Initially, you're as new to the workers as they are to you, and how you establish that brand can impact your long-term throughput. Workers are looking for requesters with clearly defined tasks that they know they can do accurately, and that adjudicate fairly and pay quickly. Think in terms of worker efficiencies: you are ultimately paying workers for their time, and doing things that let them be more efficient saves them time, like something as simple as prepopulating a web search you want done. Finally, clarity of task impacts throughput: helping workers understand how you want the question answered and how to handle edge cases gives them greater confidence to answer correctly and avoid mistakes, thereby improving their desire to do the tasks. A sketch of prepopulating a search follows.
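    The prepopulated-search idea is easy to sketch: generate the query URL for the worker instead of asking them to type it. The search engine, field names, and task data below are hypothetical assumptions.

```python
# Sketch: prepopulate the web search a worker would otherwise type by hand.
# Search engine choice and task data are illustrative assumptions.
from urllib.parse import urlencode

def task_html(company: str, city: str) -> str:
    """Task snippet with a one-click, prefilled search for the worker."""
    query = urlencode({"q": f'"{company}" {city} official website'})
    url = f"https://www.google.com/search?{query}"
    return (
        f"<p>Find the official website for <b>{company}</b> ({city}).</p>"
        f'<p><a href="{url}" target="_blank">Open a prefilled search</a></p>'
        '<input type="text" name="website" placeholder="Paste the URL here"/>'
    )

print(task_html("Acme Widgets", "Dayton"))
```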
  • Best Practices for Mechanical Turk

    1. AWS Government, Education, and Nonprofits Symposium, Washington, DC | June 24, 2014 - June 26, 2014. Transformational Impact of Cloud Labor. John Hoskins & Daniel Gray, jhoskins@amazon.com, djgray@amazon.com
    2. Crowdsourcing Best Practices. Amazon Web Services
    3. Crowdsourcing myths
    4. The Myths[ ] • It’s cheaper – It’s actually more efficient • It’s faster – It’s actually more scalable • It’s not accurate – It’s actually more accurate
    5. Crowdsourcing Best Practices
    6. • Consider the question carefully – Workers answer what you ask • Select your workers – Perspective and skills vary • Iterate and Optimize – Adjust for optimal results
    7. The question[ ] You will get an answer to the question that you ask. Focus on asking the right question.
    8. Choosing your workers[ ] Workers are different, from language and cultural differences to varying skills. Test and monitor.
    9. Monitor and Improve[ ] Monitor key metrics; adjust and measure key attributes’ impact on those metrics.
    10. Key Metrics[ ] • Accuracy – Know your current • Throughput – Understand both turnaround and scale requirements • Cost – Measure against a budget, as cost can impact the other two. “Great service, Good food, Friendly staff – you can choose two”
    11. Cost[ ] Cost is impacted most by the efficiency of the other two metrics.
    12. Accuracy[ ] Error has two sources: human and systematic. Isolating human error and solving for systematic error gives a better chance for long-term success.
    13. Accuracy[ ] After solving for systematic error, choosing the best workers and monitoring those workers provides the next step toward high accuracy and lower costs.
    14. Throughput[ ] Many factors impact throughput: Reputation, Ergonomics, Clarity
    15. http://www.mturk.com John Hoskins, Amazon Mechanical Turk, hoskins@amazon.com. Amazon Web Services
    16. Cost[ ] Cost is impacted most by the efficiency of the other two metrics. Optimization of task and workers lowers both the cost of getting it done and of adjudicating a result.
    17. Thank You
