Hands On: Amazon Mechanical Turk - M. Acosta - ESWC SS 2014


  1. Amazon Mechanical Turk Hands-on session. Maribel Acosta
  2. Amazon Mechanical Turk hands-on session
  3. MTurk Basic Concepts (1)
     • Requester: creates and submits tasks to the platform.
     • Worker: person who solves the tasks.
     • Human Intelligence Task (HIT): work unit.
     Source: https://requester.mturk.com/tour/how_it_works
  4. MTurk Basic Concepts (2)
     • Project: HIT HTML + HIT metadata
     • The elements that stay the same in every HIT are called the template
     • The data that vary from HIT to HIT are specified via variables
     • Variables: allow creating several HITs in the project
     • NOTE: If no variables are specified in the project, a single HIT is created
     (Diagram: a project containing HIT 1 to HIT 4.)
  5. MTurk Basic Concepts (3)
     • Batch: group of HITs created by instantiating the variable(s) of a project
     • The values of the variables are specified in (CSV, TSV) files:
       – Each column corresponds to a variable
       – Each row is an instance -> HIT
       – Each file corresponds to a batch
     • We can create several batches for the same project
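The mapping from a batch file to HITs can be sketched in a few lines of Python. This is only a local illustration, not how MTurk itself processes uploads. The column names are taken from the variables of the SPARQL query on slide 15; the two data rows are hypothetical.

```python
import csv
import io

# Hypothetical batch file: the header row names the variables,
# and every following row is one instance, i.e. one HIT.
BATCH_CSV = """s_name,s_wikipage,o
Berlin,http://en.wikipedia.org/wiki/Berlin,http://www.berlin.de
Paris,http://en.wikipedia.org/wiki/Paris,http://www.paris.fr
"""

def read_batch(text):
    """Return one dict per row, mapping variable name to value."""
    return list(csv.DictReader(io.StringIO(text)))

hits = read_batch(BATCH_CSV)
print(len(hits))            # number of HITs in this batch
print(hits[0]["s_name"])    # value of the variable s_name in the first HIT
```

A second file with the same header would give a second batch for the same project.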
  6. MTurk Basic Concepts (4)
     • HIT: work unit. The same HIT can be solved by 1 or more workers (assignments)
     • Assignments: the number of workers who should solve the exact same HIT
     • Questions: a single HIT may contain one or several questions
     (Diagram: Project > Batch > HIT 1 to HIT 4; HIT 1 has assignments 1 to 3 and questions Q1, Q2.)
  7. MTurk Basic Concepts (5)
     Total cost of the project = No. of HITs x No. of Assignments x (Reward per HIT + Fee)
     (Diagram: Project > Batch > HIT 1 to HIT 4; HIT 1 has assignments 1 to 3 and questions Q1, Q2.)
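The cost formula above can be checked with a small Python sketch. The function name and the example numbers are hypothetical; the fee is treated as a fixed amount per assignment, which is an assumption about how to read "Fee" on the slide.

```python
from decimal import Decimal

def total_cost(num_hits, num_assignments, reward, fee):
    """Total = HITs x assignments per HIT x (reward + fee per assignment)."""
    return num_hits * num_assignments * (Decimal(reward) + Decimal(fee))

# Hypothetical project: 300 HITs, 3 assignments each,
# $0.05 reward and $0.01 fee per assignment.
print(total_cost(300, 3, "0.05", "0.01"))  # 54.00
```

Decimal is used instead of float so that small currency amounts do not pick up rounding noise.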
  8. MTurk Basic Concepts (6)
     Example of Human Intelligence Tasks (HITs)
     • Projects can be broken into smaller tasks called HITs
     • A HIT represents a single work unit
     Project: Tagging (describing) 900 images
       HIT: Create tags for image X1
       HIT: Create tags for image X2
       HIT: …
     No. of HITs = 900
  9. MTurk Basic Concepts (7)
     Example of Human Intelligence Tasks (HITs)
     • Projects can be broken into smaller tasks called HITs
     • A HIT represents a single work unit
     Project: Tagging (describing) 900 images
       HIT: Create tags for images X1, X2, X3
       HIT: Create tags for images X4, X5, X6
       HIT: …
     Several questions in a single HIT! No. of HITs = 300
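The arithmetic behind the two examples (900 HITs with one image each, or 300 HITs with three images each) can be sketched as follows. The helper name is hypothetical, and the image labels X1 to X900 follow the slides.

```python
def group_into_hits(items, per_hit):
    """Split the items into consecutive groups; each group becomes one HIT."""
    return [items[i:i + per_hit] for i in range(0, len(items), per_hit)]

images = [f"X{i}" for i in range(1, 901)]   # X1 ... X900

single = group_into_hits(images, 1)
triple = group_into_hits(images, 3)

print(len(single))   # 900
print(len(triple))   # 300
print(triple[1])     # ['X4', 'X5', 'X6']
```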
  10. MTurk Basic Concepts (8)
      When creating a project or individual HITs, the HIT properties must be specified:
      • General information: includes the title and description of the HIT, as well as keywords which are used by workers for searching HITs
      • HIT duration time: time allotted to solve the HIT (before it is given to another worker)
      • HIT life time: how long the HIT will be available on the platform
      • # Assignments: number of different persons that will perform the exact same HIT
      • Reward: payment for correctly solving each assignment
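For readers who use the API rather than the web interface, the properties above map onto arguments of the `create_hit` operation. The sketch below assumes the boto3 MTurk client, which is not mentioned in the slides; the helper name and the example values are hypothetical. It only builds the argument dictionary and makes no network call.

```python
def hit_properties(title, description, keywords, duration_s, lifetime_s,
                   assignments, reward_usd):
    """Map the HIT properties from the slide onto create_hit argument names."""
    return {
        "Title": title,
        "Description": description,
        "Keywords": ", ".join(keywords),            # one comma-separated string
        "AssignmentDurationInSeconds": duration_s,  # HIT duration time
        "LifetimeInSeconds": lifetime_s,            # HIT life time
        "MaxAssignments": assignments,              # number of assignments
        "Reward": reward_usd,                       # a string, in US dollars
    }

props = hit_properties(
    title="Verify an external link",
    description="Decide whether a web page is relevant for an entity",
    keywords=["link", "verification"],
    duration_s=600,
    lifetime_s=7 * 24 * 3600,
    assignments=3,
    reward_usd="0.05",
)
print(props["Keywords"])  # link, verification
```

With a client at hand, the dictionary would be passed as `client.create_hit(**props, Question=question_xml)`, where `question_xml` stands for the task definition.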
  11. MTurk Workflow for Requesters
      Project Creation & Design -> HIT Test (Sandbox) -> HIT Publication -> Workers solve the HITs -> Review of the results -> Completed project
      (Diagram labels: "Production site"; review outcomes "reject" and "all assignments accepted".)
  12. MTurk Sandbox
      The Sandbox is a simulated MTurk environment to test HITs.
      • Log in as requester: preview and test the interface of your HITs
        – https://requestersandbox.mturk.com
      • Log in as worker: solve your own HITs to test their functionality and result output
        – https://workersandbox.mturk.com
      • Best practice: Always test your HITs (as requester and worker) before publishing them in the production site
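When working through the API, the same sandbox/production split is selected by the endpoint URL. The sketch below assumes the boto3 MTurk client, which the slides do not cover, and the function names are hypothetical. Defaulting to the sandbox follows the best practice stated above.

```python
SANDBOX = "https://mturk-requester-sandbox.us-east-1.amazonaws.com"
PRODUCTION = "https://mturk-requester.us-east-1.amazonaws.com"

def requester_endpoint(sandbox=True):
    """Return the requester API endpoint; the sandbox is the default."""
    return SANDBOX if sandbox else PRODUCTION

def make_client(sandbox=True):
    """Create an MTurk client (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the rest of the sketch runs without it
    return boto3.client(
        "mturk",
        region_name="us-east-1",
        endpoint_url=requester_endpoint(sandbox),
    )

print(requester_endpoint())
```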
  13. Managing HITs in MTurk
      There are three different mechanisms to manage your HITs in MTurk:
      • API
      • Command Line Tools
      • Web Interface
  14. MTURK WEB INTERFACE
  15. Hands On!
      • Project: Crowdsourcing DBpedia triples to verify the links to external web pages
      • Query, run against http://dbpedia.org/sparql:
          prefix dbpedia-ont: <http://dbpedia.org/ontology/>
          prefix foaf: <http://xmlns.com/foaf/0.1/>
          SELECT * WHERE {
            ?s dbpedia-ont:wikiPageExternalLink ?o ;
               foaf:name ?s_name ;
               foaf:isPrimaryTopicOf ?s_wikipage .
          } LIMIT 200
      • Triple to crowdsource: ?s dbpedia-ont:wikiPageExternalLink ?o
      • Triples to build the UI: foaf:name and foaf:isPrimaryTopicOf
      • Data file: MTurkDemo/data/sparql.csv
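One way to obtain a CSV file like the one named on the slide is to send the query to the endpoint over the standard SPARQL protocol and ask for CSV results. This is a sketch under assumptions: the slides do not say how the file was produced, the function names are hypothetical, and `save_results` needs network access.

```python
import urllib.parse
import urllib.request

ENDPOINT = "http://dbpedia.org/sparql"

QUERY = """
PREFIX dbpedia-ont: <http://dbpedia.org/ontology/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT * WHERE {
  ?s dbpedia-ont:wikiPageExternalLink ?o ;
     foaf:name ?s_name ;
     foaf:isPrimaryTopicOf ?s_wikipage .
} LIMIT 200
"""

def build_request(query, endpoint=ENDPOINT):
    """Build a GET request that asks the endpoint for CSV results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(url, headers={"Accept": "text/csv"})

def save_results(path, query=QUERY):
    """Run the query and write the raw CSV response to path."""
    with urllib.request.urlopen(build_request(query)) as resp, \
            open(path, "wb") as out:
        out.write(resp.read())

req = build_request(QUERY)
print(req.full_url[:40])
```

Calling `save_results("sparql.csv")` would then write a file whose header row lists the query variables, ready to be used as batch input.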
  16. Hands On!
      • Go to MTurk Sandbox as a requester:
        – https://requestersandbox.mturk.com/
      • Click on Sign In
        – Email address: own_tp@gmx.li
        – Password: sourcrowd
      • Now we are at “home”
  17. 1. Creating a Project
      Different predefined templates: Select “other”
  18. 2. Setting up the HIT Properties (1)
      HIT description
  19. 2. Setting up the HIT Properties (2)
      HIT properties
      Very IMPORTANT: Set up quality mechanisms. Masters are selected by default.
  20. 3. Selecting Qualifications
      Worker requirements (filters)
      Very IMPORTANT: Set up quality mechanisms. Masters are selected by default.
      • Masters expect higher rewards
      • MTurk charges 20% for Masters
  21. 4. Defining the Task
      WYSIWYG HTML editor
  22. 4. Defining the Task (with Variables)
      • Template: elements that stay the same in every HIT
      • Variables: data that will vary from HIT to HIT. They are denoted as follows: ${var_name}
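The substitution of `${var_name}` placeholders can be tried out locally. Python's `string.Template` happens to use the same placeholder syntax, so it serves as a stand-in here; it is not MTurk's own template engine. The HTML fragment and the example row are hypothetical, while the variable names come from the query on slide 15.

```python
from string import Template

# Hypothetical task template: the text stays the same in every HIT,
# the ${...} variables are filled in from one row of the input file.
TEMPLATE = Template(
    '<p>Is <a href="${o}">${o}</a> a relevant external link for '
    '<a href="${s_wikipage}">${s_name}</a>?</p>'
)

def render_hit(row):
    """Fill the template with one row; raises KeyError if a variable is missing."""
    return TEMPLATE.substitute(row)

row = {
    "s_name": "Berlin",
    "s_wikipage": "http://en.wikipedia.org/wiki/Berlin",
    "o": "http://www.berlin.de",
}
html = render_hit(row)
print(html)
```

Because `substitute` fails on a missing variable, a mismatch between the template and the input file's header shows up before anything is published.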
  23. 5. Previewing the Template
      This is what the workers will see. The variables will be replaced by the input data.
  24. 6. Creating Batches
  25. 7. Previewing the HITs
      Variables are replaced with the data from the input file
  26. 8. Publishing the HITs
      Summary of the project:
      • # of HITs
      • Rewards
      • Total payment
      • Account balance
  27. 9. Retrieving the Results
  28. MTURK SUMMARY
  29. Project/HIT Creation & Design (1)
      • The requester is able to create projects or individual HITs
      • Build user-friendly interfaces (using web technologies)
      • Then, the HIT properties must be specified:
        – General information: includes the title and description of the HIT, as well as keywords which are used by workers for searching HITs.
        – HIT duration time: time allotted to solve the HIT (before it is given to another worker).
        – HIT life time: how long the HIT will be available on the platform.
        – # Assignments: number of different persons that will perform the same HIT.
        – Reward: payment for correctly solving each assignment.
  30. Project/HIT Creation & Design (2)
      • Selection of MTurk quality control mechanisms (worker requirements):
        – High quality workers
          • Masters
          • Photo moderation masters
          • Categorization masters
        – System qualifications
          • Location by country
          • HIT submission rate (%)
          • HIT approval/rejection rate (%)
          • (Absolute) number of HITs approved
        – Qualification types
          • Simply granted or attributed via customized tests
      • These filters are automatically performed by the platform
      • Masters expect higher rewards
      • MTurk charges 20% for Masters
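Through the API, system qualifications such as location and approval rate are expressed as a list of requirement dictionaries. The sketch below assumes the boto3 `create_hit` argument `QualificationRequirements`; the three type IDs are the system qualification IDs as listed in the MTurk API documentation and should be checked against the current documentation before use. The function name and the thresholds are hypothetical.

```python
# System qualification type IDs (assumed from the MTurk API docs; verify).
LOCALE = "00000000000000000071"
PERCENT_APPROVED = "000000000000000000L0"
NUM_HITS_APPROVED = "00000000000000000040"

def worker_requirements(country, min_approval_pct, min_hits_approved):
    """Build worker filters: location, approval rate, number of approved HITs."""
    return [
        {
            "QualificationTypeId": LOCALE,
            "Comparator": "EqualTo",
            "LocaleValues": [{"Country": country}],
        },
        {
            "QualificationTypeId": PERCENT_APPROVED,
            "Comparator": "GreaterThanOrEqualTo",
            "IntegerValues": [min_approval_pct],
        },
        {
            "QualificationTypeId": NUM_HITS_APPROVED,
            "Comparator": "GreaterThanOrEqualTo",
            "IntegerValues": [min_hits_approved],
        },
    ]

reqs = worker_requirements("US", 95, 100)
print(len(reqs))  # 3
```

The resulting list would be passed as `QualificationRequirements=reqs` when creating a HIT; the platform then applies the filters itself.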
  31. HIT Test
      • Best practice: Always test your HITs before publishing them
        1. Perform technical tests (both as requester and worker) in the MTurk Sandbox environment. Source: https://requester.mturk.com/developer/sandbox
        2. Publish a small subset of tasks in the production site to test usability and responsiveness.
  32. Run live HITs
      • HIT publication:
        – Make the HITs available to the workers
      • Review the results:
        – Monitor the submitted assignments constantly
        – Download the results
        – Accept/reject assignments, provide feedback when rejecting
        – Block spammers (optional)
      • Update HIT/Project:
        – Extend/expire HITs or modify other HIT properties
        – Add additional assignments
  33. Lessons Learned
      • Introduce yourself on Worker forums (regular requester)
      • Be responsive to workers
        – Reply to emails with questions about tasks
      • Use monitoring tools:
        – Forums
        – Turkopticon (Source: http://turkopticon.differenceengines.com/)
  34. Choosing the Right Tool
      Source: https://requestersandbox.mturk.com/tour/choose_the_right_tool
  35. IS THERE MORE THAN MTURK?
  36. Amazon Mechanical Turk hands-on session
  37. CrowdFlower Platform
      • Client: creates and submits jobs (MTurk = requester)
      • Contributor: person who solves the jobs (MTurk = worker)
      • Job: work unit (MTurk = task)
      (Diagram: client -> CrowdFlower -> CrowdFlower contributors.)
  38. Why CrowdFlower? (1)
      Neat UI
  39. Why CrowdFlower? (2)
      Quality control mechanisms
      Allows for easily creating a “gold standard”, which is further used to detect low quality workers
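The idea of a gold standard can be illustrated without any platform: questions with known answers are mixed into the work, and each worker's accuracy on them is compared with a threshold. The sketch below is a generic illustration, not CrowdFlower's actual mechanism; the function names, the data, and the 0.7 threshold are hypothetical.

```python
def accuracy_on_gold(answers, gold):
    """Share of gold questions a worker answered correctly, or None if none seen."""
    judged = [q for q in answers if q in gold]
    if not judged:
        return None
    correct = sum(answers[q] == gold[q] for q in judged)
    return correct / len(judged)

def low_quality_workers(all_answers, gold, threshold=0.7):
    """Return the workers whose accuracy on gold questions is below threshold."""
    flagged = []
    for worker, answers in all_answers.items():
        acc = accuracy_on_gold(answers, gold)
        if acc is not None and acc < threshold:
            flagged.append(worker)
    return sorted(flagged)

gold = {"q1": "yes", "q2": "no"}
all_answers = {
    "w1": {"q1": "yes", "q2": "no", "q3": "yes"},  # q3 is not a gold question
    "w2": {"q1": "no", "q2": "yes"},
}
print(low_quality_workers(all_answers, gold))  # ['w2']
```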
  40. Why CrowdFlower? (3)
      Report generation and analytics
  41. Why NOT CrowdFlower?
      • At the beginning, clients must wait until their projects are approved by the CrowdFlower staff before publishing them
        – Wait time: from a couple of hours up to (5)* days
      • Jobs must be specified in a non-standard language:
        – CML: CrowdFlower Markup Language
      • There are certain configurations that cannot be executed in the platform
      *Personal experience of the presenter
  42. References
      • AMT. Getting Started Guide. API Version 2012-03-25. http://s3.amazonaws.com/awsdocs/MechTurk/latest/amt-gsg.pdf
      • The Mechanical Turk Blog. http://mechanicalturk.typepad.com/
      • MTurk Java API. http://people.csail.mit.edu/glittle/MTurkJavaAPI/
      • CrowdFlower Platform. http://crowdflower.com
