Getting Started with Mechanical Turk
Emily Tucker Prud’hommeaux
June 15, 2010
  1. 1. Getting Started with Mechanical Turk Emily Tucker Prud’hommeaux June 15, 2010
  2. 2. Outline 1. Overview of Mechanical Turk concept. 2. Creating and funding your account. 3. Using the GUI. • Designing your tasks. • Submitting your tasks. • Reviewing and approving your results. 4. Getting fancy with the GUI: audio and video. 5. Using the command line tools. 6. Getting fancy with the command line: external pages.
  4. 4. Mechanical Turk, a.k.a. MTurk
      What is Mechanical Turk?
      • Then: A chess-playing “robot” -- actually a guy in a box.
      • Now: A service run by Amazon.com that allows people worldwide to do work or answer questions for you.
  5. 5. Mechanical Turk Terminology • Requester: You, the person asking the questions. • Workers (or Turkers): The people answering your questions. • Human Intelligence Task (HIT): The question or set of questions you want them to answer. • Reward: How much you pay a Worker for a HIT.
  6. 6. MTurk vs. Traditional Methods
      Mechanical Turk | Traditional Methods
      Many workers answer a few questions in a short period. | Few subjects answer lots of questions over a long period.
      Not a lot of interaction -- may be hard to explain task. | Tons of interaction -- lots of opportunity to explain things.
      Who are these people?!? | You know your subjects.
      Very cheap, and you don’t have to pay if they do a bad job. | Not so cheap, and you have to pay the people anyway.
      Quality control is tricky. | Quality control is not so hard.
      Less opportunity for bias on the part of the experimenter. | More opportunity for bias.
  7. 7. Outline 1. Overview of Mechanical Turk concept. 2. Creating and funding your account. 3. Using the GUI. • Designing your tasks. • Submitting your tasks. • Reviewing and approving your results. 4. Getting fancy with the GUI: audio and video. 5. Using the command line tools. 6. Getting fancy with the command line: external pages.
  8. 8. Creating Your Accounts
      1. Create an Amazon Mechanical Turk Requester account. You need this to use Mechanical Turk.
         https://requester.mturk.com/mturk/beginsignin
      2. (Optional) Create an Amazon Web Services (AWS) account. You need this to be able to use the command line tools and possibly for some other things:
         https://aws-portal.amazon.com/gp/aws/developer/registration/index.html
  9. 9. Funding Your Account
  10. 10. Funding Your Account
  11. 11. Outline 1. Overview of Mechanical Turk concept. 2. Creating and funding your account. 3. Using the GUI. • Setting up your first experiment. • Submitting your tasks. • Reviewing and approving your results. 4. Getting fancy with the GUI: audio and video. 5. Using the command line tools. 6. Getting fancy with the command line: external pages.
  12. 12. Creating a HIT 1. Click the Design tab
  13. 13. Select a Template Letʼs try Data Collection 2. Select a HIT template.
  14. 14. Enter Properties
      • Don’t give people too much time.
      • Other criteria can be helpful (e.g., must live in US). Amazon displays your HIT only to the people who meet the criteria.
      • Reward: usually just a few cents, unless it’s really long.
      • Be brief but descriptive.
  15. 15. Design Layout Click here to edit the HTML. Ah, much better!
  16. 16. Design Layout Input data variables. You’ll upload a CSV file containing their values. Format them this way and MTurk will interpret them for you. This is how worker responses get stored, just like a regular old HTML form, which you already know all about. Hint: If you want some specific type of HTML form input (e.g., radio buttons, drop down menu, checkbox), look at the Blank Template template.
  17. 17. Preview and Finish Recall: we will upload a CSV file to fill in these blanks for each HIT.
  18. 18. Publishing Your HIT
  19. 19. Create and Upload CSV File You create the CSV file on your computer and upload it here. It will look something like this for this example. name,address,phone Bread and Ink,3600 SE Hawthorne,503-555-1212 Three Doors Down,1415 SE 38th,503-555-1213 Cha cha cha!,3375 SE Hawthorne,503-555-1214
  20. 20. Preview your HIT The ${name}, ${phone}, and ${address} variables got filled with the values from your CSV file.
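To make the variable filling concrete, here is a minimal sketch of the substitution idea. It is not how MTurk itself is implemented: `fillTemplate` and `parseSimpleCsv` are hypothetical helpers, the template sentence is made up, and the parser assumes plain CSV with no quoted fields, which is enough for the restaurant example.

```javascript
// Hypothetical helper: replace each ${column} placeholder with the value
// from one CSV row. Unknown placeholders are left untouched.
function fillTemplate(template, row) {
  return template.replace(/\$\{(\w+)\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(row, key) ? row[key] : match
  );
}

// Hypothetical helper: parse simple, unquoted CSV into one object per row,
// keyed by the column names in the header line.
function parseSimpleCsv(text) {
  const [header, ...lines] = text.trim().split("\n");
  const columns = header.split(",");
  return lines.map((line) => {
    const values = line.split(",");
    return Object.fromEntries(columns.map((c, i) => [c, values[i]]));
  });
}

// The CSV file from the example slide.
const csv = [
  "name,address,phone",
  "Bread and Ink,3600 SE Hawthorne,503-555-1212",
  "Three Doors Down,1415 SE 38th,503-555-1213",
  "Cha cha cha!,3375 SE Hawthorne,503-555-1214",
].join("\n");

// Made-up template text; only the ${...} variable names come from the slides.
const template = "Is ${name} located at ${address}? Is the phone number ${phone}?";

// One filled-in question per CSV row, i.e. one HIT per row.
const hits = parseSimpleCsv(csv).map((row) => fillTemplate(template, row));
console.log(hits.join("\n"));
```

Each data row of the CSV produces one HIT, so a three-row file yields three HITs.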
  21. 21. Confirm and Publish
  22. 22. Manage HITs and Results
  23. 23. Review and Download Results Approve or reject that worker’s work. Download results to your computer. You get to process your results file however you like -- open it in Excel or write a program to make it look nice.
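If you go the "write a program" route, something like the following sketch could be a starting point. The column names (`WorkerId`, `Input.name`, `Answer.verified`) and the sample rows are made up for illustration; check the header row of your own results file. The parser handles quoted fields and doubled quotes but not line breaks inside a field.

```javascript
// Split one CSV line into fields, honoring double-quoted fields and "" escapes.
function splitCsvLine(line) {
  const fields = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        field += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ",") {
      fields.push(field);
      field = "";
    } else {
      field += ch;
    }
  }
  fields.push(field);
  return fields;
}

// Turn the whole file into one object per result row, keyed by column name.
function parseResults(text) {
  const lines = text.trim().split(/\r?\n/);
  const columns = splitCsvLine(lines[0]);
  return lines.slice(1).map((line) => {
    const values = splitCsvLine(line);
    return Object.fromEntries(columns.map((c, i) => [c, values[i]]));
  });
}

// Made-up sample data standing in for a downloaded results file.
const sample = [
  '"WorkerId","Input.name","Answer.verified"',
  '"W1","Bread and Ink","yes"',
  '"W2","Bread and Ink","yes"',
  '"W3","Cha cha cha!","no"',
].join("\n");

// Count how often each answer was given for each input item.
const tally = {};
for (const row of parseResults(sample)) {
  const item = row["Input.name"];
  const answer = row["Answer.verified"];
  tally[item] = tally[item] || {};
  tally[item][answer] = (tally[item][answer] || 0) + 1;
}
console.log(tally);
```

Tallying answers per item like this is one simple way to spot items where workers disagree before you approve or reject anything.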
  24. 24. Outline 1. Overview of Mechanical Turk concept. 2. Creating and funding your account. 3. Using the GUI. • Designing your tasks. • Submitting your tasks. • Reviewing and approving your results. 4. Getting fancy with the GUI: audio and video. 5. Using the command line tools. 6. Getting fancy with the command line: external pages.
  25. 25. Including Audio without Flash
      • For audio, you can convert your wavs to mp3, put them on the web, have the links to the mp3s be your variables in the CSV file, then force the links to open in a new window.
      • If you want something more reliable, embed the audio in a Flash player, which I am about to describe.
      • If you need more control (e.g., you want to prevent the worker from listening to the wav file more than once), you might need to use something fancier like Javascript.
      CSV file:
      audiofile1,audiofile2
      http://etucker.com/a1.mp3,http://etucker.com/a2.mp3
      Template HTML:
      <a target="_blank" href="${audiofile1}">Audio1</a>
  26. 26. Including Audio with Flash
      • If you donʼt want the audio to open in a new window, embed the audio in a Flash player.
      • I use the Google audio Flash player, which works well and has nice controls.
      • The html will look something like this:
        <embed src="http://www.google.com/reader/ui/3523697345-audio-player.swf" flashvars="audioUrl=${audiofile}" width="400" height="27" quality="best" type="application/x-shockwave-flash"></embed>
      • The input file will look something like this:
        audiofile
        http://www.csee.ogi.edu/mechturk/audio1.mp3
        http://www.csee.ogi.edu/mechturk/audio2.mp3
        http://www.csee.ogi.edu/mechturk/audio3.mp3
        http://www.csee.ogi.edu/mechturk/audio4.mp3
  27. 27. Including Video • For videos, I have been using Flash. • Flash works reliably in all browsers (when it doesnʼt crash them or take up the whole CPU) and everyone has it. • If a lot of Workers start using iPads, this might not be a good solution. • Itʼs all super easy, so why am I presenting this? • Because it took me so long to find the best tools and figure out the best way to do the HTML so that it would work in MTurk and in all browsers.
  28. 28. Video with Flash: Preparation 1. Convert your videos to .flv format. I have used FLVCrunch: http://download.cnet.com/FLV-Crunch/3000-2194_4-10909295.html 2. Get a Flash player. I have used the free JW Player: http://www.longtailvideo.com/players/jw-flv-player 3. Put both the player components (as described in the JW Player instructions) and your .flv videos on the internet somewhere. Sean created this directory for me on the csee.ogi.edu servers: /vol0/projects/www/CSE/public_html_noredirect/mech which you can access on the web with this URL: http://www.csee.ogi.edu/mech
  29. 29. Video with Flash: MTurk Part
      4. Include your videos as variables in your CSV file like this:
         video1,video2
         http://www.csee.ogi.edu/mech/player.swf?file=http://www.csee.ogi.edu/mech/video/myawesomevideo1.flv,http://www.csee.ogi.edu/mech/player.swf?file=http://www.csee.ogi.edu/mech/video/myawesomevideo2.flv
      5. In the template for your HIT, include a line like this for each video you want to include in that HIT:
         <embed height="300" width="300" src="${video1}" name="player1" id="player1"></embed>
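Typing those long player URLs by hand is error-prone, so you may prefer to generate the CSV. The sketch below assumes the layout from the example (player.swf directly under the base URL, videos under video/); `playerUrl` and `buildVideoCsv` are hypothetical helper names, and file names containing commas would need quoting that this sketch does not do.

```javascript
// Assumed layout, taken from the example above: the player sits directly
// under BASE and the .flv files live in BASE/video/.
const BASE = "http://www.csee.ogi.edu/mech";

// Hypothetical helper: the value of one CSV cell for a given video file.
function playerUrl(videoFile) {
  return `${BASE}/player.swf?file=${BASE}/video/${videoFile}`;
}

// Hypothetical helper: one CSV row per HIT, each row holding the two
// videos shown in that HIT as the video1 and video2 variables.
function buildVideoCsv(pairs) {
  const rows = pairs.map((pair) => pair.map(playerUrl).join(","));
  return ["video1,video2", ...rows].join("\n");
}

const videoCsv = buildVideoCsv([["myawesomevideo1.flv", "myawesomevideo2.flv"]]);
console.log(videoCsv);
```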
  30. 30. Outline 1. Overview of Mechanical Turk concept. 2. Creating and funding your account. 3. Using the GUI. • Setting up your first experiment. • Submitting your tasks. • Reviewing and approving your results. 4. Getting fancy with the GUI: audio and video. 5. Using the command line tools. 6. Getting fancy with the command line: external pages.
  31. 31. Command Line Tools: Why?
      Instead of using the GUI to set up your MTurk experiment, you can use command line tools. Advantages:
      1. Approval/rejection process is easier when you have lots of data from lots of workers.
      2. More power to manage workers: block workers, set qualifications for workers.
      3. Possible to change properties for HIT already in progress.
      4. Can use the sandbox to try out your experiments.
      5. With external pages, much more flexibility in what kind of web stuff you can do, like Javascript.
  32. 32. Command Line Tools: Basics
      1. Download and install command line tools from here:
         http://developer.amazonwebservices.com/connect/entry.jspa?externalID=694
      2. Sign up for an AWS account, if you didnʼt before:
         https://aws-portal.amazon.com/gp/aws/developer/registration/index.html
      3. Associate your installation with your AWS identifiers.
         a) Find your identifiers:
            http://s3.amazonaws.com/mturk/tools/pages/aws-access-identifiers/aws-identifier.html
         b) Enter those identifiers in the bin/mturk.properties file:
            access_key=[Your AWS Access Key]
            secret_key=[Your Secret Key]
  33. 33. Command Line Tools: Documentation There is some good documentation for the Mechanical Turk command line tools: 1. The UserGuide.html that comes with the tools: definitely use it to get started with everything. 2. The samples directory: • Anything youʼd like to do with the command line tools is pretty easy to figure out just by copying the samples... • ...except setting up an external page, which is poorly documented, which is why that is our next topic.
  34. 34. External Pages
      • Get started using the samples/external_page directory in your command line tools installation.
        -rw-r--r-- 1 emtucker emtucker  119 Apr 24 2008 external_hit.input
        -rw-r--r-- 1 emtucker emtucker  619 Apr 24 2008 external_hit.properties
        -rw-r--r-- 1 emtucker emtucker  621 Feb 8 22:59 external_hit.question
        -rw-r--r-- 1 emtucker emtucker 2218 Apr 24 2008 externalpage.htm
        -rwxr-xr-x 1 emtucker emtucker  667 Apr 24 2008 approveAndDeleteResults.sh
        -rwxr-xr-x 1 emtucker emtucker  705 Apr 24 2008 getResults.sh
        -rwxr-xr-x 1 emtucker emtucker  671 Apr 24 2008 reviewResults.sh
        -rwxr-xr-x 1 emtucker emtucker  799 Apr 24 2008 run.sh
      • external_hit.input: This is like the input file you used with the GUI, but tab separated instead of comma separated.
      • external_hit.properties: Title, description, reward, qualifications, time allotted, what your input variables are called.
      • external_hit.question: Link to external page plus how to get your input variables into your page. More on this shortly.
      • externalpage.htm: The external web page itself. More on this shortly.
      • *.sh: All of the pre-written scripts for submitting your HITs, downloading the results, and approving/rejecting the work.
  35. 35. Data Files external_hit.input external_hit.properties external_hit.question
  36. 36. external_hit.question
      http://www.csee.ogi.edu/page.html?id1=${helper.urlencode($id1)}&sent1=${helper.urlencode($sent1)}
      The URL to your external page, wherever you decide to put it.
      The helper.urlencode bit is how MTurk puts the values of your input variables (which it gets from the .input file) into the URL for the page for each HIT. Then, in your external web page, you’ll use Javascript (or something else of your choice) to read these items out of the URL in order to use them in your page where you need them.
      MTurk also automatically inserts the AssignmentID variable into the URL. That is, if a worker accepts the HIT, a unique Assignment ID will be created and included in the URL. You will have to use that information when you post the results to MTurk in your external page.
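To picture what the worker's browser ends up requesting, here is a rough sketch of the URL construction. This is an illustration only, not MTurk's code: `buildExternalUrl` is a hypothetical function, the input values are made up, the exact escaping done by helper.urlencode may differ from `encodeURIComponent`, and the sketch assumes the Assignment ID arrives in a parameter named `assignmentId`.

```javascript
// Hypothetical illustration of how the per-HIT URL is assembled:
// input variables are URL-encoded and appended as query parameters,
// followed by the Assignment ID (assumed parameter name: assignmentId).
function buildExternalUrl(pageUrl, inputs, assignmentId) {
  const query = Object.entries(inputs)
    .map(([name, value]) => `${name}=${encodeURIComponent(value)}`)
    .join("&");
  return `${pageUrl}?${query}&assignmentId=${encodeURIComponent(assignmentId)}`;
}

// Made-up input values for the id1 and sent1 variables.
const hitUrl = buildExternalUrl(
  "http://www.csee.ogi.edu/page.html",
  { id1: "17", sent1: "Is this a good sentence?" },
  "EXAMPLE_ASSIGNMENT_ID"
);
console.log(hitUrl);
```

The encoding step matters because input values such as sentences contain spaces and punctuation that are not allowed to appear raw in a URL.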
  37. 37. The External Page Needs to have a few important things: • Javascript (or other) code for extracting the values of your input variables out of the URL. • Javascript (or other) code for accessing the Assignment ID and for posting the workerʼs responses to MTurk. This is all included in the externalpage.htm file in the samples/external_page directory of the command line tools installation. The example external page is very helpful, but poorly commented.
  38. 38. External Web Page: Javascript code for extracting URL parameters.
  39. 39. External Web Page: Javascript code for using extracted URL parameters. This part is very important! The worker must accept the hit before being able to complete it. Be sure to include this (or something like it) in your external page.
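The slides show this code only as screenshots, so here is a minimal sketch of the same two jobs: reading the input variables out of the URL and checking whether the worker has accepted the HIT. It is not the code from externalpage.htm. `readHitParams` is a hypothetical function, `id1` and `sent1` are the example variable names from external_hit.question, and the sketch assumes MTurk passes `assignmentId=ASSIGNMENT_ID_NOT_AVAILABLE` while a worker is only previewing.

```javascript
// Value assumed to be sent as the assignment ID while the worker is
// previewing the HIT and has not accepted it yet.
const PREVIEW_ID = "ASSIGNMENT_ID_NOT_AVAILABLE";

// Hypothetical helper: pull the input variables and the Assignment ID
// out of the page URL. In a browser you would pass window.location.href.
function readHitParams(pageUrl) {
  const params = new URL(pageUrl).searchParams; // decodes %20 etc. for us
  const assignmentId = params.get("assignmentId");
  return {
    id1: params.get("id1"),
    sent1: params.get("sent1"),
    assignmentId,
    // Only let the worker submit once the HIT has actually been accepted.
    accepted: assignmentId !== null && assignmentId !== PREVIEW_ID,
  };
}

const previewing = readHitParams(
  "http://www.csee.ogi.edu/page.html?id1=17&sent1=Is%20this%20a%20good%20sentence%3F&assignmentId=ASSIGNMENT_ID_NOT_AVAILABLE"
);
const working = readHitParams(
  "http://www.csee.ogi.edu/page.html?id1=17&sent1=Is%20this%20a%20good%20sentence%3F&assignmentId=EXAMPLE_ASSIGNMENT_ID"
);
console.log(previewing.accepted, working.accepted, working.sent1);
```

In the real page you would typically disable the submit button and show a "please accept the HIT first" message whenever `accepted` is false.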
  40. 40. Command Line Tools: Sandbox
      • Good idea to try out your experiments in the sandbox.
      • Sandbox lets you see exactly how your HIT will look to potential workers.
      1. In your bin/mturk.properties file, comment out this line:
         #service_url=http://mechanicalturk.amazonaws.com/?Service=AWSMechanicalTurkRequester
         and uncomment this line:
         service_url=http://mechanicalturk.sandbox.amazonaws.com/?Service=AWSMechanicalTurkRequester
      2. In your external html page, replace references to
         http://www.mturk.com/mturk/externalSubmit
         with
         http://workersandbox.mturk.com/mturk/externalSubmit
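Rather than editing the submit URL by hand each time you switch, the external page could pick it from a single flag. This is only a sketch: `submitUrl` and the flag are hypothetical, and the two URLs are the ones given on the slide.

```javascript
// The two submit endpoints named on the slide.
const SUBMIT_URLS = {
  production: "http://www.mturk.com/mturk/externalSubmit",
  sandbox: "http://workersandbox.mturk.com/mturk/externalSubmit",
};

// Hypothetical helper: choose where the results form should post,
// based on one flag you flip when moving from testing to the real run.
function submitUrl(useSandbox) {
  return useSandbox ? SUBMIT_URLS.sandbox : SUBMIT_URLS.production;
}

console.log(submitUrl(true));
```

You still need to switch service_url in bin/mturk.properties separately; this only covers the external page side.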
  41. 41. Lots of Other Topics • Using command line tools to interact more closely with workers, design ways of determining who is a good worker and recruiting those workers, banning specific workers. • Using the Amazon Mechanical Turk SDK. • Practical concerns: What kinds of projects can you do with Mechanical Turk? Are some projects better carried out with traditional methods? • How much money do we save using Mechanical Turk? Sometimes it might be cheaper and easier to use a few carefully chosen local workers, or even people currently employed at OGI.