  1. Turkalytics: Analytics for Human Computing. Paul Heymann, Hector Garcia-Molina, Department of Computer Science, Stanford University. Presented by Chen Liu, 14/10/2011
  2. Outline
       • Introduction
       • Interaction and Data Models
       • Implementation
       • Requester Usage
       • Results
       • Conclusion
  3. Mechanical Turk: Mechanical Turk is a crowdsourcing Internet marketplace where people contribute human intelligence to tasks that computers are unable to do.
  4. https://www.mturk.com/
  5. Workers get paid to answer
  6. Requesters pay to ask stuff
  7. Motivation
       • Pricing? Quality? Reputation?
  8. Motivation
       • Pricing? Quality? Reputation?
       • We need DATA!
  9. Motivation
       • Pricing? Quality? Reputation?
       • We need DATA!
       • This paper describes a prototype system that can be embedded in different human computation systems to gather analytics. It also illustrates the system's use and gives some initial findings on observable worker behavior.
  10. Outline
       • Introduction
       • Interaction and Data Models
       • Implementation
       • Results
       • Conclusion
  11. Interaction Model
       • Different marketplaces call for different interactions
       • Simple model (SPA)
       • Mechanical Turk extensions (SCRAP)
  12. Simple Model: Search-Preview-Accept (SPA) [state diagram; entry point labeled "Start from here"]
  13. Mechanical Turk Extensions: Search-Continue-RapidAccept-Accept-Preview (SCRAP) [state diagram; entry point labeled "Start from here"]
  14. Data Model
  15. Data Model: Activity Tables, User Tables, Task Tables
  16. Implementation
       • Client-Side JavaScript
       • Log Server
       • Analysis Server
  17. Implementation
       • Client-Side JavaScript: A requester on Mechanical Turk usually creates a HIT (task) based on a URL. The URL corresponds to an HTML page with a form that the worker completes. To embed Turkalytics, requesters add a small snippet of HTML to that page. The snippet in turn includes JavaScript code that tracks details about workers as they complete the HIT.
  18. Implementation
       • Log Server: The log server is an extremely simple web application built on Google’s App Engine. It receives logging events from clients running ta.js and saves them to a data store. In addition to saving the events themselves, the log server also records HTTP data like IP address, user agent, and referer.
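
As a rough illustration of the log server described on this slide, the sketch below accepts a JSON event over HTTP POST and stores it together with HTTP metadata (IP address, user agent, referer). It is not the Turkalytics code: it uses Python's standard WSGI interface rather than App Engine, and `EVENT_STORE`, `build_record`, and `log_app` are hypothetical names, with an in-memory list standing in for the data store.

```python
import json
import time
from typing import Any, Dict, List

# Hypothetical in-memory stand-in for the App Engine data store.
EVENT_STORE: List[Dict[str, Any]] = []


def build_record(event: Dict[str, Any], environ: Dict[str, Any]) -> Dict[str, Any]:
    """Combine a client event with HTTP metadata taken from the request."""
    return {
        "event": event,
        "ip": environ.get("REMOTE_ADDR", ""),
        "user_agent": environ.get("HTTP_USER_AGENT", ""),
        "referer": environ.get("HTTP_REFERER", ""),
        "received_at": time.time(),
    }


def log_app(environ, start_response):
    """Minimal WSGI app: accept one JSON event per POST and store it."""
    if environ.get("REQUEST_METHOD") != "POST":
        start_response("405 Method Not Allowed", [("Content-Type", "text/plain")])
        return [b"POST only\n"]

    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = environ["wsgi.input"].read(length)
    try:
        event = json.loads(body)
    except ValueError:
        # Covers malformed JSON as well as undecodable bytes.
        start_response("400 Bad Request", [("Content-Type", "text/plain")])
        return [b"invalid JSON\n"]

    EVENT_STORE.append(build_record(event, environ))
    start_response("204 No Content", [])
    return [b""]
```

To try it locally, the app could be served with `wsgiref.simple_server.make_server("", 8000, log_app).serve_forever()`.
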
  19. Implementation
       • Analysis Server: The analysis server periodically polls the log server for new events. These events are then inserted into a PostgreSQL database, where they are processed by a network of triggers.
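
The poll, insert, and trigger flow on this slide might look roughly like the following. This is a sketch under assumptions, not the authors' implementation: it uses SQLite from Python's standard library as a stand-in for PostgreSQL, a single illustrative trigger rather than a network of them, and hypothetical names (`raw_events`, `worker_event_counts`, `poll_once`, `fetch_events`). `fetch_events` stands in for whatever HTTP call retrieves new events from the log server.

```python
import json
import sqlite3
from typing import Any, Callable, Dict, List

# Hypothetical schema: one raw table plus one derived table kept
# up to date by a trigger (SQLite here, standing in for PostgreSQL).
SCHEMA = """
CREATE TABLE raw_events (
    id INTEGER PRIMARY KEY,
    worker_id TEXT NOT NULL,
    event_type TEXT NOT NULL,
    payload TEXT NOT NULL
);
CREATE TABLE worker_event_counts (
    worker_id TEXT PRIMARY KEY,
    n_events INTEGER NOT NULL
);
CREATE TRIGGER count_worker_events AFTER INSERT ON raw_events
BEGIN
    INSERT OR IGNORE INTO worker_event_counts (worker_id, n_events)
    VALUES (NEW.worker_id, 0);
    UPDATE worker_event_counts
    SET n_events = n_events + 1
    WHERE worker_id = NEW.worker_id;
END;
"""

Event = Dict[str, Any]


def poll_once(
    conn: sqlite3.Connection,
    fetch_events: Callable[[int], List[Event]],
    last_id: int,
) -> int:
    """Fetch events with id > last_id, insert them, and return the new cursor."""
    events = fetch_events(last_id)
    with conn:  # one transaction per poll; the trigger fires per inserted row
        for ev in events:
            conn.execute(
                "INSERT INTO raw_events (id, worker_id, event_type, payload) "
                "VALUES (?, ?, ?, ?)",
                (ev["id"], ev["worker_id"], ev["type"], json.dumps(ev)),
            )
    return max((ev["id"] for ev in events), default=last_id)
```

A periodic poller would call `poll_once` in a loop with a sleep between iterations, carrying the returned cursor forward so that each event is inserted only once.
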
  20. Results
  21. Results
  22. Results
  23. Results
  24. Conclusion
       • Turkalytics is a tool for gathering data about workers completing human computation tasks.
       • It is designed as part of a broader system, in particular a system implementing the Human Processing model.
       • Turkalytics enables both code sharing among systems (systems need not reimplement worker monitoring code) and data sharing among systems (requesters benefit from data gathered from other requesters).
       • The system was scalable to more than 100,000 requests/day. We also verified previous demographic data about Mechanical Turk workers and presented some findings about location and interaction that are unique to our tool.