Turkalytics: Analytics for Human Computing Paul Heymann, Hector Garcia-Molina Department of Computer Science Stanford University Presented by Chen Liu 14/10/2011
Outline• Introduction• Interaction and Data Models• Implementation• Requester Usage• Results• Conclusion
Mechanical Turk Mechanical Turk is a crowdsourcing Internet marketplace in which requesters use human intelligence to perform tasks that computers are unable to do.
Workers get paid to answer
Requesters pay to ask questions
Motivation• Pricing? Quality? Reputation?• We need DATA!• This paper describes a prototype system that can be embedded across different human computation systems to gather analytics. It also illustrates the system's use and gives some initial findings on observable worker behavior.
Outline• Introduction• Interaction and Data Models• Implementation• Results• Conclusion
Interaction Model• Different marketplaces call for different interactions • Simple model (SPA) • Mechanical Turk extensions (SCRAP)
Simple Model: Search-Preview-Accept (SPA) [State diagram; the start state is marked "Start from here".]
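The SPA model can be read as a small state machine. The sketch below is only an illustration: the diagram itself is not reproduced in this text, so the transition table and the choice of `search` as the start state are assumptions, and the names `SPA_TRANSITIONS` and `is_valid_session` are hypothetical.

```python
# Assumed transitions for the Search-Preview-Accept (SPA) model.
# The slide's diagram is not available here, so this table is a guess
# at a plausible reading, not the authors' definition.
SPA_TRANSITIONS = {
    "search": {"search", "preview"},   # keep searching, or open a task preview
    "preview": {"search", "accept"},   # go back to searching, or accept the task
    "accept": {"search"},              # after working on the task, look for another
}


def is_valid_session(states, transitions=SPA_TRANSITIONS, start="search"):
    """Return True if the sequence starts at `start` and every step is allowed."""
    if not states or states[0] != start:
        return False
    return all(nxt in transitions.get(cur, set())
               for cur, nxt in zip(states, states[1:]))


ok = is_valid_session(["search", "preview", "accept", "search"])
skipped_preview = is_valid_session(["search", "accept"])
```

Under these assumptions, a session that jumps from searching straight to accepting is rejected, which is exactly the kind of interaction the SCRAP extensions would have to add as extra states and transitions.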
Mechanical Turk Extensions: Search-Continue-RapidAccept-Accept-Preview (SCRAP) [State diagram; the start state is marked "Start from here".]
Data Model• Activity Tables• User Tables• Task Tables
Implementation• Log Server The log server is an extremely simple web application built on Google’s App Engine. It receives logging events from clients running ta.js and saves them to a data store. In addition to saving the events themselves, the log server also records HTTP data like IP address, user agent, and referer.
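As a rough illustration of what the log server does with each request, the sketch below stores an event together with the HTTP metadata named on the slide. It is not App Engine code and not the authors' implementation: the function `record_event`, the JSON payload shape, and the in-memory list used as a data store are all assumptions made for the example.

```python
import json
import time


def record_event(store, body, headers, remote_addr):
    """Save one logging event plus the HTTP data the slide mentions.

    `store` is a plain list standing in for the real data store, and the
    JSON payload format is an assumption, not the ta.js wire format.
    """
    record = {
        "event": json.loads(body),
        "ip_address": remote_addr,
        "user_agent": headers.get("User-Agent"),
        "referer": headers.get("Referer"),
        "received_at": time.time(),
    }
    store.append(record)
    return record


store = []
record_event(
    store,
    '{"type": "page_load", "task": "demo"}',
    {"User-Agent": "ExampleBrowser/1.0", "Referer": "https://example.com/task"},
    "203.0.113.7",
)
```

Keeping the handler this small matches the slide's description of the log server as an extremely simple application: it only appends, and all interpretation is left to the analysis server.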
Implementation• Analysis Server The analysis server periodically polls the log server for new events. These events are then inserted into a PostgreSQL database, where they are processed by a network of triggers.
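The poll-insert-trigger pipeline can be sketched in a few lines. To keep the example self-contained it uses Python's built-in `sqlite3` module in place of PostgreSQL, and a single trigger in place of the network of triggers; the table names, the event fields, and the fake log source are all hypothetical.

```python
import sqlite3


def make_analysis_db():
    """Create an in-memory database with one trigger (SQLite stands in for PostgreSQL)."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE raw_events (id INTEGER PRIMARY KEY, worker TEXT, kind TEXT);
        CREATE TABLE worker_counts (worker TEXT PRIMARY KEY, n INTEGER NOT NULL);
        -- Each inserted event updates a per-worker summary row.
        CREATE TRIGGER count_events AFTER INSERT ON raw_events
        BEGIN
            INSERT OR IGNORE INTO worker_counts (worker, n) VALUES (NEW.worker, 0);
            UPDATE worker_counts SET n = n + 1 WHERE worker = NEW.worker;
        END;
        """
    )
    return db


def poll_once(fetch_new_events, db, last_seen_id):
    """Insert events newer than last_seen_id and return the updated position."""
    for event in fetch_new_events(last_seen_id):
        db.execute(
            "INSERT INTO raw_events (id, worker, kind) VALUES (?, ?, ?)",
            (event["id"], event["worker"], event["kind"]),
        )
        last_seen_id = max(last_seen_id, event["id"])
    db.commit()
    return last_seen_id


# Hypothetical stand-in for the log server's stored events.
LOG = [
    {"id": 1, "worker": "w1", "kind": "preview"},
    {"id": 2, "worker": "w1", "kind": "accept"},
    {"id": 3, "worker": "w2", "kind": "preview"},
]


def fetch_from_fake_log(last_seen_id):
    return [e for e in LOG if e["id"] > last_seen_id]


db = make_analysis_db()
last_id = poll_once(fetch_from_fake_log, db, 0)
last_id = poll_once(fetch_from_fake_log, db, last_id)  # second poll finds nothing new
counts = dict(db.execute("SELECT worker, n FROM worker_counts"))
```

Tracking the last seen event id makes repeated polls idempotent, and putting the aggregation in a trigger means the summary table is updated as a side effect of the insert rather than by a separate batch job.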
Conclusion• Turkalytics is a tool for gathering data about workers completing human computation tasks.• It can serve as part of a broader system, in particular a system implementing the Human Processing model.• Turkalytics enables both code sharing among systems (systems need not reimplement worker monitoring code) and data sharing among systems (requesters benefit from data gathered from other requesters).• The system scaled to more than 100,000 requests/day. We also verified previous demographic data about Mechanical Turk, and presented some findings about location and interaction that are unique to our tool.