Crowdsourcing using Mechanical
Turk for Human Computer
Interaction Research
Ed H. Chi
Research Scientist
Google
(work done while at [Xerox] PARC)
1
Historical Footnote
De Prony, 1794, hired hairdressers
• (unemployed after French revolution; knew only
addition and subtraction)
• to create logarithmic and trigonometric tables.
• He managed the process by splitting the
work into very detailed workflows.
!"#$% &'#(")$)*'%+ ,'"%-
• !"#$%/ 0121 )31
56'#(")12/+7 "/1-
#$)3 6'#(")$)*'%
– Grier, When computers were human, 2005
• !"#$%&'() 6'#(")
– &9$*2$")+ $/)2'%'#
&'#(")1- )31 !$9
6'#1) '2?*) @)3211
(2'?91#A )&*&)&%#
-$./" '4 %"#12*6
6'#(")$)*'%/ $62'
$/)2'%'#12/
C2*12+ D31% 6'#(")12/ 0
C2*12
2
Talk in 3 Acts
• Act 1:
– How we almost failed using MTurk?!
– [Kittur, Chi, Suh, CHI2008]
• Act II:
– Apply MTurk to visualization evaluation
– [Kittur, Suh, Chi, CSCW2008]
• Act III:
– Where are the limits?
Aniket Kittur, Ed H. Chi, Bongwon Suh.
Crowdsourcing User Studies With Mechanical Turk. In CHI2008.
Aniket Kittur, Bongwon Suh, Ed H. Chi. Can You Ever Trust a Wiki?
Impacting Perceived Trustworthiness in Wikipedia. In CSCW2008.
3
Using Mechanical Turk for user studies
Traditional user studies vs. Mechanical Turk:
• Task complexity: Complex, Long vs. Simple, Short
• Task subjectivity: Subjective, Opinions vs. Objective, Verifiable
• User information: Targeted demographics, High interactivity vs. Unknown demographics, Limited interactivity
Can Mechanical Turk be usefully employed for user studies?
5
Task
• Assess quality of Wikipedia articles
• Started with ratings from expert Wikipedians
– 14 articles (e.g., Germany, Noam Chomsky)
– 7-point scale
• Can we get matching ratings with Mechanical Turk?
6
Experiment 1
• Rate articles on 7-point scales:
– Well written
– Factually accurate
– Overall quality
• Free-text input:
– What improvements does the article need?
• Paid $0.05 each
7
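For illustration, a HIT like the one in Experiment 1 could be posted programmatically. A minimal sketch using the modern boto3 MTurk client (not the 2008 setup); the question-form XML file name is a placeholder.

# Minimal sketch: posting one article-rating HIT with boto3.
# The QuestionForm/ExternalQuestion XML in "rate_article_question.xml"
# is assumed to be written separately.
import boto3

mturk = boto3.client("mturk", region_name="us-east-1")

question_xml = open("rate_article_question.xml").read()  # hypothetical file

hit = mturk.create_hit(
    Title="Rate the quality of a Wikipedia article",
    Description="Rate the article on 7-point scales and suggest improvements.",
    Keywords="wikipedia, rating, survey",
    Reward="0.05",                      # $0.05 per assignment, as in Experiment 1
    MaxAssignments=15,                  # ~15 ratings per article
    LifetimeInSeconds=2 * 24 * 3600,    # keep the HIT available for two days
    AssignmentDurationInSeconds=15 * 60,
    Question=question_xml,
)
print("HITId:", hit["HIT"]["HITId"])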
Experiment 1: Good news
• 58 users made 210 ratings (15 per article)
– $10.50 total
• Fast results
– 44% within a day, 100% within two days
– Many completed within minutes
8
Experiment 1: Bad news
• Correlation between Turkers and Wikipedians only marginally significant (r = .50, p = .07)
• Worse, 59% potentially invalid responses
  – Invalid comments: 49%
  – <1 min responses: 31%
• Nearly 75% of these done by only 8 users
9
Not a good start
• Summary of Experiment 1:
– Only marginal correlation with experts.
– Heavy gaming of the system by a minority
• Possible responses:
  – Make sure these gamers are not rewarded
  – Ban them from doing your HITs in the future
  – Create a reputation system [Dolores Labs]
• Can we change how we collect user input?
10
Design changes
• Use verifiable questions to signal monitoring
– How many sections does the article have?
– How many images does the article have?
– How many references does the article have?
11
Design changes
• Use verifiable questions to signal monitoring
• Make malicious answers as high cost as good-faith
answers
– Provide 4-6 keywords that would give someone a
good summary of the contents of the article
12
Design changes
• Use verifiable questions to signal monitoring
• Make malicious answers as high cost as good-faith
answers
• Make verifiable answers useful for completing
task
– Used tasks similar to how Wikipedians evaluate quality
(organization, presentation, references)
13
Design changes
• Use verifiable questions to signal monitoring
• Make malicious answers as high cost as good-faith
answers
• Make verifiable answers useful for completing
task
• Put verifiable tasks before subjective responses
– First do objective tasks and summarization
– Only then evaluate subjective quality
– Ecological validity?
14
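One way to act on the verifiable questions is to check each worker's answers against counts parsed from the article itself. A minimal sketch, with hypothetical inputs and a hypothetical tolerance threshold:

# Compare each worker's verifiable answers against counts parsed from the
# article HTML; `responses` and the tolerance are illustrative assumptions.
import re

def ground_truth_counts(article_html: str) -> dict:
    """Rough counts of sections, images, and references in an article page."""
    return {
        "sections": len(re.findall(r"<h[23][^>]*>", article_html)),
        "images": len(re.findall(r"<img\b", article_html)),
        "references": len(re.findall(r'<li id="cite_note', article_html)),
    }

def flag_suspect(response: dict, truth: dict, tolerance: int = 2) -> bool:
    """Flag a response whose verifiable answers are far from the parsed counts."""
    return any(abs(int(response[key]) - truth[key]) > tolerance for key in truth)

sample_html = '<h2>History</h2><img src="a.png"><li id="cite_note-1">'
print(ground_truth_counts(sample_html))

truth = {"sections": 12, "images": 9, "references": 57}   # pretend parsed values
responses = [
    {"worker": "A1", "sections": 12, "images": 8, "references": 55},
    {"worker": "B7", "sections": 3, "images": 1, "references": 2},
]
for r in responses:
    print(r["worker"], "suspect" if flag_suspect(r, truth) else "ok")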
Experiment 2: Results
• 124 users provided 277 ratings (~20 per article)
• Significant positive correlation with Wikipedians
– r=.66, p=.01
• Smaller proportion malicious responses
• Increased time on task
Experiment 1 → Experiment 2:
• Invalid comments: 49% → 3%
• <1 min responses: 31% → 7%
• Median time on task: 1:30 → 4:06
15
Quick Summary of Tips
• Mechanical Turk offers the practitioner a way to access a
large user pool and quickly collect data at low cost
• Good results require careful task design
1. Use verifiable questions to signal monitoring
2. Make malicious answers as high cost as good-faith answers
3. Make verifiable answers useful for completing task
4. Put verifiable tasks before subjective responses
16
Generalizing to other MTurk studies
• Combine objective and subjective questions
– Rapid prototyping: ask verifiable questions about content/
design of prototype before subjective evaluation
– User surveys: ask common-knowledge questions before
asking for opinions
• Filtering for quality (see the filtering sketch below)
  – Put in a field for free-form responses and filter out data without answers
  – Filter out results that came in too quickly
  – Sort by WorkerID and look for cut-and-paste answers
  – Look for suspicious outliers in the data
17
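The filtering steps above can be scripted. A minimal pandas sketch, assuming a hypothetical results.csv export with WorkerId, WorkTimeInSeconds, free_text, article, and rating columns (not the authors' actual pipeline):

import pandas as pd

df = pd.read_csv("results.csv")

# 1. Drop rows with an empty free-form response.
df = df[df["free_text"].fillna("").str.strip() != ""]

# 2. Drop results that came in implausibly quickly (here: under 60 seconds).
df = df[df["WorkTimeInSeconds"] >= 60]

# 3. Within each worker, drop cut-and-paste answers (identical free text reused).
df = df[~df.duplicated(subset=["WorkerId", "free_text"], keep="first")]

# 4. Flag outlier ratings (more than 3 SD from the article's mean rating).
z = df.groupby("article")["rating"].transform(lambda s: (s - s.mean()) / s.std()).fillna(0)
df = df[z.abs() <= 3]

print(len(df), "responses kept")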
Talk in 3 Acts
• Act 1:
– How we almost failed?!
• Act II:
– Applying MTurk to visualization evaluation
• Act III:
– Where are the limits?
18
What is Wikipedia?
"Wikipedia is the best thing ever. Anyone in the world can write
anything they want about any subject, so you know you're getting the
best possible information."
– Steve Carell, The Office
21
What would make you trust Wikipedia more?
Wikipedia, just by its nature, is
impossible to trust completely. I don't
think this can necessarily be
changed.
23
WikiDashboard
• Transparency of social dynamics can reduce conflict and coordination issues
• Attribution encourages contribution
  – WikiDashboard: social dashboard for wikis
  – Prototype system: http://wikidashboard.parc.com
• Visualization for every wiki page showing the edit history timeline and top individual editors
• Can drill down into the activity history for specific editors and view edits side-by-side
Citation: Suh et al., CHI 2008 Proceedings
24
[WikiDashboard screenshot: edit-activity timeline for an article, with top editor "Wasted Time R" highlighted]
26
Surfacing information
• Numerous studies mining Wikipedia revision
history to surface trust-relevant information
– Adler & Alfaro, 2007; Dondio et al., 2006; Kittur et al., 2007;
Viegas et al., 2004; Zeng et al., 2006
Suh, Chi, Kittur, & Pendleton, CHI2008
• But how much impact can this have on user
perceptions in a system which is inherently
mutable?
27
Hypotheses
1. Visualization will impact perceptions of trust
2. Compared to baseline, visualization will
impact trust both positively and negatively
3. Visualization should have the most impact when there is
high uncertainty about the article
• Low quality
• High controversy
28
Design
• 3 × 2 × 2 design
  – Visualization: High stability, Low stability, Baseline (none)
  – Quality × Controversy (two example articles per cell):
    • High quality, Controversial: Abortion; George Bush
    • High quality, Uncontroversial: Volcano; Shark
    • Low quality, Controversial: Pro-life feminism; Scientology and celebrities
    • Low quality, Uncontroversial: Disk defragmenter; Beeswax
29
Method
• Users recruited via Amazon's Mechanical Turk
– 253 participants
– 673 ratings
– 7 cents per rating
– Kittur, Chi, & Suh, CHI 2008: Crowdsourcing user studies
• To ensure salience and valid answers, participants
answered:
– In what time period was this article the least stable?
– How stable has this article been for the last month?
– Who was the last editor?
– How trustworthy do you consider the above editor?
36
Results
[Chart: mean trustworthiness rating (1-7) for low- vs. high-quality and uncontroversial vs. controversial articles, under High-stability, Baseline, and Low-stability visualization conditions]
Main effects of quality and controversy:
• High-quality articles > low-quality articles (F(1, 425) = 25.37, p < .001)
• Uncontroversial articles > controversial articles (F(1, 425) = 4.69, p = .031)
37
Results
[Chart: same trustworthiness ratings as on the previous slide]
Interaction effect of quality and controversy (an ANOVA sketch follows below):
• High-quality articles were rated equally trustworthy whether controversial or not, while
• Low-quality articles were rated lower when they were controversial than when they were uncontroversial.
38
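The F statistics reported on these results slides come from a factorial analysis of variance. A sketch of how such main and interaction effects could be computed with statsmodels, assuming a hypothetical per-rating CSV with rating, quality, controversy, and visualization columns (not the paper's actual analysis code):

import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.read_csv("trust_ratings.csv")   # one row per rating, hypothetical export

model = smf.ols(
    "rating ~ C(quality) * C(controversy) * C(visualization)", data=df
).fit()
print(sm.stats.anova_lm(model, typ=2))  # F statistics and p-values per effect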
Results
1. Significant effect of visualization: High-Stability > Low-Stability, p < .001
2. Viz has both positive and negative effects:
   – High-Stability > Baseline (p < .001) > Low-Stability (p < .01)
3. No interaction of visualization with either quality or controversy
   – Robust across visualization conditions
[Chart: mean trustworthiness rating (1-7) by quality and controversy, under High-stability, Baseline, and Low-stability conditions]
39
Talk in 3 Acts
• Act 1:
– How we almost failed?!
• Act II:
– Applying MTurk to visualization evaluation
• Act III:
– Where are the limits?
42
Limitations of Mechanical Turk
• No control of users' environment
– Potential for different browsers, physical distractions
– General problem with online experimentation
• Not yet designed for user studies
– Difficult to do between-subjects design
– May need some programming
• Hard to control user population
– hard to control demographics, expertise
43
Crowdsourcing for HCI Research
• Does my interface/visualization work?
– WikiDashboard: transparency vis for Wikipedia [Suh et al. VAST,
Kittur et al. CSCW2008]
– Replicating Perceptual Experiments [Heer et al., CHI2010]
• Coding of large amounts of user data
– What is a Question in Twitter? [S. Paul, L. Hong, E. Chi, ICWSM 2011]
• Incentive mechanisms
– Intrinsic vs. Extrinsic rewards: Games vs. Pay
– [Horton & Chilton, 2010 on MTurk] and Satisficing
– [Ariely, 2009] in general: Higher pay != Better work
45
Managing Quality
• Quality through redundancy: combining votes (see the voting sketch below)
  – Majority vote [works best when workers have similar quality]
  – Worker-quality-adjusted vote
  – Managing dependencies
• Quality through gold data
  – Advantageous with imbalanced datasets and bad workers
• Estimating worker quality (redundancy + gold)
  – Calculate the confusion matrix and see if you actually get some information from the worker
• Toolkit: http://code.google.com/p/get-another-label/
Source: Ipeirotis, WWW2011 46
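A minimal sketch of the two redundancy-based schemes above, majority vote and worker-quality-adjusted vote; the worker accuracies are assumed to come from gold questions, and the data layout is hypothetical:

from collections import Counter, defaultdict

labels = [                      # (worker, item, label)
    ("w1", "item1", "spam"), ("w2", "item1", "spam"), ("w3", "item1", "ham"),
    ("w1", "item2", "ham"),  ("w2", "item2", "spam"), ("w3", "item2", "ham"),
]
quality = {"w1": 0.9, "w2": 0.6, "w3": 0.8}   # accuracy on gold questions

def majority_vote(labels):
    votes = defaultdict(Counter)
    for worker, item, label in labels:
        votes[item][label] += 1
    return {item: counts.most_common(1)[0][0] for item, counts in votes.items()}

def quality_weighted_vote(labels, quality):
    votes = defaultdict(Counter)
    for worker, item, label in labels:
        votes[item][label] += quality.get(worker, 0.5)   # weight by worker accuracy
    return {item: counts.most_common(1)[0][0] for item, counts in votes.items()}

print(majority_vote(labels))
print(quality_weighted_vote(labels, quality))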
Coding and Machine Learning
!"#$%& '(%)*"(+
• Integration with Machine Learning
• ,)#-+' %-.&% */-"+"+0 1-*- using
– Build automatic classification models
crowdsourced data
• 2'& */-"+"+0 1-*- *( .)"%1 #(1&%
Data from existing
crowdsourced answers
N
New C
Case Automatic Model Automatic
(through machine learning) Answer
Source: Ipeirotis, WWW2011
47
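A sketch of the flow above: crowdsourced answers serve as training labels for an automatic model that then answers new cases. The tiny dataset and the use of scikit-learn are illustrative assumptions, not Ipeirotis's actual pipeline:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Existing crowdsourced answers: (text, consensus label from workers)
crowd_data = [
    ("win a free iphone now", "spam"),
    ("meeting moved to 3pm tomorrow", "not_spam"),
    ("cheap meds no prescription", "spam"),
    ("can you review my draft tonight?", "not_spam"),
]
texts, labels = zip(*crowd_data)

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)                       # use training data to build model

new_case = ["free prescription meds, click now"]
print(model.predict(new_case))                 # automatic answer for a new case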
Crowd Programming for Complex Tasks
• Decompose tasks into smaller tasks
– Digital Taylorism
– Frederick Winslow Taylor (1856-1915), 1911: 'The Principles of Scientific Management'
• Crowd Programming Explorations (a minimal workflow sketch follows this slide)
  – MapReduce models
    • Kittur, A.; Smus, B.; and Kraut, R. CrowdForge. CHI 2011 EA.
    • Kulkarni, Can, Hartmann. CHI 2011 workshop & WIP.
  – Little, G.; Chilton, L.; Goldman, M.; and Miller, R. C. In KDD 2010 Workshop on Human Computation.
48
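A minimal CrowdForge-style partition/map/reduce sketch for the article-writing example; post_hit is a hypothetical helper standing in for posting a HIT and collecting one answer, not the CrowdForge or Turkomatic code:

def post_hit(prompt: str) -> str:
    """Stand-in for posting a HIT and blocking until one worker answers."""
    raise NotImplementedError("wire this up to the MTurk API")

def write_article(topic: str) -> str:
    # Partition: one worker proposes an outline (one section heading per line).
    outline = post_hit(f"Propose 5 section headings for an article on '{topic}'.")
    headings = [h.strip() for h in outline.splitlines() if h.strip()]

    # Map: for each heading, separate fact-collection and writing sub-tasks.
    sections = []
    for heading in headings:
        facts = post_hit(f"List 3 facts about '{heading}' for an article on '{topic}'.")
        paragraph = post_hit(f"Write one paragraph on '{heading}' using these facts:\n{facts}")
        sections.append(f"{heading}\n{paragraph}")

    # Reduce: one worker consolidates the paragraphs into a single article.
    return post_hit("Combine these sections into a coherent article:\n\n" + "\n\n".join(sections))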
Crowd Programming for Complex Tasks
• Crowd Programming Explorations
  – Kittur, A.; Smus, B.; and Kraut, R. CrowdForge. CHI 2011 EA.
  – Kulkarni, Can, Hartmann. CHI 2011 workshop & WIP (Turkomatic).
[Embedded page from the Turkomatic work-in-progress paper (CHI 2011, May 7-12, Vancouver, BC, Canada): a complex task is broken into partition, map, and reduce steps posted as HITs. A partition step asks workers to produce an article outline as an array of section headings (e.g., "History", "Geography"); map tasks ask workers to collect facts for each section; reduce tasks consolidate the collected facts, typically into a single result such as a paragraph per heading, and any step can be iterated. As case studies, the authors ran essay writing and a 16-question SAT ("Please solve the 16-question SAT located at http://bit.ly/SATexam"), paying workers between $0.10 and $0.40 per HIT. Each "subdivide" or "merge" HIT received answers within 4 hours; solutions to the initial task were complete within 72 hours. In the essay task, each "subdivide" HIT was posted three times and the best of the three was selected by the experimenters (simulating Turker voting); the proposed decompositions were overwhelmingly linear.]
49
Future Directions in Crowdsourcing
• Real-time Crowdsourcing
– Bigham, et al. VizWiz, UIST 2010
[Figure from VizWiz: six questions asked by participants, the photographs they took, and answers received, with latency in seconds]
50
Future Directions in Crowdsourcing
• Real-time Crowdsourcing
– Bigham, et al. VizWiz, UIST 2010
• Embedding of Crowdwork inside Tools
– Bernstein, et al. Soylent, UIST 2010
51
Future Directions in Crowdsourcing
• Real-time Crowdsourcing
  – Bigham, et al. VizWiz, UIST 2010
• Embedding of Crowdwork inside Tools
  – Bernstein, et al. Soylent, UIST 2010
• Shepherding Crowdwork
  – Dow et al. CHI2011 WIP
[Excerpt from Dow et al. on designing crowd feedback for learning, engagement, and quality improvement. Timeliness: synchronous feedback, delivered while workers are still engaged in a set of tasks, may influence task performance more and raises the chance that lessons carry onto similar tasks, but places a heavier burden on feedback providers, implying a need for scheduling algorithms that support near real-time feedback; asynchronous feedback, delivered after workers have completed their tasks, gives feedback providers more time to review and comment. Figure 2: current systems (in orange) focus on asynchronous, single-bit feedback by requesters.]
52
Tutorials
• Matt Lease http://ir.ischool.utexas.edu/crowd/
• AAAI 2011 (with HCOMP 2011): Human Computation: Core Research Questions
and State of the Art (E. Law & Luis von Ahn)
• WSDM 2011: Crowdsourcing 101: Putting the WSDM of Crowds to Work for
You (Omar Alonso and Matthew Lease)
– http://ir.ischool.utexas.edu/wsdm2011_tutorial.pdf
• LREC 2010 Tutorial: Statistical Models of the Annotation Process (Bob Carpenter
and Massimo Poesio)
– http://lingpipe-blog.com/2010/05/17/
• ECIR 2010: Crowdsourcing for Relevance Evaluation. (Omar Alonso)
– http://wwwcsif.cs.ucdavis.edu/~alonsoom/crowdsourcing.html
• CVPR 2010: Mechanical Turk for Computer Vision. (Alex Sorokin and Fei-Fei Li)
– http://sites.google.com/site/turkforvision/
• CIKM 2008: Crowdsourcing for Relevance Evaluation (D. Rose)
– http://videolectures.net/cikm08_rose_cfre/
• WWW2011: Managing Crowdsourced Human Computation (Panos Ipeirotis)
– http://www.slideshare.net/ipeirotis/managing-crowdsourced-human-computation
53
Social Q&A on Twitter!
!
S.
Paul,
L.
Hong,
E.
Chi,
ICWSM
2011
3/27/12 54
Why social Q&A?!
!
!
People turn to their friends on social networks because they
trust their friends to provide tailored answers to subjective
questions on niche topics.!
!
3/27/12! 55
Research Questions
• What kinds of questions are Twitter users asking their friends?
  – Types and topics of questions
• Are users receiving responses to the questions they are asking?
  – Number, speed, and relevancy of responses
• How does the nature of the social network affect Q&A behavior?
  – Size and usage of network, reciprocity of relationship
58
Identifying question tweets was challenging
• Advertisements framed as questions
• Rhetorical questions
• Missing context
• Used heuristics to identify candidate tweets that were possibly questions (an illustrative sketch follows below)
59
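Illustrative heuristics of the kind described above; the exact rules from the paper are not reproduced here, so the keyword list and filters are assumptions:

QUESTION_WORDS = ("who", "what", "when", "where", "why", "how", "does", "is", "are", "any")

def is_candidate_question(tweet: str) -> bool:
    text = tweet.strip().lower()
    if text.startswith("rt @") or "http" in text:   # drop retweets and link ads
        return False
    words = text.split()
    starts_with_qword = bool(words) and words[0] in QUESTION_WORDS
    return "?" in text or starts_with_qword

print(is_candidate_question("Any good iPad app recommendations?"))    # True
print(is_candidate_question("RT @store: Want 50% off? http://ex.am")) # False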
Classifying candidate tweets using Mechanical Turk
• Crowdsourced question-tweet identification to Amazon Mechanical Turk
• Each tweet was classified by two Turkers
• Each Turker classified 25 tweets: 20 candidates and 5 control tweets
• Only accepted data from Turkers who classified all control tweets correctly (see the acceptance sketch below)
60
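A sketch of the acceptance rule above: keep only Turkers who answered every control tweet correctly, then keep candidates on which both (accepted) Turkers agree. The judgment record layout is an assumption:

from collections import defaultdict

def accepted_turkers(judgments, gold):
    """Turkers who classified all of their control tweets correctly."""
    wrong = {j["turker"] for j in judgments
             if j["is_control"] and j["label"] != gold[j["tweet_id"]]}
    return {j["turker"] for j in judgments} - wrong

def agreed_questions(judgments, ok_turkers):
    """Candidate tweets labeled 'question' by both of their (accepted) Turkers."""
    labels = defaultdict(list)
    for j in judgments:
        if not j["is_control"] and j["turker"] in ok_turkers:
            labels[j["tweet_id"]].append(j["label"])
    return [t for t, ls in labels.items() if len(ls) == 2 and set(ls) == {"question"}]

gold = {"c1": "question"}
judgments = [
    {"turker": "T1", "tweet_id": "c1", "label": "question", "is_control": True},
    {"turker": "T1", "tweet_id": "x9", "label": "question", "is_control": False},
    {"turker": "T2", "tweet_id": "c1", "label": "question", "is_control": True},
    {"turker": "T2", "tweet_id": "x9", "label": "question", "is_control": False},
]
ok = accepted_turkers(judgments, gold)
print(agreed_questions(judgments, ok))   # ['x9']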
Overall method for filtering questions
• Random sample of public tweets: 1.2 million
• Applied heuristics to identify candidate tweets: 12,000 (4,100 presented to Turkers)
• Classified candidates using Mechanical Turk: 1,152
• Tracked responses to each candidate tweet: 624
61
Findings: Types and topics of questions
• Rhetorical (42%), factual (16%), and poll (15%) questions were common
• Significant percentage of personal & health questions (11%)
[Pie chart of question topics: entertainment 32%, others 16%, personal & health 11%, technology 10%, greetings 7%, ethics & philosophy 7%, uncategorized 5%, professional 4%, restaurant/food 4%, current events 4%]
[Example questions: "How do you feel about interracial dating?", "Which team is better, Raiders or Steelers?", "Any good iPad app recommendations?", "In UK, when you need to see a specialist, do you need special forms or permission?", "Any idea how to lost weight fast?"]
62
Findings: Responses to questions
• Number of responses has a long-tail distribution
• Low (18.7%) response rate in general, but quick responses
• Most often reciprocity between asker and answerer was one-way (55%)
• Responses were largely (84%) relevant
[Chart: log(number of questions) vs. number of answers, showing a long-tail distribution]
63
Findings: Social network characteristics
• Which characteristics of the asker predict whether she will receive a response?
• Network size and status in the network are good predictors of whether the asker will receive a response
• Logistic regression modeling on structural properties (see the regression sketch below):
  – Number of followers (+)
  – Number of days on Twitter (+)
  – Ratio of followers/followees (+)
  – Reciprocity rate (-)
• Usage properties: number of tweets posted, frequency of use of Twitter
64
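A sketch of the kind of logistic regression described above; the feature names follow the slide, but the askers.csv file and the scikit-learn usage are illustrative, not the paper's actual analysis:

import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("askers.csv")   # hypothetical: one row per question tweet
features = ["num_followers", "days_on_twitter", "follower_followee_ratio", "reciprocity_rate"]

model = LogisticRegression()
model.fit(df[features], df["got_response"])   # 1 if the question received a reply

# Positive coefficients -> higher odds of receiving a response.
for name, coef in zip(features, model.coef_[0]):
    print(f"{name}: {coef:+.3f}")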
Thanks!
• chi@acm.org
• http://edchi.net
• @edchi
• Aniket Kittur, Ed H. Chi, Bongwon Suh. Crowdsourcing User Studies with
Mechanical Turk. In Proceedings of the ACM Conference on Human Factors
in Computing Systems (CHI 2008), pp. 453-456. ACM Press, 2008. Florence, Italy.
• Aniket Kittur, Bongwon Suh, Ed H. Chi. Can You Ever Trust a Wiki?
Impacting Perceived Trustworthiness in Wikipedia. In Proceedings of
Computer-Supported Cooperative Work (CSCW 2008), pp. 477-480. ACM Press,
2008. San Diego, CA. [Best Note Award]
66