A Guide to Crowdsourced Science Platforms and Community Engagement
1. A WHOLE NEW ZOONIVERSE
GUIDELINES AND TOOLS FOR
CROWDSOURCED SCIENCE
Elena Simperl
e.simperl@soton.ac.uk
@esimperl
November 16th, 2016
2. OVERVIEW
• Citizen science is a fascinating subject for Web science research
• Our work helps system designers with
  • Frameworks of motivations and incentives engineering
  • Design guidelines and recommendations
  • Methods to make crowdsourced tasks more effective
  • Methods to study engagement and community health
TUTORIAL@ISWC2013
13. LEVELS OF ENGAGEMENT
[Figure: active users in % (y-axis, 0 to 20) by month since registration (x-axis, 1 to 31)]
~1% of participants contribute 72% of Talk
& 29% of Task
[Luczak-Roesch et al., 2014]
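The figure quoted above is a concentration measure: the share of all contributions that comes from the most active slice of participants. A minimal sketch of that computation follows, assuming per-participant contribution counts are available. The function name `top_share` and the toy counts are illustrative only and are not taken from the cited study.

```python
import math

def top_share(contributions, top_fraction=0.01):
    """Share of all contributions made by the most active `top_fraction` of participants.

    `contributions` maps a participant id to a contribution count
    (for example Talk posts or Task classifications).
    """
    counts = sorted(contributions.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    # Always include at least one participant in the "top" group.
    k = max(1, math.ceil(len(counts) * top_fraction))
    return sum(counts[:k]) / total

# Toy data (hypothetical): four participants, top 25% is a single person.
toy_counts = {"u0": 6, "u1": 2, "u2": 1, "u3": 1}
share = top_share(toy_counts, top_fraction=0.25)
print(f"Top 25% of participants contribute {share:.0%} of the activity")
```

Running the same function separately on Talk and Task counts would give the two percentages side by side.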
14. DATA QUALITY
Existing quality inference
algorithms are limited to binary
classification
Conceptualise the problem for
realistic workflows
Develop efficient
implementations of algorithms
Compare them
15. DATA QUALITY (2)
Majority Voting
Take the annotation with the most votes as
the true label for each object
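As a rough illustration of majority voting, the sketch below counts the labels each object received and keeps the most frequent one. The `(worker, object, label)` tuple layout, the function name `majority_vote`, and the sample annotations are assumptions made for this example.

```python
from collections import Counter, defaultdict

def majority_vote(annotations):
    """Return {object_id: label} using the most frequent label per object.

    `annotations` is an iterable of (worker_id, object_id, label) tuples.
    Ties go to the label that was seen first for that object.
    """
    votes = defaultdict(Counter)
    for _worker, obj, label in annotations:
        votes[obj][label] += 1
    return {obj: counts.most_common(1)[0][0] for obj, counts in votes.items()}

# Hypothetical annotations: three workers label two images.
annotations = [
    ("w1", "img1", "spiral"), ("w2", "img1", "spiral"), ("w3", "img1", "elliptical"),
    ("w1", "img2", "elliptical"), ("w2", "img2", "elliptical"), ("w3", "img2", "spiral"),
]
labels = majority_vote(annotations)
print(labels)
```

Every worker counts equally here, which is why the approach depends on having enough annotations per object.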
Message Passing
Use object-specific worker messages to
represent how reliable a worker is in
labelling each specific object.
Expectation Maximization
Infer the true label for each object, using
annotations from all users and accounting
for the error rates of each user.
Estimate the error rates of each user by
comparing their annotations with the inferred
true labels.
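The two alternating steps described above can be sketched as a small expectation-maximization loop in the style of the Dawid-Skene model. This is an illustrative sketch and not the implementation evaluated in the experiments: the function name `dawid_skene`, the smoothing constant, the fixed iteration count, and the sample data are all assumptions.

```python
from collections import defaultdict

def dawid_skene(annotations, classes, n_iter=20, smoothing=0.01):
    """EM sketch. `annotations` is a list of (worker, object, label) tuples.

    Returns (labels, posteriors, confusion), where confusion[w][true][given]
    estimates how often worker w answers `given` when the truth is `true`.
    """
    objects = sorted({o for _, o, _ in annotations})
    workers = sorted({w for w, _, _ in annotations})
    by_obj = defaultdict(list)
    for w, o, label in annotations:
        by_obj[o].append((w, label))

    # Initialise the posteriors with vote fractions (a soft majority vote).
    post = {}
    for o in objects:
        counts = {c: 0.0 for c in classes}
        for _, label in by_obj[o]:
            counts[label] += 1.0
        total = sum(counts.values())
        post[o] = {c: counts[c] / total for c in classes}

    for _ in range(n_iter):
        # Estimate class priors and per-worker error rates from current beliefs.
        prior = {c: sum(post[o][c] for o in objects) / len(objects) for c in classes}
        conf = {w: {c: {g: smoothing for g in classes} for c in classes} for w in workers}
        for o in objects:
            for w, label in by_obj[o]:
                for c in classes:
                    conf[w][c][label] += post[o][c]
        for w in workers:
            for c in classes:
                row_total = sum(conf[w][c].values())
                for g in classes:
                    conf[w][c][g] /= row_total

        # Re-infer the true label of each object given those error rates.
        for o in objects:
            scores = {}
            for c in classes:
                p = prior[c]
                for w, label in by_obj[o]:
                    p *= conf[w][c][label]
                scores[c] = p
            z = sum(scores.values())
            post[o] = {c: scores[c] / z for c in classes}

    labels = {o: max(classes, key=lambda c: post[o][c]) for o in objects}
    return labels, post, conf

# Hypothetical data: w1 and w2 always agree, w3 disagrees on objects b and d.
answers = {"w1": "yynn", "w2": "yynn", "w3": "ynny"}
annotations = [(w, o, "yes" if a == "y" else "no")
               for w, row in answers.items() for o, a in zip("abcd", row)]
labels, post, conf = dawid_skene(annotations, classes=["yes", "no"])
print(labels)
```

Unlike majority voting, the loop ends up weighting each worker by their estimated reliability. For objects with many annotations, the product in the label inference step should be computed in log space to avoid underflow.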
Results measured in terms of
different accuracy metrics and
time
Experiments still ongoing
Preview
Majority voting performs exceptionally
well for large numbers of annotations
If less data is available, one could explore
message passing (possibly in
combination with majority voting)
17. SOCIALITY
Discussions and engagement
with volunteers are an integral part
of the experience
Leads to serendipitous scientific
discoveries
Encourages autonomy and helps
with community building
19. CHAT AND INSTANT MESSAGING
[Figure: microposts by project (PH, SG, SW, NN, GZ, CC, PF, SF, AP, WS); y-axis 0 to 10; data label: 91%]
[Luczak-Roesch et al., 2014; Tinati et al., 2015,
20. DISCUSSION PROFILES
Profile 9 (0.1%): Deeply engaged volunteers, few threads but multiple posts within them
Profile 7 (0.1%): Content producers, posting across many boards and threads
Profile 8 (0.4%): Thread followers and PM (one-to-one) talkers
Profile 4 (1%): First to respond and question answerers
Profile 1 (2.8%): Highly active thread starters and answerers across a wide range of topics
Profile 5 (5.5%): Infrequent volunteers, single thread posts, no personal messages
Profile 3 (6.5%): Watcher and starter of many threads, but not first to reply
Profile 2 (14.6%): Highly active thread starters and first to reply back
Profile 6 (69.0%): Long active volunteers (the core group), posting sporadically
[Tinati et al., 2015, WebSci]
21. FROM CROWD TO COMMUNITY
Survey of 48 projects and 150
publications
Identifying affordances from
online community themes
within literature
Task visibility
Goals
Feedback
Rewards
Community features found to
have a greater role than
previously considered
Encourage task completion,
discussions etc.
Themes align to key success
factors of volunteer
engagement, task completion
and submission accuracy
[Reeves et al., 2017]
22. FROM PROJECTS TO ECOSYSTEMS
[Figure: diagram linking Participant X and Participant Y to Projects A, B and C]
[Luczak-Roesch et al., 2014]
23. DESIGNING PLATFORMS
Task
specificity
Community
development
Task design
PR and
engagement
Bootstrapping the
community
Serendipitous scientific
discovery
Engaging with people,
supporting professional team
Supporting individuals,
finding new scientific
discoveries
Obtaining new citizen
scientists
Retaining people
Supporting people,
improving task
completion
Obtaining new citizen
scientists
Reinvigorating old users
[Tinati et al., 2015, CHI]
24. WHAT’S NEXT?
Human computation & crowdsourcing
Task assignment: what tasks are interesting/relevant for whom?
Data quality: scalable and in real time
Peer review, collaborative approaches
The role of gamification: is science a game?
Online community
Making discussions more effective
Science
Citizen science platforms that everyone can use
New forms of publishing, citation, reproducibility, and replication