Measuring the Effectiveness of Gamesourcing Expert Oil Painting Annotations
1. Myriam C. Traub, Jacco van Ossenbruggen, Jiyin He, Lynda Hardman
Centrum Wiskunde & Informatica
2. Research Problem
• Expert tasks are hard to crowdsource:
  • too complex for non-experts
  • experts difficult to target and engage
• Approach:
  • simplify the task by suggesting potential answers
  • help non-experts to learn by providing expert feedback
3. Chosen Task
Subject type annotation of paintings:
• about 100 different subject types in the Art & Architecture Thesaurus
• subtle differences: Marine? Seascape? History painting? Kacho?
5. Research Questions
Non-experts:
• How well do they compare with experts, both individually and as a crowd?
• Do they memorize the correct subject type?
• Can they generalize what they have learned to new paintings?
Task:
• How does the presence or absence of a correct answer influence a user's performance?
Data:
• Are there features of images or subject types that can predict high or low agreement?
Game / UI / backend system:
• none
6. Experimental Setup - Data
• Subset of the Steve Tagger data set: 125 images of paintings with subject type annotations by 4 experts from the Rijksmuseum Amsterdam (168 expert annotations in total)
• Limitation: only one expert per painting
7. Experimental Setup - Query Image Selection
• first 10 queries: random images, no repetition
• after 10: random, but with a 50% probability of repeating an earlier image (see the sketch below)
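The slides give only the sampling rule, not its implementation. Below is a minimal Python sketch of that rule; the function name next_query_image and the all_images/shown parameters are illustrative assumptions, not names from the original system.

```python
import random

def next_query_image(all_images, shown, rng=random):
    """Pick the next painting to show, per the slide's rule:
    the first 10 queries are random without repetition; from the
    11th query on, an already-shown image is repeated with
    probability 50%, otherwise a new one is drawn.
    Assumes there is always at least one unseen image left."""
    unseen = [img for img in all_images if img not in shown]
    if len(shown) >= 10 and rng.random() < 0.5:
        choice = rng.choice(shown)   # 50% chance: repeat an earlier image
    else:
        choice = rng.choice(unseen)  # otherwise: a fresh, unseen image
    shown.append(choice)
    return choice
```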
8. Experimental Setup - Candidate Selection
• 5 candidates in total, plus "other":
  • perfect condition: 1 correct candidate
  • imperfect condition: 1 correct candidate (75% of cases) or 1 related but incorrect candidate (25%)
  • at most 3 related but incorrect candidates
  • at least 1 (unrelated) incorrect candidate
(A sketch of this assembly follows below.)
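A sketch of how the five candidates plus "other" might be assembled under the two conditions. The name build_candidates and the pools related/unrelated are assumptions for illustration; the authors' exact mixing procedure may differ.

```python
import random

def build_candidates(correct, related, unrelated, condition, rng=random):
    """Assemble the 5 answer candidates for one painting (plus "other").

    perfect:   the correct subject type is always included.
    imperfect: the correct type is included in 75% of cases;
               otherwise a related but incorrect type stands in.
    The remaining slots hold at most 3 related distractors and
    at least 1 unrelated incorrect type.
    Assumes `related` and `unrelated` hold enough subject types.
    """
    related, unrelated = list(related), list(unrelated)
    rng.shuffle(related)
    rng.shuffle(unrelated)

    if condition == "perfect" or rng.random() < 0.75:
        candidates = [correct]
    else:
        candidates = [related.pop()]       # related stand-in for the correct type

    candidates += related[:3]              # at most 3 related distractors
    candidates.append(unrelated.pop())     # at least 1 unrelated incorrect type
    while len(candidates) < 5:             # pad with more unrelated types if needed
        candidates.append(unrelated.pop())
    rng.shuffle(candidates)
    return candidates + ["other"]
```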
22. Learning: Generalizing
[Plot: percentage of correct annotations vs. sequence number of new images, binned from [1,20] to (360,380], shown separately for the perfect and imperfect conditions]
23. Conclusions
Agreement between experts and non-experts:
• substantial in the perfect condition, moderate in the imperfect condition (see the kappa sketch after this slide)
• aggregation reduces deviations
Strong disagreement may indicate:
• a need for additional metadata
• incorrect or incomplete expert judgements
Learning:
• users memorize and generalize
• users need a training phase on high-quality data
Results are in line with: He, J., van Ossenbruggen, J., de Vries, A.P.: Do you need experts in the crowd? A case study in image annotation for marine biology. OAIR 2013.
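The "substantial" / "moderate" wording above matches the Landis & Koch labels commonly attached to Cohen's kappa (0.41-0.60 moderate, 0.61-0.80 substantial). Assuming kappa is the agreement measure behind these labels, here is a minimal sketch of how expert-vs-crowd agreement could be computed:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two label sequences,
    e.g. one expert's and the aggregated crowd's subject types."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # expected agreement if both annotators labelled at random,
    # each with their own observed label frequencies
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

# e.g. cohens_kappa(expert_types, crowd_types) -> a value in
# 0.61-0.80 reads as "substantial" on the Landis & Koch scale
```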
24. Future Work
• Experts: - scarce; + high quality data
• Non-experts: + high quantity data; - lack expertise; - need training data; + high quality when trained & assisted
25. Thank you for your attention!
Myriam C. Traub
myriam.traub@cwi.nl
http://sealincmedia.project.cwi.nl/artgame/
[Image: On the beach of Trouville, Eugène Boudin]