Crowdsourcing is for the tail
Gianluca Demartini
eXascale Infolab
University of Fribourg, Switzerland

Talk given at the Dagstuhl Seminar 14282 "Crowdsourcing and the Semantic Web"


  1. Crowdsourcing is for the tail
     Gianluca Demartini
     eXascale Infolab, University of Fribourg, Switzerland
     gianlucademartini.net, exascale.info
  2. Crowdsourced Data Curation
     • Enforce quality and coverage in KBs
     • Curate the structured representation of tail entities
     • Leveraging the diversity of the crowd
     • Targeted Crowdsourcing
  3. The long tail of entity popularity
  4. Tail Entities
     • Local restaurants
     • Niche sports domains (chess, cricket)
     • Emerging music bands
     • Rare diseases
  5. Improving Crowdsourcing Platforms
  6. Push Crowdsourcing
     • Pick-A-Crowd: a system architecture that uses Task-to-Worker matching based on:
       – The worker’s social profile
       – The task context
     • Workers can provide higher quality answers on tasks they relate to
     Djellel Eddine Difallah, Gianluca Demartini, and Philippe Cudré-Mauroux. Pick-A-Crowd: Tell Me What You Like, and I'll Tell You What to Do. In: 22nd International Conference on World Wide Web (WWW 2013), Rio de Janeiro, Brazil, May 2013.
  7. Pick-A-Crowd
  8. Discussion
     • Task-to-Worker recommendation / Matchmaking
     • Experimental comparison with AMT shows a consistent quality improvement
     “Workers Know what they Like”
  9. OpenTurk
     • Yet another platform? Build on top of MTurk!
     • Chrome Extension for push / notification
     • 400+ users
     • http://bit.ly/openturk-extension
     • Open source: https://github.com/openturk/extension
  10. Transactive Search
  11. Transactive Search
     • Transactive Memories
     • Transactive Search:
       – Memory reconstructed by a group of people
       – Need to target the right people
       – A form of Targeted Crowdsourcing
     • “Who attended the ISWC 2013 conference?”
  12. Transactive Search
     • Machines: Harvest the Web + Data Mining
     • Crowd: Search Twitter, look at event pictures
     • Transactive Memories: Remember who I met
     Michele Catasta, Alberto Tonon, Djellel Eddine Difallah, Gianluca Demartini, Karl Aberer, and Philippe Cudré-Mauroux. Hippocampus: Answering Memory Queries using Transactive Search. In: 23rd International Conference on World Wide Web (WWW 2014), Web Science Track. Seoul, South Korea, April 2014.
  13. Who attended ISWC 2013?
  14. Conclusions
     • Crowdsourcing For Tail Entities
     • Focusing on the difficult part of the KB
       – The tail is long!
     • Challenges
       – Which tail entities are valuable?
       – Who is the right worker?
       – Focus on passion rather than monetary incentives
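
The Task-to-Worker matching idea from the Push Crowdsourcing slide (match on the worker's social profile and the task context) can be illustrated with a small sketch. This is not the Pick-A-Crowd implementation: the `Worker` and `Task` types, the use of topic sets as stand-ins for a social profile and a task context, and the choice of Jaccard similarity as the matching score are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    worker_id: str
    liked_topics: frozenset  # hypothetical stand-in for a social profile


@dataclass(frozen=True)
class Task:
    task_id: str
    topics: frozenset  # hypothetical stand-in for the task context


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pick_crowd(task, workers, k=2):
    """Return the k workers whose interests best overlap the task topics.

    Ties are broken by worker_id so the ranking is deterministic.
    """
    ranked = sorted(
        workers,
        key=lambda w: (-jaccard(w.liked_topics, task.topics), w.worker_id),
    )
    return ranked[:k]


# Toy data (invented for this example): a task about a tail sports domain.
workers = [
    Worker("w1", frozenset({"chess", "jazz"})),
    Worker("w2", frozenset({"cricket", "chess", "football"})),
    Worker("w3", frozenset({"cooking"})),
]
task = Task("t1", frozenset({"chess", "cricket"}))

# Push the task to the two best-matching workers.
selected = [w.worker_id for w in pick_crowd(task, workers)]
print(selected)
```

In a push setting, the task would be offered only to the selected workers instead of being listed for anyone to pull. Any other similarity measure between profile and task context could replace the Jaccard score here.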
