INSEMTIVES @ SWAT4LS 2012


Invited talk at the SWAT4LS Summer School, Aveiro, Portugal, May 2012


  1. Using crowdsourcing for Semantic Web applications and tools
     Elena Simperl, Karlsruhe Institute of Technology, Germany
     Talk at the SWAT4LS Summer School, Aveiro, Portugal, May 2012
  2. Semantic technologies are mainly about automation…
     • …but many tasks in semantic content authoring fundamentally rely on human input
       – Modeling a domain
       – Understanding text and media content (in all their forms and languages)
       – Integrating data sources originating from different contexts
  3. Incentives and motivators
     • What motivates people to engage with an application?
     • Which rewards are effective, and when?
     • Motivation is the driving force that makes humans achieve their goals
     • Incentives are ‘rewards’ assigned by an external ‘judge’ to a performer for undertaking a specific task
       – Common belief (among economists): incentives can be translated into a sum of money for all practical purposes
     • Incentives can be related to extrinsic and intrinsic motivations
  4. Examples of applications
  5. Incentives and motivators (2)
     • Successful volunteer crowdsourcing is difficult to predict or replicate
       – Highly context-specific
       – Not applicable to arbitrary tasks
     • Reward models are often easier to study and control (if performance can be reliably measured)
       – Different models: pay-per-time, pay-per-unit, winner-takes-all…
       – Not always easy to abstract away from social aspects (free-riding, social pressure…)
       – May undermine intrinsic motivation
  6. Examples (2)
     Mason & Watts: Financial incentives and the “performance of crowds”, HCOMP 2009.
  7. Amazon‘s Mechanical Turk
     • Successfully applied to transcription, classification, content generation, data collection, image tagging, website feedback, usability tests…*
     • Increasingly used by academia for evaluation purposes
     • Extensions for quality assurance, complex workflows, resource management, vertical domains…*
  8. What tasks can be (microtask-)crowdsourced?
     • Best case
       – Routine work requiring common knowledge, decomposable into simpler, independent sub-tasks, performance easily measurable, no spam
     • Ongoing research in task design, quality assurance, estimated time of completion…
     • Example: open-scale tasks in MTurk
       – Generate, then vote (see the sketch below)
       – Introduce random noise to identify potential issues in the second step
       – Workflow: generate an answer (“label image or not?”) → vote on the correct answers
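
     A minimal sketch of the generate-then-vote pattern (all names and data are illustrative, not from the talk): generated answers plus injected random-noise options go to a voting step, and workers who pick a noise option are flagged as unreliable.

       import random
       from collections import Counter

       def build_vote_options(generated_answers, noise_pool, n_noise=1):
           # Mix randomly chosen "noise" options into the generated answers;
           # workers who vote for them are likely guessing or spamming.
           options = list(generated_answers) + random.sample(noise_pool, n_noise)
           random.shuffle(options)
           return options

       def aggregate_votes(votes, noise_options):
           # votes: list of (worker_id, chosen_option) pairs
           counts = Counter(choice for _, choice in votes)
           flagged = {worker for worker, choice in votes if choice in noise_options}
           best, _ = counts.most_common(1)[0]
           return best, flagged

       # Example: image-labeling task in two steps
       generated = ["zebra", "horse"]                       # step 1: free-text answers
       options = build_vote_options(generated, noise_pool=["toaster", "cloud"])
       votes = [("w1", "zebra"), ("w2", "zebra"), ("w3", "toaster")]
       label, flagged = aggregate_votes(votes, noise_options={"toaster", "cloud"})
       print(label, flagged)                                # -> zebra {'w3'}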
  9. Examples (2)
  10. GWAPs and gamification
      • GWAPs: human computation disguised as casual games
      • Gamification/game mechanics: integrating game elements into applications*
        – Accelerated feedback cycles
          • Annual performance appraisals vs immediate feedback to maintain engagement
        – Clear goals and rules of play
          • Players feel empowered to achieve goals vs the fuzzy, complex systems of rules of the real world
        – Compelling narrative
          • Gamification builds a narrative that engages players to participate and achieve the goals of the activity
        – But in the end it’s about what tasks users want to get better at*
  11. What tasks can be gamified?*
      • Work is decomposable into simpler tasks
      • Tasks are nested
      • Performance is measurable
      • One can define an obvious rewarding scheme
      • Skills can be arranged in a smooth learning curve*
  12. What is different about semantic systems?
      • It‘s still about the context of the actual application
      • Users need to engage with semantic tasks to
        – Ensure knowledge is relevant and up-to-date
        – Make sure people accept the new solution and understand its benefits
        – Avoid cold-start problems
        – Optimize maintenance costs
  13. What do you want your users to do?
      • Semantic applications
        – Context of the actual application
        – Need to involve users in knowledge engineering tasks?
          • Incentives are related to organizational and social factors
          • Seamless integration of new features
      • Semantic tools
        – Game mechanics
        – Paid crowdsourcing (possibly integrated with the tool)
      • Using the results of casual games
  14. Crowdsourcing knowledge engineering
      • The granularity of activities is typically too high
      • Further splitting is needed
      • Crowdsource very specific tasks that are (highly) divisible
        – Labeling (in different languages)
        – Finding relationships
        – Populating the ontology
        – Aligning and interlinking
        – Ontology-based annotation
        – Validating the results of automatic methods
        – …
  15. Example: ontology building
  16. Example: relationship finding
  17. Example: video annotation
  18. Example: ontology alignment
  19. Example: ontology evaluation
  20. OntoGame API
      • An API providing several methods that are shared by the OntoGame games, such as
        – Different agreement types (e.g. selection agreement)
        – Input matching (e.g. majority; see the sketch below)
        – Game modes (multi-player, single-player)
        – Player reliability evaluation
        – Player matching (e.g. finding the optimal partner to play with)
        – Resource management (i.e. the data needed for the games)
        – Creating semantic content
      • eric-gaming-toolkit
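
      A minimal sketch of majority-based input matching and a simple player-reliability score (illustrative only; these are not the actual OntoGame API methods or signatures):

        from collections import Counter

        def majority_match(inputs, threshold=0.5):
            # inputs: list of (player_id, answer); agreement is reached when one
            # answer gets more than the given fraction of all submitted answers.
            counts = Counter(answer for _, answer in inputs)
            answer, votes = counts.most_common(1)[0]
            return answer if votes / len(inputs) > threshold else None

        def update_reliability(reliability, player_id, agreed):
            # Very simple reliability model: exponential moving average of how
            # often a player's input matched the agreed answer.
            previous = reliability.get(player_id, 0.5)
            reliability[player_id] = 0.9 * previous + 0.1 * (1.0 if agreed else 0.0)

        inputs = [("p1", "subClassOf"), ("p2", "subClassOf"), ("p3", "partOf")]
        agreed = majority_match(inputs)              # -> "subClassOf"
        reliability = {}
        for player, answer in inputs:
            update_reliability(reliability, player, answer == agreed)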
  21. Lessons learned
      • The approach is feasible for mainstream domains, where a knowledge corpus is available
      • The approach is by design less applicable to Semantic Web tasks
        – Knowledge-intensive tasks are not easily nestable
        – Repetitive tasks → players‘ retention?
      • The knowledge corpus has to be large enough to allow for a rich game experience
        – But you need a critical mass of players to validate the results
      • Advertisement is essential
      • Game design vs useful content
        – Reusing well-known game paradigms
        – Reusing game outcomes and integrating them into existing workflows and tools
      • Cost-benefit analysis
  22. General guidelines
      • Focus on the actual goal and incentivize related actions
        – Write posts, create graphics, annotate pictures, reply to customers in a given time…
      • Build a community around the intended actions
        – Reward helping each other in performing the task and interaction
        – Reward recruiting new contributors
      • Reward repeated actions
        – Actions become part of the daily routine
  23. Games vs Mechanical Turk
  24. Combining human and computational intelligence
      Give me the German names of all commercial airports in Baden-Württemberg, ordered by their most informative description.
      „Retrieve the labels in German of commercial airports located in Baden-Württemberg, ordered by the better human-readable description of the airport given in the comment.“
      • This query cannot be optimally answered automatically
        – Incorrect/missing classification of entities (e.g. entities classified as airports instead of commercial airports)
        – Missing information in the data sets (e.g. German labels)
        – Subjective operations (e.g. comparisons of pictures or natural-language comments) cannot be performed optimally
  25. What tasks should be crowdsourced?
      „Retrieve the labels in German of commercial airports located in Baden-Württemberg, ordered by the better human-readable description of the airport given in the comment.“
      SPARQL query, with the crowdsourcing candidates marked 1–4:

        SELECT ?label WHERE {
          ?x a metar:CommercialHubAirport ;                      # 1 Classification
             rdfs:label ?label ;
             rdfs:comment ?comment .
          ?x geonames:parentFeature ?z .
          ?z owl:sameAs <> .                                     # 2 Identity resolution
          FILTER (LANG(?label) = "de")                           # 3 Missing information
        }
        ORDER BY CROWD(?comment, "Better description of %x")     # 4 Ordering
  26. Crowdsourced query processing
      • Extensions to VoID and SPARQL
      • Formal, declarative description of data and tasks, using SPARQL patterns as the basis for the automatic design of HITs
      • Hybrid query processing (adaptive techniques, caching, semantically driven task design)
  27. HITs design: Classification
      • It is not always possible to automatically infer the classification from the properties
      • Example: retrieve the names (labels) of METAR stations that correspond to commercial airports (a sketch of the corresponding HIT follows below)

        SELECT ?label WHERE {
          ?station a metar:CommercialHubAirport ;
                   rdfs:label ?label .
        }

        Input:  { ?station a metar:Station ;
                           rdfs:label ?label ;
                           wgs84:lat ?lat ;
                           wgs84:long ?long }
        Output: { ?station a ?type .
                  ?type rdfs:subClassOf metar:Station }
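
      A minimal sketch of how bindings matching the input pattern could be turned into a classification HIT (the candidate types, station values, and function are illustrative assumptions, not the system's actual implementation):

        # Assumed subclasses of metar:Station, offered as answer options.
        CANDIDATE_TYPES = ["metar:CommercialHubAirport", "metar:SmallAirport", "metar:Heliport"]

        def classification_hit(binding):
            # binding: dict with the ?station, ?label, ?lat, ?long variables
            # from the input pattern of the HIT description.
            question = (f"The METAR station '{binding['label']}' is located at "
                        f"({binding['lat']}, {binding['long']}). What type of station is it?")
            return {"question": question, "options": CANDIDATE_TYPES}

        hit = classification_hit({"station": "metar:EDDS", "label": "Stuttgart Airport",
                                  "lat": "48.69", "long": "9.22"})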
  28. HITs design: Ordering
      • Orderings defined via less straightforward built-ins, for instance the ordering of pictorial representations of entities
      • SPARQL extension: ORDER BY CROWD
      • Example: retrieve all airports and their pictures; the pictures should be ordered according to the most representative image of the given airport (a sketch of the aggregation follows below)

        SELECT ?airport ?picture WHERE {
          ?airport a metar:Airport ;
                   foaf:depiction ?picture .
        }
        ORDER BY CROWD(?picture, "Most representative image for %airport")

        Input:  { ?airport foaf:depiction ?x, ?y }
        Output: { { (?x ?y) a rdf:List } UNION { (?y ?x) a rdf:List } }
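
      A minimal sketch of turning pairwise crowd comparisons into a final ordering by counting wins (the helper names and the stubbed crowd answer are illustrative assumptions, not the actual query engine):

        from collections import defaultdict
        from itertools import combinations

        def order_by_crowd(items, ask_crowd):
            # ask_crowd(a, b) returns the item preferred by the majority of a
            # pairwise HIT such as "Most representative image for %airport".
            wins = defaultdict(int)
            for a, b in combinations(items, 2):
                wins[ask_crowd(a, b)] += 1
            # Items with more pairwise wins come first in the result.
            return sorted(items, key=lambda item: wins[item], reverse=True)

        pictures = ["edds_1.jpg", "edds_2.jpg", "edds_3.jpg"]
        ordering = order_by_crowd(pictures, ask_crowd=lambda a, b: min(a, b))  # stubbed crowd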
  29. Challenges
      • Appropriate level of granularity in HIT design for specific SPARQL constructs
      • Caching (a sketch follows below)
        – Naively, we can materialize HIT results into datasets
        – How to deal with partial coverage and dynamic datasets
      • Optimal user interfaces for graph-like content
      • Pricing and workers’ assignment
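
      A minimal sketch of materializing HIT results into a cache keyed by the crowdsourced pattern, with a time-to-live so that answers over dynamic datasets eventually expire (illustrative only; not part of the described system):

        import time

        class HitResultCache:
            def __init__(self, ttl_seconds=7 * 24 * 3600):
                self.ttl = ttl_seconds
                self.store = {}

            def _key(self, pattern, bindings):
                # Key a result by the SPARQL pattern and its concrete bindings.
                return (pattern, tuple(sorted(bindings.items())))

            def get(self, pattern, bindings):
                entry = self.store.get(self._key(pattern, bindings))
                if entry and time.time() - entry[1] < self.ttl:
                    return entry[0]
                return None   # cache miss or stale result: re-issue the HIT

            def put(self, pattern, bindings, answer):
                self.store[self._key(pattern, bindings)] = (answer, time.time())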
  30. Thank you
      e: , t: @esimperl
      Publications available at
      Team: Maribel Acosta, Barry Norton, Katharina Siorpaes, Stefan Thaler, Stephan Wölger and many others