WebSci2013 Harnessing Disagreement in Crowdsourcing

Usage Rights

CC Attribution-NonCommercial-ShareAlike License


WebSci2013 Harnessing Disagreement in Crowdsourcing Presentation Transcript

  • 1. Chris Welty, Lora Aroyo. Crowd Truth for Cognitive Computing
       Crowd Truth
       Harnessing Disagreement in Crowdsourcing
       Gathering gold standard annotations for relation extraction
  • 2. Gold Standard Assumption
      • typically in cognitive systems
      • for each annotated instance there is a single right answer
      • gold standard quality can be measured in inter-annotator agreement
       Let them disagree?
  • 3. Hypothesis
       Annotator disagreement is not noise, but signal.
       Not a problem to overcome but a source of information for machines.
       Artificially restricting humans does not help machines to learn. They will learn better from diversity.
  • 4. Position
       Disagreement is a sign of intrinsic vagueness & ambiguity in human understanding.
  • 5. Approach Principles
       1. Tolerate, capture & exploit disagreement
       2. Understand it by a space of possibilities (frequencies & similarities)
       3. Score the machine output based on where it falls in this space
       4. Adapt to new annotation tasks
  • 6. Relation Extraction: crowdsourcing gold standard data
      • Relations overlap in meaning
      • Sentences are vague and ambiguous
      • Experts have different interpretations
  • 7. (No slide text in the transcript.)
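  • 8. Example 1: "Feeling the way the CHEST expands (PALPATION), can identify areas of the lung that are full of fluid."
       Question: Is CHEST related to PALPATION?
       Relation labels visible on the slide: diagnose, location, associated with, is_a, other, part_of
       Worker vote counts (as transcribed; the alignment with the relation labels was lost): 0 0 02 3 0 0 0 1 0 0 44 1
       Example 2: "Redness (HYPERAEMIA), irritation (chemosis) and watering (epiphora) of the eyes are symptoms common to all forms of CONJUNCTIVITIS."
       Question: Is HYPERAEMIA related to CONJUNCTIVITIS?
       Relation labels visible on the slide: symptom, cause
       Worker vote counts (as transcribed; the alignment with the relation labels was lost): 0 0 0 1 0 0 0 013 0 0 0 0 0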
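The per-relation vote counts on slide 8 can be read as a sentence vector: one count per relation, tallied over all workers who annotated the sentence. The following is a minimal sketch of that tallying step, not the authors' implementation. The relation list, its order, and the four workers' choices are hypothetical, since the transcript does not preserve which count belongs to which relation.

```python
from collections import Counter

# Hypothetical relation inventory; the order on the real slide is not recoverable.
RELATIONS = ["diagnose", "location", "associated_with", "is_a",
             "part_of", "symptom", "cause", "other"]


def sentence_vector(worker_choices, relations=RELATIONS):
    """Count how many workers picked each relation for one sentence."""
    counts = Counter()
    for chosen in worker_choices:
        # A worker may pick several relations, but votes once per relation.
        counts.update(set(chosen))
    return [counts[r] for r in relations]


# Hypothetical annotations from four workers for a single sentence.
choices = [{"symptom"}, {"symptom"}, {"symptom", "cause"}, {"is_a"}]
vec = sentence_vector(choices)
print(dict(zip(RELATIONS, vec)))
```

A vector dominated by one relation suggests the workers largely agree, while votes spread over several relations suggest a vague or ambiguous sentence.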
  • 9. (No slide text in the transcript.)
  • 10. Harnessing Disagreement
      • Sentence-relation score: core crowd truth metric for relation extraction, measured for each relation on each sentence as the cosine of the unit vector for the relation with the sentence vector
      • Sentence clarity: for each sentence, the max relation score for that sentence. If all the workers selected the same relation for a sentence, the max score is 1, indicating a clear sentence
      • Relation similarity: pairwise conditional probability that if relation Ri is annotated in a sentence, Rj is as well. Indicates how confusable the linguistic expressions of two relations are
      • Relation ambiguity: max relation similarity for a relation. If a relation is clear it has a low score
      • Relation clarity: max sentence-relation score for a relation over all sentences. If a relation has a high clarity score, it means that it is at least possible to express the relation clearly
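The five metric definitions on slide 10 can be sketched in a few lines of Python. This is an illustrative reading of the slide text, not the authors' code. It assumes each sentence is represented by a vector of per-relation worker vote counts, and that "Ri is annotated in a sentence" means at least one worker chose Ri. The function names and the three example vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sentence_relation_score(sent_vec, r):
    """Cosine of the sentence vector with the unit vector for relation r."""
    unit = [1 if i == r else 0 for i in range(len(sent_vec))]
    return cosine(sent_vec, unit)


def sentence_clarity(sent_vec):
    """Max sentence-relation score over all relations."""
    return max(sentence_relation_score(sent_vec, r)
               for r in range(len(sent_vec)))


def relation_similarity(sent_vecs, i, j):
    """Estimate P(Rj annotated | Ri annotated) over sentences (assumed reading)."""
    with_i = [v for v in sent_vecs if v[i] > 0]
    if not with_i:
        return 0.0
    return sum(1 for v in with_i if v[j] > 0) / len(with_i)


def relation_ambiguity(sent_vecs, i):
    """Max similarity of relation i to any other relation."""
    n = len(sent_vecs[0])
    return max(relation_similarity(sent_vecs, i, j)
               for j in range(n) if j != i)


def relation_clarity(sent_vecs, r):
    """Max sentence-relation score for relation r over all sentences."""
    return max(sentence_relation_score(v, r) for v in sent_vecs)


# Hypothetical vote counts for three sentences over three relations.
sentences = [[0, 13, 1], [4, 4, 2], [5, 0, 0]]
print(sentence_clarity(sentences[2]))         # all workers agree
print(relation_similarity(sentences, 0, 1))
print(relation_clarity(sentences, 0))
```

Under these assumptions, a sentence where every worker picks the same relation gets clarity 1, matching the slide's description, and relation similarity is asymmetric because it is a conditional probability.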
  • 11. The Dark Side of Crowdsourcing Disagreement
      • spammers generate disagreement for the wrong reasons
      • most spam detection requires gold standard
      • Worker-sentence disagreement: the average of all the cosines between each worker’s sentence vector and the full sentence vector (minus that worker). Indicates how much a worker disagrees with the crowd on a sentence basis
      • Worker-worker disagreement: a pairwise confusion matrix between workers and the average agreement across the matrix for each worker. Indicates whether there are consistently like-minded workers
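The two worker metrics on slide 11 can be sketched as follows. This is a hedged illustration rather than the authors' method. It assumes annotations are stored as a nested dict of worker id to sentence id to a binary per-relation vector, and it assumes pairwise worker agreement is the average cosine over the sentences two workers share, since the slide does not spell out how the pairwise matrix is filled. All names and data are hypothetical. Because both quantities are average cosines, a lower value corresponds to more disagreement.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def worker_sentence_score(annotations, worker):
    """Average cosine between a worker's vectors and the rest of the crowd."""
    cosines = []
    for sid, wvec in annotations[worker].items():
        # Sentence vector built from every other worker's annotation.
        rest = [0] * len(wvec)
        for other, sents in annotations.items():
            if other != worker and sid in sents:
                rest = [a + b for a, b in zip(rest, sents[sid])]
        cosines.append(cosine(wvec, rest))
    return sum(cosines) / len(cosines) if cosines else 0.0


def pairwise_agreement(annotations, a, b):
    """Average cosine over sentences both workers annotated (assumed measure)."""
    shared = annotations[a].keys() & annotations[b].keys()
    if not shared:
        return None
    return sum(cosine(annotations[a][s], annotations[b][s])
               for s in shared) / len(shared)


def worker_worker_score(annotations, worker):
    """Average pairwise agreement of one worker with all other workers."""
    values = [pairwise_agreement(annotations, worker, other)
              for other in annotations if other != worker]
    values = [v for v in values if v is not None]
    return sum(values) / len(values) if values else 0.0


# Hypothetical data: w1 and w2 agree; w3 picks different relations.
annotations = {
    "w1": {"s1": [1, 0, 0], "s2": [0, 1, 0]},
    "w2": {"s1": [1, 0, 0], "s2": [0, 1, 0]},
    "w3": {"s1": [0, 0, 1], "s2": [1, 0, 0]},
}
for w in annotations:
    print(w, worker_sentence_score(annotations, w),
          worker_worker_score(annotations, w))
```

In this toy data, w3 scores zero on both measures, which is the kind of pattern that could flag a worker for review without needing gold standard answers.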
  • 12. Questions?