WebSci2013 Harnessing Disagreement in Crowdsourcing

Transcript

  • 1. Chris Welty and Lora Aroyo, Crowd Truth for Cognitive Computing. Crowd Truth: Harnessing Disagreement in Crowdsourcing. Gathering gold standard annotations for relation extraction.
  • 2. Gold Standard Assumption
      • typically in cognitive systems
      • for each annotated instance there is a single right answer
      • gold standard quality can be measured in inter-annotator agreement
    Let them disagree?
  • 3. Hypothesis: Annotator disagreement is not noise, but signal. Not a problem to overcome but a source of information for machines. Artificially restricting humans does not help machines to learn. They will learn better from diversity.
  • 4. Position: disagreement is a sign of intrinsic vagueness & ambiguity in human understanding.
  • 5. Approach Principles
      1. Tolerate, capture & exploit disagreement
      2. Understand it by a space of possibilities (frequencies & similarities)
      3. Score the machine output based on where it falls in this space
      4. Adapt to new annotation tasks
  • 6. Relation Extraction: crowdsourcing gold standard data
      • Relations overlap in meaning
      • Sentences are vague and ambiguous
      • Experts have different interpretations
  • 8. Sentence: "Feeling the way the CHEST expands (PALPATION), can identify areas of the lung that are full of fluid."
      • Question: Is CHEST related to PALPATION?
      • Relation labels shown: diagnose, location, associated with, is_a, other, part_of
      • Worker vote counts as extracted (column alignment lost): 0 0 02 3 0 0 0 1 0 0 44 1
    Sentence: "Redness (HYPERAEMIA), irritation (chemosis) and watering (epiphora) of the eyes are symptoms common to all forms of CONJUNCTIVITIS."
      • Question: Is HYPERAEMIA related to CONJUNCTIVITIS?
      • Relation labels shown: symptom, cause
      • Worker vote counts as extracted (column alignment lost): 0 0 0 1 0 0 0 013 0 0 0 0 0
  • 10. Harnessing Disagreement
      • Sentence-relation score: the core crowd truth metric for relation extraction, measured for each relation on each sentence as the cosine between the unit vector for the relation and the sentence vector.
      • Sentence clarity: for each sentence, the max relation score for that sentence. If all the workers selected the same relation for a sentence, the max score is 1, indicating a clear sentence.
      • Relation similarity: the pairwise conditional probability that if relation Ri is annotated in a sentence, Rj is as well. Indicates how confusable the linguistic expressions of two relations are.
      • Relation ambiguity: the max relation similarity for a relation. If a relation is clear, it has a low score.
      • Relation clarity: the max sentence-relation score for a relation over all sentences. If a relation has a high clarity score, it is at least possible to express the relation clearly.
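The metrics on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The relation names, the toy `ANNOTATIONS` data, and all function names are hypothetical. It also assumes that each worker selects a set of relations per sentence, that the sentence vector is the per-relation sum of worker selections, and that relation similarity is estimated over individual worker annotations; the slides do not specify these details.

```python
import math

# Hypothetical relation inventory and toy data (not from the talk).
RELATIONS = ["treats", "causes", "symptom"]

# ANNOTATIONS[sentence_id][worker_id] = set of relations the worker selected
ANNOTATIONS = {
    "s1": {"w1": {"causes"}, "w2": {"causes"}, "w3": {"causes"}},
    "s2": {"w1": {"causes", "symptom"}, "w2": {"symptom"}, "w3": {"treats"}},
}


def sentence_vector(workers):
    """Count how many workers selected each relation for one sentence."""
    return [sum(1 for rels in workers.values() if r in rels) for r in RELATIONS]


def sentence_relation_scores(workers):
    """Cosine between each relation's unit vector and the sentence vector.

    For a unit vector along relation r this reduces to count[r] / ||vector||.
    """
    vec = sentence_vector(workers)
    norm = math.sqrt(sum(c * c for c in vec))
    return {r: (c / norm if norm else 0.0) for r, c in zip(RELATIONS, vec)}


def sentence_clarity(workers):
    """Max sentence-relation score for the sentence."""
    return max(sentence_relation_scores(workers).values())


def relation_similarity(annotations, ri, rj):
    """Estimate P(rj selected | ri selected) over worker annotations (assumed)."""
    with_ri = both = 0
    for workers in annotations.values():
        for rels in workers.values():
            if ri in rels:
                with_ri += 1
                both += rj in rels
    return both / with_ri if with_ri else 0.0


def relation_ambiguity(annotations, ri):
    """Max similarity of ri to any other relation."""
    return max(relation_similarity(annotations, ri, rj)
               for rj in RELATIONS if rj != ri)


def relation_clarity(annotations, r):
    """Max sentence-relation score for r over all sentences."""
    return max(sentence_relation_scores(w)[r] for w in annotations.values())
```

In the toy data all three workers pick the same relation for `s1`, so its clarity is 1, while the split votes on `s2` give a lower clarity.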
  • 11. The Dark Side of Crowdsourcing Disagreement
      • spammers generate disagreement for the wrong reasons
      • most spam detection requires a gold standard
      • Worker-sentence disagreement: the average of all the cosines between each worker's sentence vector and the full sentence vector (minus that worker). Indicates how much a worker disagrees with the crowd on a sentence basis.
      • Worker-worker disagreement: a pairwise confusion matrix between workers and the average agreement across the matrix for each worker. Indicates whether there are consistently like-minded workers.
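The two worker metrics can be sketched in Python in the same spirit. This is an illustration under stated assumptions, not the authors' code. The toy data and every function name are hypothetical. The pairwise matrix here uses the average cosine between two workers' vectors on the sentences they share, which is only one plausible reading of "pairwise confusion matrix". Note that the computed values are cosines, so a low value signals high disagreement with the crowd.

```python
import math

# Hypothetical relation inventory and toy data (not from the talk).
RELATIONS = ["treats", "causes", "symptom"]

# ANNOTATIONS[sentence_id][worker_id] = set of relations the worker selected
ANNOTATIONS = {
    "s1": {"w1": {"causes"}, "w2": {"causes"}, "w3": {"treats"}},
    "s2": {"w1": {"symptom"}, "w2": {"symptom"}, "w3": {"treats"}},
}


def to_vector(rels):
    """Binary vector over RELATIONS for one worker's selection."""
    return [1 if r in rels else 0 for r in RELATIONS]


def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def worker_sentence_score(annotations, worker):
    """Average cosine between the worker's vector and the rest of the crowd."""
    cosines = []
    for workers in annotations.values():
        if worker not in workers:
            continue
        own = to_vector(workers[worker])
        rest = [0] * len(RELATIONS)
        for other, rels in workers.items():
            if other != worker:  # sentence vector minus this worker
                rest = [r + v for r, v in zip(rest, to_vector(rels))]
        cosines.append(cosine(own, rest))
    return sum(cosines) / len(cosines) if cosines else 0.0


def worker_worker_matrix(annotations):
    """Pairwise average cosine on shared sentences (assumed agreement measure)."""
    workers = sorted({w for ws in annotations.values() for w in ws})
    matrix = {}
    for a in workers:
        for b in workers:
            if a == b:
                continue
            shared = [cosine(to_vector(ws[a]), to_vector(ws[b]))
                      for ws in annotations.values() if a in ws and b in ws]
            matrix[(a, b)] = sum(shared) / len(shared) if shared else 0.0
    return matrix


def average_agreement(matrix, worker):
    """Mean of one worker's row in the pairwise matrix."""
    row = [v for (a, _), v in matrix.items() if a == worker]
    return sum(row) / len(row) if row else 0.0
```

In the toy data `w3` always picks the same relation regardless of the sentence, which is the pattern a spammer might show, and its scores against the crowd and against every other worker are 0.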
  • 12. Questions?
