Crowdsourcing Ambiguity-Aware
Ground Truth
Chris Welty, Anca Dumitrache, Oana Inel,
Benjamin Timmermans, Lora Aroyo
June 15th, 2017
Collective Intelligence Conference
Crowdsourcing
Myth: Disagreement
is Bad
• traditionally, disagreement is
considered a measure of poor
quality in the annotation task
because:
– task is poorly defined or
– annotators lack training
• rather than accepting
disagreement as a natural
property of semantic
interpretation
This makes the elimination of
disagreement a goal
What if it is GOOD?
Crowdsourcing
Myth: All Examples
Are Created Equal
• typically annotators are asked whether a
binary property holds for each example
• often not given a chance to say that the
property may partially hold, or holds
but is not clearly expressed
• mathematics of using ground truth
treats every example the same – it either
matches the correct result or not
• poor quality examples tend to generate
high disagreement
• disagreement allows us to weight
sentences, giving us the ability to both
train and evaluate a machine in a more
flexible way
What if they are DIFFERENT?
Related Work on Annotating Ambiguity
http://CrowdTruth.org
● Jurgens (2013): For word-sense disambiguation, the crowd with
ambiguity modeling was able to achieve expert-level quality of
annotations.
● Cheatham et al. (2014): Current benchmarks in ontology alignment
and evaluation are not designed to model uncertainty caused by
disagreement between annotators, both expert and crowd.
● Plank et al. (2014): For part-of-speech tagging, most inter-annotator
disagreement pointed to debatable cases in linguistic theory.
● Chang et al. (2017): In a workflow of tasks for collecting and
correcting labels for text and images, found that many ambiguous
cases cannot be resolved by better annotation guidelines or through
worker quality control.
CrowdTruth
• Annotator disagreement
is signal, not noise.
• It is indicative of the
variation in human
semantic interpretation of
signs
• It can indicate ambiguity,
vagueness, similarity,
over-generality, etc.,
as well as quality
Medical Relation Extraction
1. workers select the medical relations that
hold between the given 2 terms in a
sentence
2. closed task - workers pick from given set of
14 top UMLS relations
3. 975 sentences from PubMed abstracts +
distant supervision for term pair extraction
4. 15 workers /sentence
Twitter Event Extraction
1. workers select the events expressed
in the tweet
2. closed task - workers pick from set of
8 events
3. 2,019 English tweets, crawled based
on event hashtags
4. 7 workers /tweet
(*) no expert annotators
News Events Identification
1. workers highlight words/phrases that
describe an event in the sentence
2. open-ended task - workers can pick any
words
3. 200 sentences from English TimeBank
corpus
4. 15 workers /sentence
Sound Interpretation
1. workers give tags that describe a
sound
2. open-ended task - workers can pick
any tag
3. 284 sounds from Freesound
database
4. 10 workers /sound
Medical Relation Extraction
Patients with ACUTE FEVER and nausea could be suffering from
INFLUENZA AH1N1
Is ACUTE FEVER – related to → INFLUENZA AH1N1?
Twitter Event Extraction
News Event Identification
Sound Interpretation
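Worker Vector - Closed Task
Medical Relation Extraction
[Figure: a worker vector has one position per relation, with a 1 at each relation the worker selected]
Media Unit Vector - Closed Task
[Figure: the worker vectors for one sentence, summed position by position into the media unit vector]
0 1 1 0 0 4 3 0 0 5 1 0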
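The construction on these two slides can be sketched in a few lines of Python. This is a minimal illustration, not the CrowdTruth implementation: the relation names and the three workers' selections are made up for the example, and only the step of summing worker vectors into a media unit vector follows the slides.

```python
from typing import Iterable, List

# Hypothetical annotation set; the task in the slides uses 14 UMLS relations.
RELATIONS = ["treats", "causes", "prevents", "symptom", "none"]


def worker_vector(selected: Iterable[str]) -> List[int]:
    """One position per relation: 1 if the worker picked it, else 0."""
    chosen = set(selected)
    return [1 if rel in chosen else 0 for rel in RELATIONS]


def media_unit_vector(worker_vectors: List[List[int]]) -> List[int]:
    """Sum the worker vectors position by position."""
    return [sum(column) for column in zip(*worker_vectors)]


# Made-up selections from three workers for one sentence.
workers = [
    worker_vector(["causes", "symptom"]),
    worker_vector(["symptom"]),
    worker_vector(["causes"]),
]
unit = media_unit_vector(workers)
print(unit)  # [0, 2, 0, 2, 0]
```

Each entry of the result counts how many workers selected that relation, which is the form of the vector 0 1 1 0 0 4 3 0 0 5 1 0 shown on the slide.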
Unclear relationship between the two arguments reflected
in the disagreement
Medical Relation Extraction
Clearly expressed relation between the two arguments reflected in
the agreement
Medical Relation Extraction
Twitter Event Extraction
The tension building between China, Japan, U.S., and Vietnam is
reaching new heights this week - let's stay tuned! #APAC
#powerstruggle
Which of the following EVENTS can you identify in the tweet?
4 0 0 0 6 0 0 0 2
(4: islands disputed between China and Japan; 6: anti China protests in Vietnam; 2: None)
Unclear description of events in tweet
reflected in the disagreement
Twitter Event Extraction
RT @jc_stubbs: Another tragic day in #Ukraine - More than 50
rebels killed as new leader unleashes assault:
http://t.co/wcfU3kyAFX
Which of the following EVENTS can you identify in the tweet?
0 6 0 0 0 0 0 0 1
(6: Ukraine crisis 2014; 1: None)
Clear description of events in tweet
reflected in the agreement
Media Unit Vector - Open Task
Worker 1: balloon exploding
Worker 2: gun
Worker 3: gunshot
Worker 4: loud noise, pop
Worker 5: gun, balloon
Media unit vector: balloon: 2, gun/gunshot: 3, loud noise: 1, pop: 1
+ clustering (e.g. word2vec)
Sound Interpretation
Multiple interpretations of the sound
reflected in the disagreement
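For an open-ended task the vector positions are not fixed in advance, so they have to be derived from the workers' tags. The sketch below rebuilds the counts from the balloon/gun example. The hand-written mapping `CLUSTER_OF` is an assumption standing in for the clustering step (the slide suggests word2vec), and counting each worker at most once per cluster is also an assumption of this sketch.

```python
from collections import Counter
from typing import Dict, List

# Hand-written stand-in for the clustering step; the slide suggests
# word2vec. This mapping is an assumption made for the example.
CLUSTER_OF: Dict[str, str] = {
    "balloon exploding": "balloon",
    "balloon": "balloon",
    "gun": "gun,gunshot",
    "gunshot": "gun,gunshot",
    "loud noise": "loudnoise",
    "pop": "pop",
}


def open_task_vector(worker_tags: List[List[str]]) -> Counter:
    """Count, per tag cluster, how many workers used a tag in it."""
    counts: Counter = Counter()
    for tags in worker_tags:
        # Assumption: a worker contributes at most once per cluster.
        clusters = {CLUSTER_OF.get(tag, tag) for tag in tags}
        counts.update(clusters)
    return counts


# Tags of the five workers from the slide.
worker_tags = [
    ["balloon exploding"],
    ["gun"],
    ["gunshot"],
    ["loud noise", "pop"],
    ["gun", "balloon"],
]
vector = open_task_vector(worker_tags)
print(sorted(vector.items()))
# [('balloon', 2), ('gun,gunshot', 3), ('loudnoise', 1), ('pop', 1)]
```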
Sound Interpretation
Worker 1: siren
Worker 2: police siren
Worker 3: siren, alarm
Worker 4: siren
Worker 5: annoying
Media unit vector: siren: 4, alarm: 1, annoying: 1
+ clustering (e.g. word2vec)
Clear meaning of the sound
reflected in the agreement
News Event Identification
Other than usage in business, Internet technology is also beginning
to infiltrate the lifestyle domain .
HIGHLIGHT the words/phrases that refer to an EVENT.
Other: 0, usage: 3, business: 1, Internet: 5, technology: 4, is: 2, also: 2, beginning: 4, infiltrate: 5, lifestyle: 3, domain: 3
Unclear specification of events in sentence
reflected in the disagreement
News Event Identification
Most of the city's monuments were destroyed including a
magnificent tiled mosque which dominated the skyline for centuries.
HIGHLIGHT the words/phrases that refer to an EVENT.
Most: 1, city: 3, monuments: 5, were: 5, destroyed: 12, including: 0, magnificent: 1, tiled: 1, mosque: 1, dominated: 4, skyline: 1, centuries: 0
Clear specification of events in sentence
reflected in the agreement
Media Unit - Annotation Score
Measures how clearly a media unit expresses an
annotation
Unit vector for annotation A6: 0 0 0 0 0 1 0 0 0 0 0 0
Media Unit Vector: 0 1 1 0 0 4 3 0 0 5 1 0
Cosine = .55
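The .55 on this slide can be checked directly: it is the cosine between the media unit vector and a vector with a single 1 at the position of annotation A6. The function names below are illustrative, not part of any CrowdTruth library.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def unit_annotation_score(unit_vector: Sequence[int], index: int) -> float:
    """Cosine between the media unit vector and one annotation's unit vector."""
    annotation = [0] * len(unit_vector)
    annotation[index] = 1
    return cosine(unit_vector, annotation)


media_unit = [0, 1, 1, 0, 0, 4, 3, 0, 0, 5, 1, 0]
score_a6 = unit_annotation_score(media_unit, 5)  # A6 is the sixth position
print(round(score_a6, 2))  # 0.55
```

Here the score is 4 / sqrt(53): the four votes for A6 divided by the length of the whole vector, so votes for other annotations lower the score.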
Experimental Setup
• Goal: what is the most accurate data
labeling method?
○ CrowdTruth: media unit-annotation
score (continuous value)
○ Majority Vote: decision of majority of
workers (discrete)
○ Single: decision of a single worker,
randomly sampled (discrete)
○ Expert: decision of domain expert
(discrete)
• Approach: evaluate against a set of
trusted labels, established by either:
○ agreement between crowd and expert, or
○ manual evaluation when they disagree
Evaluation: F1 score
CrowdTruth performs better than Majority
Vote, at least as well as Expert.
Each task has different best thresholds in
the media unit - annotation score.
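Because the media unit - annotation score is continuous, it has to be cut at a threshold before it can be compared with discrete trusted labels. A rough sketch of such a threshold sweep follows. The scores, trusted labels and candidate thresholds are made up for illustration and are not results from the talk.

```python
from typing import List, Sequence, Tuple


def f1_score(predicted: List[bool], trusted: List[bool]) -> float:
    """F1 of the positive class; 0.0 when there are no true positives."""
    tp = sum(p and t for p, t in zip(predicted, trusted))
    fp = sum(p and not t for p, t in zip(predicted, trusted))
    fn = sum(t and not p for p, t in zip(predicted, trusted))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(
    scores: Sequence[float], trusted: List[bool], thresholds: Sequence[float]
) -> Tuple[float, float]:
    """Return (threshold, F1) with the highest F1; ties keep the earlier threshold."""
    best = (thresholds[0], -1.0)
    for th in thresholds:
        f1 = f1_score([s >= th for s in scores], trusted)
        if f1 > best[1]:
            best = (th, f1)
    return best


# Made-up scores and trusted labels for six media units.
scores = [0.9, 0.8, 0.55, 0.6, 0.3, 0.1]
trusted = [True, True, True, False, False, False]
thresholds = [0.2, 0.5, 0.7]
threshold, f1 = best_threshold(scores, trusted, thresholds)
print(threshold, round(f1, 2))  # 0.5 0.86
```

Running the same sweep per task is one way to see that the best threshold differs from task to task.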
Evaluation: number of workers
Each task reaches stable F1 for different
number of workers.
Majority Vote never beats CrowdTruth.
Sound Interpretation needs more workers.
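One way to study the effect of the number of workers is to recompute the score from random subsets of workers of growing size. The sketch below does this for a single media unit and a single annotation. It is a simplification under stated assumptions: the worker vectors are made up, only one random subset is drawn per size, and the score is tracked directly rather than F1 over a whole dataset.

```python
import math
import random
from typing import List


def score(worker_vectors: List[List[int]], index: int) -> float:
    """Media unit - annotation score computed from the given workers only."""
    unit = [sum(col) for col in zip(*worker_vectors)]
    norm = math.sqrt(sum(x * x for x in unit))
    return unit[index] / norm if norm else 0.0


def scores_by_worker_count(
    worker_vectors: List[List[int]], index: int, seed: int = 0
) -> List[float]:
    """Score for a random subset of k workers, for k = 1..all workers."""
    rng = random.Random(seed)
    results = []
    for k in range(1, len(worker_vectors) + 1):
        subset = rng.sample(worker_vectors, k)
        results.append(score(subset, index))
    return results


# Made-up worker vectors over three annotations for one media unit.
workers = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [1, 0, 0], [1, 0, 1]]
curve = scores_by_worker_count(workers, index=0)
print([round(s, 2) for s in curve])
```

Averaging such curves over many random draws and many media units would give a smoother picture of where the score stops changing.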
Experiments
proved that:
• CrowdTruth performs just as
well as domain experts
• crowd is also cheaper
• crowd is always available
• capturing ambiguity is essential:
• majority voting discards
important signals in the data
• using only a few annotators for
ground truth is faulty
• optimal number of workers /
media unit is task dependent
CrowdTruth.org
Dumitrache et al.: Empirical Methodology for Crowdsourcing Ground
Truth. Semantic Web Journal Special Issue on Human Computation
and Crowdsourcing (HC&C) in the Context of the Semantic Web (in
review).
Ambiguity in Crowdsourcing
[Figure: Media Unit, Annotation, Worker]
Disagreement can indicate ambiguity!

More Related Content

What's hot

(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014 Lora Aroyo
 
Towards Explainable Fact Checking (DIKU Business Club presentation)
Towards Explainable Fact Checking (DIKU Business Club presentation)Towards Explainable Fact Checking (DIKU Business Club presentation)
Towards Explainable Fact Checking (DIKU Business Club presentation)Isabelle Augenstein
 
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)Lora Aroyo
 
Frontiers of Computational Journalism week 3 - Information Filter Design
Frontiers of Computational Journalism week 3 - Information Filter DesignFrontiers of Computational Journalism week 3 - Information Filter Design
Frontiers of Computational Journalism week 3 - Information Filter DesignJonathan Stray
 
Practical Opinion Mining for Social Media
Practical Opinion Mining for Social MediaPractical Opinion Mining for Social Media
Practical Opinion Mining for Social MediaDiana Maynard
 
Frontiers of Computational Journalism week 2 - Text Analysis
Frontiers of Computational Journalism week 2 - Text AnalysisFrontiers of Computational Journalism week 2 - Text Analysis
Frontiers of Computational Journalism week 2 - Text AnalysisJonathan Stray
 
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...Jonathan Stray
 
A comparison of Lexicon-based approaches for Sentiment Analysis of microblog ...
A comparison of Lexicon-based approaches for Sentiment Analysis of microblog ...A comparison of Lexicon-based approaches for Sentiment Analysis of microblog ...
A comparison of Lexicon-based approaches for Sentiment Analysis of microblog ...Cataldo Musto
 
Understanding the world with NLP: interactions between society, behaviour and...
Understanding the world with NLP: interactions between society, behaviour and...Understanding the world with NLP: interactions between society, behaviour and...
Understanding the world with NLP: interactions between society, behaviour and...Diana Maynard
 
Challenges of social media analysis in the real world
Challenges of social media analysis in the real worldChallenges of social media analysis in the real world
Challenges of social media analysis in the real worldDiana Maynard
 
Social Web 2014: Final Presentations (Part I)
Social Web 2014: Final Presentations (Part I)Social Web 2014: Final Presentations (Part I)
Social Web 2014: Final Presentations (Part I)Lora Aroyo
 
Who to follow and why: link prediction with explanations
Who to follow and why: link prediction with explanationsWho to follow and why: link prediction with explanations
Who to follow and why: link prediction with explanationsNicola Barbieri
 
Data visualisationsummit 2013
Data visualisationsummit 2013Data visualisationsummit 2013
Data visualisationsummit 2013The Pathway Group
 
Groundhog Day: Near-Duplicate Detection on Twitter
Groundhog Day: Near-Duplicate Detection on Twitter Groundhog Day: Near-Duplicate Detection on Twitter
Groundhog Day: Near-Duplicate Detection on Twitter Ke Tao
 
Workshop on Data Collection & Network Analysis with @Netlytic & the iGraph R ...
Workshop on Data Collection & Network Analysis with @Netlytic & the iGraph R ...Workshop on Data Collection & Network Analysis with @Netlytic & the iGraph R ...
Workshop on Data Collection & Network Analysis with @Netlytic & the iGraph R ...Toronto Metropolitan University
 
Social Media Data Collection & Network Analysis with Netlytic and R
Social Media Data Collection & Network Analysis with Netlytic and R Social Media Data Collection & Network Analysis with Netlytic and R
Social Media Data Collection & Network Analysis with Netlytic and R Toronto Metropolitan University
 

What's hot (20)

(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
 
Towards Explainable Fact Checking (DIKU Business Club presentation)
Towards Explainable Fact Checking (DIKU Business Club presentation)Towards Explainable Fact Checking (DIKU Business Club presentation)
Towards Explainable Fact Checking (DIKU Business Club presentation)
 
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
 
Frontiers of Computational Journalism week 3 - Information Filter Design
Frontiers of Computational Journalism week 3 - Information Filter DesignFrontiers of Computational Journalism week 3 - Information Filter Design
Frontiers of Computational Journalism week 3 - Information Filter Design
 
Practical Opinion Mining for Social Media
Practical Opinion Mining for Social MediaPractical Opinion Mining for Social Media
Practical Opinion Mining for Social Media
 
Frontiers of Computational Journalism week 2 - Text Analysis
Frontiers of Computational Journalism week 2 - Text AnalysisFrontiers of Computational Journalism week 2 - Text Analysis
Frontiers of Computational Journalism week 2 - Text Analysis
 
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...
 
Content-based link prediction
Content-based link predictionContent-based link prediction
Content-based link prediction
 
A comparison of Lexicon-based approaches for Sentiment Analysis of microblog ...
A comparison of Lexicon-based approaches for Sentiment Analysis of microblog ...A comparison of Lexicon-based approaches for Sentiment Analysis of microblog ...
A comparison of Lexicon-based approaches for Sentiment Analysis of microblog ...
 
Probabilistic Reasoner
Probabilistic ReasonerProbabilistic Reasoner
Probabilistic Reasoner
 
Understanding the world with NLP: interactions between society, behaviour and...
Understanding the world with NLP: interactions between society, behaviour and...Understanding the world with NLP: interactions between society, behaviour and...
Understanding the world with NLP: interactions between society, behaviour and...
 
Challenges of social media analysis in the real world
Challenges of social media analysis in the real worldChallenges of social media analysis in the real world
Challenges of social media analysis in the real world
 
ase-social-informatics (6)
ase-social-informatics (6)ase-social-informatics (6)
ase-social-informatics (6)
 
Social Web 2014: Final Presentations (Part I)
Social Web 2014: Final Presentations (Part I)Social Web 2014: Final Presentations (Part I)
Social Web 2014: Final Presentations (Part I)
 
Link prediction
Link predictionLink prediction
Link prediction
 
Who to follow and why: link prediction with explanations
Who to follow and why: link prediction with explanationsWho to follow and why: link prediction with explanations
Who to follow and why: link prediction with explanations
 
Data visualisationsummit 2013
Data visualisationsummit 2013Data visualisationsummit 2013
Data visualisationsummit 2013
 
Groundhog Day: Near-Duplicate Detection on Twitter
Groundhog Day: Near-Duplicate Detection on Twitter Groundhog Day: Near-Duplicate Detection on Twitter
Groundhog Day: Near-Duplicate Detection on Twitter
 
Workshop on Data Collection & Network Analysis with @Netlytic & the iGraph R ...
Workshop on Data Collection & Network Analysis with @Netlytic & the iGraph R ...Workshop on Data Collection & Network Analysis with @Netlytic & the iGraph R ...
Workshop on Data Collection & Network Analysis with @Netlytic & the iGraph R ...
 
Social Media Data Collection & Network Analysis with Netlytic and R
Social Media Data Collection & Network Analysis with Netlytic and R Social Media Data Collection & Network Analysis with Netlytic and R
Social Media Data Collection & Network Analysis with Netlytic and R
 

Similar to Crowdsourcing ambiguity aware ground truth - collective intelligence 2017

#CrowdTruth: Linked Data for Information Extraction @ISWC2015
#CrowdTruth: Linked Data for Information Extraction @ISWC2015#CrowdTruth: Linked Data for Information Extraction @ISWC2015
#CrowdTruth: Linked Data for Information Extraction @ISWC2015Lora Aroyo
 
2016 06 27 dia ibara e_source final distribution
2016 06 27 dia ibara e_source final distribution2016 06 27 dia ibara e_source final distribution
2016 06 27 dia ibara e_source final distributionMichael Ibara
 
Making Your ClaimBe specificMaking your claim specific.docx
Making Your ClaimBe specificMaking your claim specific.docxMaking Your ClaimBe specificMaking your claim specific.docx
Making Your ClaimBe specificMaking your claim specific.docxinfantsuk
 
Spotle AI-thon Top 10 Showcase - Analysing Mental Health Of India - Cyber Pun...
Spotle AI-thon Top 10 Showcase - Analysing Mental Health Of India - Cyber Pun...Spotle AI-thon Top 10 Showcase - Analysing Mental Health Of India - Cyber Pun...
Spotle AI-thon Top 10 Showcase - Analysing Mental Health Of India - Cyber Pun...Spotle.ai
 
A Dynamic Delphi Process Utilizing a Modified Thurstone Scaling Method: Colla...
A Dynamic Delphi Process Utilizing a Modified Thurstone Scaling Method: Colla...A Dynamic Delphi Process Utilizing a Modified Thurstone Scaling Method: Colla...
A Dynamic Delphi Process Utilizing a Modified Thurstone Scaling Method: Colla...Connie White
 
LESSON 1 introduction to media and information Literacy.pptx
LESSON 1 introduction to media and information Literacy.pptxLESSON 1 introduction to media and information Literacy.pptx
LESSON 1 introduction to media and information Literacy.pptxTeacherRen
 
005 Essay Example Personal Information Family Backgr
005 Essay Example Personal Information Family Backgr005 Essay Example Personal Information Family Backgr
005 Essay Example Personal Information Family BackgrKatie Booth
 
Generation Differences Essay. ️ Generation differences essay. What is the dif...
Generation Differences Essay. ️ Generation differences essay. What is the dif...Generation Differences Essay. ️ Generation differences essay. What is the dif...
Generation Differences Essay. ️ Generation differences essay. What is the dif...Beth Retzlaff
 
Essay Writing On Teacher
Essay Writing On TeacherEssay Writing On Teacher
Essay Writing On TeacherCynthia Wells
 
Essay On Internet. Advantages and Disadvantages of Internet Essay
Essay On Internet. Advantages and Disadvantages of Internet EssayEssay On Internet. Advantages and Disadvantages of Internet Essay
Essay On Internet. Advantages and Disadvantages of Internet EssayNaomi Davis
 
How To Write A Good Paragraph 1 Paragraph Wr
How To Write A Good Paragraph 1 Paragraph WrHow To Write A Good Paragraph 1 Paragraph Wr
How To Write A Good Paragraph 1 Paragraph WrAllison Koehn
 
Disinformation challenges tools and techniques to deal or live with it
Disinformation challenges tools and techniques to deal or live with itDisinformation challenges tools and techniques to deal or live with it
Disinformation challenges tools and techniques to deal or live with itnsarris
 
Ama Social Media Pres100109
Ama Social Media Pres100109Ama Social Media Pres100109
Ama Social Media Pres100109Gary Stein
 
Sensational Toefl Essay Samples Thatsnotus
Sensational Toefl Essay Samples  ThatsnotusSensational Toefl Essay Samples  Thatsnotus
Sensational Toefl Essay Samples ThatsnotusAshley Carter
 
Commentary Essay Example. 10 Easy Steps: How to Write a Commentary on an Arti...
Commentary Essay Example. 10 Easy Steps: How to Write a Commentary on an Arti...Commentary Essay Example. 10 Easy Steps: How to Write a Commentary on an Arti...
Commentary Essay Example. 10 Easy Steps: How to Write a Commentary on an Arti...Tracy Hernandez
 
Argumentative Essay Powerpoint Presentation
Argumentative Essay Powerpoint PresentationArgumentative Essay Powerpoint Presentation
Argumentative Essay Powerpoint PresentationLakeisha Johnson
 
Technology & Innovation Management Course - Session 1
Technology & Innovation Management Course - Session 1Technology & Innovation Management Course - Session 1
Technology & Innovation Management Course - Session 1Dan Toma
 

Similar to Crowdsourcing ambiguity aware ground truth - collective intelligence 2017 (20)

#CrowdTruth: Linked Data for Information Extraction @ISWC2015
#CrowdTruth: Linked Data for Information Extraction @ISWC2015#CrowdTruth: Linked Data for Information Extraction @ISWC2015
#CrowdTruth: Linked Data for Information Extraction @ISWC2015
 
Mike Thelwall: Introduction to Webometrics
Mike Thelwall: Introduction to WebometricsMike Thelwall: Introduction to Webometrics
Mike Thelwall: Introduction to Webometrics
 
2016 06 27 dia ibara e_source final distribution
2016 06 27 dia ibara e_source final distribution2016 06 27 dia ibara e_source final distribution
2016 06 27 dia ibara e_source final distribution
 
Making Your ClaimBe specificMaking your claim specific.docx
Making Your ClaimBe specificMaking your claim specific.docxMaking Your ClaimBe specificMaking your claim specific.docx
Making Your ClaimBe specificMaking your claim specific.docx
 
Spotle AI-thon Top 10 Showcase - Analysing Mental Health Of India - Cyber Pun...
Spotle AI-thon Top 10 Showcase - Analysing Mental Health Of India - Cyber Pun...Spotle AI-thon Top 10 Showcase - Analysing Mental Health Of India - Cyber Pun...
Spotle AI-thon Top 10 Showcase - Analysing Mental Health Of India - Cyber Pun...
 
A Dynamic Delphi Process Utilizing a Modified Thurstone Scaling Method: Colla...
A Dynamic Delphi Process Utilizing a Modified Thurstone Scaling Method: Colla...A Dynamic Delphi Process Utilizing a Modified Thurstone Scaling Method: Colla...
A Dynamic Delphi Process Utilizing a Modified Thurstone Scaling Method: Colla...
 
LESSON 1 introduction to media and information Literacy.pptx
LESSON 1 introduction to media and information Literacy.pptxLESSON 1 introduction to media and information Literacy.pptx
LESSON 1 introduction to media and information Literacy.pptx
 
005 Essay Example Personal Information Family Backgr
005 Essay Example Personal Information Family Backgr005 Essay Example Personal Information Family Backgr
005 Essay Example Personal Information Family Backgr
 
Generation Differences Essay. ️ Generation differences essay. What is the dif...
Generation Differences Essay. ️ Generation differences essay. What is the dif...Generation Differences Essay. ️ Generation differences essay. What is the dif...
Generation Differences Essay. ️ Generation differences essay. What is the dif...
 
Essay Writing On Teacher
Essay Writing On TeacherEssay Writing On Teacher
Essay Writing On Teacher
 
An Essay About Family
An Essay About FamilyAn Essay About Family
An Essay About Family
 
Essay On Internet. Advantages and Disadvantages of Internet Essay
Essay On Internet. Advantages and Disadvantages of Internet EssayEssay On Internet. Advantages and Disadvantages of Internet Essay
Essay On Internet. Advantages and Disadvantages of Internet Essay
 
How To Write A Good Paragraph 1 Paragraph Wr
How To Write A Good Paragraph 1 Paragraph WrHow To Write A Good Paragraph 1 Paragraph Wr
How To Write A Good Paragraph 1 Paragraph Wr
 
Disinformation challenges tools and techniques to deal or live with it
Disinformation challenges tools and techniques to deal or live with itDisinformation challenges tools and techniques to deal or live with it
Disinformation challenges tools and techniques to deal or live with it
 
Ama Social Media Pres100109
Ama Social Media Pres100109Ama Social Media Pres100109
Ama Social Media Pres100109
 
Sensational Toefl Essay Samples Thatsnotus
Sensational Toefl Essay Samples  ThatsnotusSensational Toefl Essay Samples  Thatsnotus
Sensational Toefl Essay Samples Thatsnotus
 
presentation
presentationpresentation
presentation
 
Commentary Essay Example. 10 Easy Steps: How to Write a Commentary on an Arti...
Commentary Essay Example. 10 Easy Steps: How to Write a Commentary on an Arti...Commentary Essay Example. 10 Easy Steps: How to Write a Commentary on an Arti...
Commentary Essay Example. 10 Easy Steps: How to Write a Commentary on an Arti...
 
Argumentative Essay Powerpoint Presentation
Argumentative Essay Powerpoint PresentationArgumentative Essay Powerpoint Presentation
Argumentative Essay Powerpoint Presentation
 
Technology & Innovation Management Course - Session 1
Technology & Innovation Management Course - Session 1Technology & Innovation Management Course - Session 1
Technology & Innovation Management Course - Session 1
 

More from Lora Aroyo

NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdf
NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdfNeurIPS2023 Keynote: The Many Faces of Responsible AI.pdf
NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdfLora Aroyo
 
CATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine Learning
CATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine LearningCATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine Learning
CATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine LearningLora Aroyo
 
Harnessing Human Semantics at Scale (updated)
Harnessing Human Semantics at Scale (updated)Harnessing Human Semantics at Scale (updated)
Harnessing Human Semantics at Scale (updated)Lora Aroyo
 
Data excellence: Better data for better AI
Data excellence: Better data for better AIData excellence: Better data for better AI
Data excellence: Better data for better AILora Aroyo
 
CHIP Demonstrator presentation @ CATCH Symposium
CHIP Demonstrator presentation @ CATCH SymposiumCHIP Demonstrator presentation @ CATCH Symposium
CHIP Demonstrator presentation @ CATCH SymposiumLora Aroyo
 
Semantic Web Challenge: CHIP Demonstrator
Semantic Web Challenge: CHIP DemonstratorSemantic Web Challenge: CHIP Demonstrator
Semantic Web Challenge: CHIP DemonstratorLora Aroyo
 
The Rijksmuseum Collection as Linked Data
The Rijksmuseum Collection as Linked DataThe Rijksmuseum Collection as Linked Data
The Rijksmuseum Collection as Linked DataLora Aroyo
 
Keynote at International Conference of Art Libraries 2018 @Rijksmuseum
Keynote at International Conference of Art Libraries 2018 @RijksmuseumKeynote at International Conference of Art Libraries 2018 @Rijksmuseum
Keynote at International Conference of Art Libraries 2018 @RijksmuseumLora Aroyo
 
FAIRview: Responsible Video Summarization @NYCML'18
FAIRview: Responsible Video Summarization @NYCML'18FAIRview: Responsible Video Summarization @NYCML'18
FAIRview: Responsible Video Summarization @NYCML'18Lora Aroyo
 
Understanding bias in video news & news filtering algorithms
Understanding bias in video news & news filtering algorithmsUnderstanding bias in video news & news filtering algorithms
Understanding bias in video news & news filtering algorithmsLora Aroyo
 
StorySourcing: Telling Stories with Humans & Machines
StorySourcing: Telling Stories with Humans & MachinesStorySourcing: Telling Stories with Humans & Machines
StorySourcing: Telling Stories with Humans & MachinesLora Aroyo
 
Data Science with Humans in the Loop
Data Science with Humans in the LoopData Science with Humans in the Loop
Data Science with Humans in the LoopLora Aroyo
 
Digital Humanities Benelux 2017: Keynote Lora Aroyo
Digital Humanities Benelux 2017: Keynote Lora AroyoDigital Humanities Benelux 2017: Keynote Lora Aroyo
Digital Humanities Benelux 2017: Keynote Lora AroyoLora Aroyo
 
DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...
DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...
DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...Lora Aroyo
 
Data Science with Human in the Loop @Faculty of Science #Leiden University
Data Science with Human in the Loop @Faculty of Science #Leiden UniversityData Science with Human in the Loop @Faculty of Science #Leiden University
Data Science with Human in the Loop @Faculty of Science #Leiden UniversityLora Aroyo
 
SXSW2017 @NewDutchMedia Talk: Exploration is the New Search
SXSW2017 @NewDutchMedia Talk: Exploration is the New SearchSXSW2017 @NewDutchMedia Talk: Exploration is the New Search
SXSW2017 @NewDutchMedia Talk: Exploration is the New SearchLora Aroyo
 
Europeana GA 2016: Harnessing Crowds, Niches & Professionals in the Digital Age
Europeana GA 2016: Harnessing Crowds, Niches & Professionals  in the Digital AgeEuropeana GA 2016: Harnessing Crowds, Niches & Professionals  in the Digital Age
Europeana GA 2016: Harnessing Crowds, Niches & Professionals in the Digital AgeLora Aroyo
 
"Video Killed the Radio Star": From MTV to Snapchat
"Video Killed the Radio Star": From MTV to Snapchat"Video Killed the Radio Star": From MTV to Snapchat
"Video Killed the Radio Star": From MTV to SnapchatLora Aroyo
 
UMAP 2016 Opening Ceremony
UMAP 2016 Opening CeremonyUMAP 2016 Opening Ceremony
UMAP 2016 Opening CeremonyLora Aroyo
 
Crowdsourcing & Nichesourcing: Enriching Cultural Heritage with Experts & Cr...
Crowdsourcing & Nichesourcing: Enriching Cultural Heritagewith Experts & Cr...Crowdsourcing & Nichesourcing: Enriching Cultural Heritagewith Experts & Cr...
Crowdsourcing & Nichesourcing: Enriching Cultural Heritage with Experts & Cr...Lora Aroyo
 

Crowdsourcing ambiguity aware ground truth - collective intelligence 2017

  • 1. Crowdsourcing Ambiguity-Aware Ground Truth Chris Welty, Anca Dumitrache, Oana Inel, Benjamin Timmermans, Lora Aroyo June 15th, 2017 Collective Intelligence Conference
  • 2. Crowdsourcing Myth: Disagreement is Bad • traditionally, disagreement is considered a measure of poor quality in the annotation task because: – task is poorly defined or – annotators lack training • this makes the elimination of disagreement a goal, rather than accepting disagreement as a natural property of semantic interpretation • What if it is GOOD?
  • 3. Crowdsourcing Myth: All Examples Are Created Equal • typically annotators are asked whether a binary property holds for each example • often not given a chance to say that the property may partially hold, or holds but is not clearly expressed • mathematics of using ground truth treats every example the same – either match correct result or not • poor quality examples tend to generate high disagreement • disagreement allows us to weight sentences, giving us the ability to both train and evaluate a machine in a more flexible way • What if they are DIFFERENT?
  • 4. Related Work on Annotating Ambiguity http://CrowdTruth.org ● Jurgens (2013): For word-sense disambiguation, the crowd with ambiguity modeling was able to achieve expert-level quality of annotations. ● Cheatham et al. (2014): Current benchmarks in ontology alignment and evaluation are not designed to model uncertainty caused by disagreement between annotators, both expert and crowd. ● Plank et al. (2014): For part-of-speech tagging, most inter-annotator disagreement pointed to debatable cases in linguistic theory. ● Chang et al. (2017): In a workflow of tasks for collecting and correcting labels for text and images, the authors found that many ambiguous cases cannot be resolved by better annotation guidelines or through worker quality control.
  • 5. CrowdTruth • Annotator disagreement is signal, not noise. • It is indicative of the variation in human semantic interpretation of signs • It can indicate ambiguity, vagueness, similarity, over-generality, etc., as well as quality http://CrowdTruth.org
  • 6. Medical Relation Extraction 1. workers select the medical relations that hold between the given 2 terms in a sentence 2. closed task - workers pick from given set of 14 top UMLS relations 3. 975 sentences from PubMed abstracts + distant supervision for term pair extraction 4. 15 workers /sentence Twitter Event Extraction 1. workers select the events expressed in the tweet 2. closed task - workers pick from set of 8 events 3. 2,019 English tweets, crawled based on event hashtags 4. 7 workers /tweet (*) no expert annotators News Events Identification 1. workers highlight words/phrases that describe an event in the sentence 2. open-ended task - workers can pick any words 3. 200 sentences from English TimeBank corpus 4. 15 workers /sentence Sound Interpretation 1. workers give tags that describe a sound 2. open-ended task - workers can pick any tag 3. 284 sounds from Freesound database 4. 10 workers /sound http://CrowdTruth.org
  • 7. Medical Relation Extraction http://CrowdTruth.org Patients with ACUTE FEVER and nausea could be suffering from INFLUENZA AH1N1 Is ACUTE FEVER – related to → INFLUENZA AH1N1?
  • 11. Medical Relation Extraction 1. workers select the medical relations that hold between the given 2 terms in a sentence 2. closed task - workers pick from given set of 14 top UMLS relations 3. 975 sentences from PubMed abstracts + distant supervision for term pair extraction 4. 15 workers /sentence Twitter Event Extraction 1. workers select the events expressed in the tweet 2. closed task - workers pick from set of 8 events 3. 2,019 English tweets, crawled based on event hashtags 4. 7 workers /tweet (*) no expert annotators News Events Identification 1. workers highlight words/phrases that describe an event in the sentence 2. open-ended task - workers can pick any words 3. 200 sentences from English TimeBank corpus 4. 15 workers /sentence Sound Interpretation 1. workers give tags that describe a sound 2. open-ended task - workers can pick any tag 3. 284 sounds from Freesound database 4. 10 workers /sound http://CrowdTruth.org
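  • 12. Medical Relation Extraction: Worker Vector - Closed Task [figure: a worker vector with a 1 for each relation the worker selected] http://CrowdTruth.org
  • 13. Media Unit Vector - Closed Task [figure: the worker vectors are summed into the media unit vector 0 1 1 0 0 4 3 0 0 5 1 0] http://CrowdTruth.org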
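The closed-task aggregation on this slide can be sketched as a component-wise sum of binary worker vectors. This is a minimal illustration, not the CrowdTruth implementation: the function name `media_unit_vector` and the three 4-component worker vectors are hypothetical, chosen only to keep the example short.

```python
from typing import List


def media_unit_vector(worker_vectors: List[List[int]]) -> List[int]:
    """Sum binary worker vectors component-wise (hypothetical helper)."""
    if not worker_vectors:
        raise ValueError("need at least one worker vector")
    length = len(worker_vectors[0])
    if any(len(v) != length for v in worker_vectors):
        raise ValueError("all worker vectors must have the same length")
    # zip(*...) groups the i-th component of every worker vector together.
    return [sum(column) for column in zip(*worker_vectors)]


# Hypothetical data: 3 workers, 4 candidate annotations.
# A 1 means the worker selected that annotation for the media unit.
workers = [
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
]
unit_vector = media_unit_vector(workers)
print(unit_vector)
```

Each component of the result counts how many workers picked that annotation, so a vector concentrated on one component reflects agreement and a spread-out vector reflects disagreement.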
  • 14. Unclear relationship between the two arguments reflected in the disagreement Medical Relation Extraction http://CrowdTruth.org
  • 15. Clearly expressed relation between the two arguments reflected in the agreement Medical Relation Extraction http://CrowdTruth.org
  • 16. Twitter Event Extraction http://CrowdTruth.org The tension building between China, Japan, U.S., and Vietnam is reaching new heights this week - let's stay tuned! #APAC #powerstruggle Which of the following EVENTS can you identify in the tweet? 4 0 0 0 6 0 0 0 2 islands disputed between China and Japan anti China protests in Vietnam None Unclear description of events in tweet reflected in the disagreement
  • 17. Twitter Event Extraction http://CrowdTruth.org RT @jc_stubbs: Another tragic day in #Ukraine - More than 50 rebels killed as new leader unleashes assault: http://t.co/wcfU3kyAFX Which of the following EVENTS can you identify in the tweet? 0 6 0 0 0 0 0 0 1 Ukraine crisis 2014 None Clear description of events in tweet reflected in the agreement
  • 18. Sound Interpretation: Media Unit Vector - Open Task http://CrowdTruth.org Worker 1: balloon exploding Worker 2: gun Worker 3: gunshot Worker 4: loud noise, pop Worker 5: gun, balloon + clustering (e.g. word2vec) Resulting vector: balloon 2, gun/gunshot 3, loud noise 1, pop 1 Multiple interpretations of the sound reflected in the disagreement
  • 19. Sound Interpretation http://CrowdTruth.org Worker 1: siren Worker 2: police siren Worker 3: siren, alarm Worker 4: siren Worker 5: annoying + clustering (e.g. word2vec) Resulting vector: siren 4, alarm 1, annoying 1 Clear meaning of the sound reflected in the agreement
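For the open-ended task, the balloon/gun example can be sketched as tag counting after tags are merged into clusters. The slide mentions clustering with something like word2vec; the sketch below swaps that for a hand-written lookup table (`CLUSTERS`, an assumption made purely for illustration), and the function name `open_task_vector` is hypothetical.

```python
from collections import Counter
from typing import Dict, List

# Assumption: a hand-written stand-in for the clustering step
# (the slide suggests e.g. word2vec; no embedding model is used here).
CLUSTERS: Dict[str, str] = {
    "balloon exploding": "balloon",
    "balloon": "balloon",
    "gun": "gun",
    "gunshot": "gun",
    "loud noise": "loud noise",
    "pop": "pop",
}

# Worker tags as listed on the slide.
worker_tags: List[List[str]] = [
    ["balloon exploding"],
    ["gun"],
    ["gunshot"],
    ["loud noise", "pop"],
    ["gun", "balloon"],
]


def open_task_vector(tags_per_worker: List[List[str]],
                     clusters: Dict[str, str]) -> Counter:
    """Count, per cluster, how many workers gave a tag in that cluster."""
    counts: Counter = Counter()
    for tags in tags_per_worker:
        # A set ensures each worker counts at most once per cluster.
        counts.update({clusters.get(tag, tag) for tag in tags})
    return counts


vector = open_task_vector(worker_tags, CLUSTERS)
print(dict(vector))
```

With this lookup table the counts come out as balloon 2, gun 3, loud noise 1 and pop 1, matching the numbers on the slide.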
  • 20. News Event Identification http://CrowdTruth.org Other than usage in business, Internet technology is also beginning to infiltrate the lifestyle domain . HIGHLIGHT the words/phrases that refer to an EVENT. 5 4 2 2 4 5 3 Internet technology is also beginning infiltrate lifestyle 1 business 3 usage 0 Other 3 domain Unclear specification of events in sentence reflected in the disagreement
  • 21. Clear specification of events in sentence reflected in the agreement News Event Identification http://CrowdTruth.org Most of the city's monuments were destroyed including a magnificent tiled mosque which dominated the skyline for centuries. HIGHLIGHT the words/phrases that refer to an EVENT. 5 12 0 1 1 1 4 1 0 were destroyed including magnificent tiled mosque dominated skyline centuries 5 monuments 3 city 1 Most
  • 22. Media Unit - Annotation Score: measures how clearly a media unit expresses an annotation. Cosine between the unit vector for annotation A6 and the media unit vector 0 1 1 0 0 4 3 0 0 5 1 0: Cosine = .55 http://CrowdTruth.org
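The score on this slide can be reproduced with a few lines of arithmetic. The sketch assumes the flattened numbers on the slide are the media unit vector 0 1 1 0 0 4 3 0 0 5 1 0 and that annotation A6 is its sixth component; the function name `unit_annotation_score` is hypothetical. Because the annotation's unit vector has a single 1, the cosine reduces to that component divided by the norm of the media unit vector.

```python
import math
from typing import Sequence


def unit_annotation_score(media_unit: Sequence[float],
                          annotation_index: int) -> float:
    """Cosine between the media unit vector and a one-hot annotation vector."""
    norm = math.sqrt(sum(x * x for x in media_unit))
    if norm == 0:
        return 0.0  # no worker selected anything for this media unit
    # Dot product with a one-hot vector is just the selected component.
    return media_unit[annotation_index] / norm


# Assumed reading of the slide: 12 components, 15 worker selections in total.
media_unit = [0, 1, 1, 0, 0, 4, 3, 0, 0, 5, 1, 0]
score = unit_annotation_score(media_unit, 5)  # A6 -> zero-based index 5
print(round(score, 2))  # 0.55
```

Here the score is 4 / sqrt(53), which rounds to the .55 shown on the slide.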
  • 23. • Goal: what is the most accurate data labeling method? ○ CrowdTruth: media unit-annotation score (continuous value) ○ Majority Vote: decision of majority of workers (discrete) ○ Single: decision of a single worker, randomly sampled (discrete) ○ Expert: decision of domain expert (discrete) • Approach: evaluate with trusted labels set with either: ○ agreement between crowd and expert, or ○ manual evaluation when disagreement Experimental Setup http://CrowdTruth.org
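The comparison in this setup can be sketched as follows: majority vote gives a discrete label directly, while the continuous media unit-annotation score needs a threshold before it can be compared against the trusted labels with F1. All numbers below are made-up toy data that only illustrate the mechanics and say nothing about the reported results; the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def f1_score(predicted: Sequence[bool], trusted: Sequence[bool]) -> float:
    """F1 of predicted labels against trusted labels."""
    tp = sum(1 for p, t in zip(predicted, trusted) if p and t)
    fp = sum(1 for p, t in zip(predicted, trusted) if p and not t)
    fn = sum(1 for p, t in zip(predicted, trusted) if not p and t)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def majority_vote(votes: Sequence[int], n_workers: int) -> List[bool]:
    """Positive when more than half of the workers picked the annotation."""
    return [v > n_workers / 2 for v in votes]


def crowdtruth_labels(scores: Sequence[float], threshold: float) -> List[bool]:
    """Positive when the unit-annotation score reaches the threshold."""
    return [s >= threshold for s in scores]


def best_threshold(scores: Sequence[float], trusted: Sequence[bool],
                   candidates: Sequence[float]) -> Tuple[float, float]:
    """Return the (threshold, F1) pair with the highest F1."""
    results = [(t, f1_score(crowdtruth_labels(scores, t), trusted))
               for t in candidates]
    return max(results, key=lambda pair: pair[1])


# Toy data (hypothetical): 6 media units, 15 workers each, one annotation.
votes = [12, 9, 6, 5, 2, 1]
scores = [0.95, 0.80, 0.62, 0.55, 0.20, 0.10]
trusted = [True, True, True, False, False, False]

majority_f1 = f1_score(majority_vote(votes, 15), trusted)
best = best_threshold(scores, trusted, [i / 10 for i in range(1, 10)])
print(majority_f1, best)
```

Sweeping the threshold per task mirrors the observation that each task has its own best threshold on the media unit-annotation score.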
  • 25. Evaluation: F1 score CrowdTruth performs better than Majority Vote, and at least as well as Expert. Each task has a different best threshold on the media unit - annotation score.
  • 27. Evaluation: number of workers Each task reaches a stable F1 at a different number of workers. Majority Vote never beats CrowdTruth. Sound Interpretation needs more workers.
  • 28. Experiments proved that: • CrowdTruth performs just as well as domain experts • crowd is also cheaper • crowd is always available • capturing ambiguity is essential: • majority voting discards important signals in the data • using only a few annotators for ground truth is faulty • optimal number of workers / media unit is task dependent http://CrowdTruth.org
  • 29. CrowdTruth.org Dumitrache et al.: Empirical Methodology for Crowdsourcing Ground Truth. Semantic Web Journal Special Issue on Human Computation and Crowdsourcing (HC&C) in the Context of the Semantic Web (in review).
  • 32. Ambiguity in Crowdsourcing [figure labels: Media Unit, Annotation, Worker] Disagreement can indicate ambiguity! http://CrowdTruth.org