Anca Dumitrache, Lora Aroyo, Chris Welty
http://CrowdTruth.org
Achieving Expert-Level Annotation Quality with the Crowd
The Case of Medical Relation Extraction
Biomedical Data Mining, Modeling & Semantic Integration @ ISWC2015
#CrowdTruth @anouk_anca @laroyo @cawelty #BDM2I
CrowdTruth
•  Annotator disagreement is signal, not noise.
•  It is indicative of the variation in human semantic interpretation of signs.
•  It can indicate ambiguity, vagueness, similarity, over-generality, etc., as well as quality.
CrowdTruth for medical relation extraction
•  Goals:
   - collect a relation extraction gold standard
   - improve the performance of a relation extraction classifier
•  Approach:
   - crowdsource 900 medical sentences
   - measure disagreement with CrowdTruth metrics
   - train & evaluate the classifier with the CrowdTruth score
RelEx TASK in CrowdFlower
Patients with ACUTE FEVER and nausea could be suffering from INFLUENZA AH1N1
Is ACUTE FEVER – related to → INFLUENZA AH1N1?
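Worker Vector
[Figure: one worker's annotation of the sentence as a vector over the relations, with a 1 for each relation the worker selected]
Sentence Vector
[Figure: the worker vectors for the sentence stacked as rows and summed per relation]
Sentence vector: 0 1 1 0 0 4 3 0 0 5 1 0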
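The two vectors on these slides can be sketched in a few lines. The example below sums toy worker vectors into a sentence vector, then scores each relation as the cosine between the sentence vector and that relation's unit vector. The cosine definition is an assumption based on how the CrowdTruth metrics are usually described; the slides do not give the formula. The function names and the toy worker vectors are illustrative only.

```python
import math

def sentence_vector(worker_vectors):
    """Sum the binary worker vectors column by column."""
    return [sum(column) for column in zip(*worker_vectors)]

def sentence_relation_score(sent_vec, relation_index):
    """Cosine between the sentence vector and the unit vector of one relation.

    Assumption: this is the sentence-relation score; the slides do not state it.
    """
    norm = math.sqrt(sum(v * v for v in sent_vec))
    return sent_vec[relation_index] / norm if norm else 0.0

# Sentence vector shown on the slide (12 relation slots).
slide_vector = [0, 1, 1, 0, 0, 4, 3, 0, 0, 5, 1, 0]

# Toy worker vectors (invented) to illustrate the summation step.
toy_workers = [[1, 0, 1], [0, 0, 1], [1, 0, 0]]
toy_sentence_vector = sentence_vector(toy_workers)

scores = [round(sentence_relation_score(slide_vector, i), 3)
          for i in range(len(slide_vector))]
print(toy_sentence_vector)
print(scores)
```

Under this assumed definition, the most-selected relation in the slide's vector scores 5/√53 ≈ 0.69, while relations nobody selected score 0.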
  
Annotation Quality of Expert vs. Crowd Annotations
[Figure: annotation-quality F1 of crowd and expert annotations across sentence-relation score thresholds; labelled values 0.907 (p = 0.007) and 0.844]
•  for thresholds in [0.6, 0.8] the crowd significantly outperforms the expert, with a maximum F1 of 0.907 at the 0.7 threshold
RelEx CAUSE Classifier F1 for Crowd vs. Expert Annotations
[Figure: F1 of the CAUSE relation classifier trained on crowd vs. expert annotations; labelled values 0.642 (p = 0.016) and 0.638]
•  the crowd provides training data that is at least as good as, if not better than, that of the experts
Learning Curves
(crowd with pos./neg. threshold at 0.5)
[Figure: learning curves]
•  above 400 sentences: crowd consistently above baseline & single annotator
•  above 600 sentences: crowd outperforms experts
Learning Curves Extended
(crowd with pos./neg. threshold at 0.5)
[Figure: extended learning curves]
•  crowd consistently performs better than baseline
Number of Workers: Impact on Sentence-Relation Score
[Figure: sentence-relation score against number of workers]
Number of Workers: Impact on Annotation Quality
[Figure: annotation quality against number of workers]
•  only 54 sentences had 15 or more workers
Experts vs. Crowd in Human Annotation: Overall Comparison
•  91% of expert annotations are covered by the crowd
•  expert annotators reach agreement in only 30% of cases
•  the most popular crowd vote covers 95% of this expert annotation agreement
Expert vs. Crowd in Human Annotation: Cost Comparison

                    F1      Cost per sentence
CrowdTruth          0.642   $0.66
Expert Annotator    0.638   $2.00
Single Annotator    0.492   $0.08
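The per-sentence costs in the table can be scaled to a whole corpus with simple arithmetic. The sketch below uses the 900 sentences mentioned in the approach. Treating every annotation setup as covering all 900 sentences is an assumption made for illustration only. Prices are kept in integer cents to avoid floating-point rounding.

```python
# Values copied from the cost comparison table; prices converted to cents.
COST_CENTS = {"CrowdTruth": 66, "Expert Annotator": 200, "Single Annotator": 8}
F1 = {"CrowdTruth": 0.642, "Expert Annotator": 0.638, "Single Annotator": 0.492}

# Assumption: all 900 crowdsourced sentences are annotated in every setup.
NUM_SENTENCES = 900

def corpus_cost_dollars(cents_per_sentence, num_sentences=NUM_SENTENCES):
    """Total annotation cost in dollars for the whole corpus."""
    return cents_per_sentence * num_sentences / 100

totals = {name: corpus_cost_dollars(cents) for name, cents in COST_CENTS.items()}

for name, total in totals.items():
    print(f"{name}: F1 {F1[name]:.3f}, ${total:.2f} for {NUM_SENTENCES} sentences")
```

Under that assumption the crowd setup comes to $594 against $1,800 for experts, about a third of the price for a comparable F1.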
  
Experiments proved that:
•  crowd performs just as well as medical experts
•  crowd is also cheaper
•  crowd is always available
•  using only a few annotators for ground truth is faulty
•  a minimum of 10 workers per sentence is needed for the highest-quality annotations
•  CrowdTruth = a solution to the Clinical NLP Challenge:
   - lack of ground truth for training & benchmarking
#CrowdTruth @anouk_anca @laroyo @cawelty #BDM2I #ISWC2015
CrowdTruth.org
http://data.CrowdTruth.org/medical-relex

Monitoring Java Application Security with JDK Tools and JFR Events
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 

#CrowdTruth: Biomedical Data Mining, Modeling & Semantic Integration (BDM2I 2015) @ISWC2015

  • 1. Anca Dumitrache, Lora Aroyo, Chris Welty http://CrowdTruth.org Achieving Expert-Level Annotation Quality with the Crowd The Case of Medical Relation Extraction Biomedical Data Mining, Modeling & Semantic Integration @ ISWC2015 #CrowdTruth @anouk_anca @laroyo @cawelty #BDM2I
  • 2. CrowdTruth: •  Annotator disagreement is signal, not noise. •  It is indicative of the variation in human semantic interpretation of signs. •  It can indicate ambiguity, vagueness, similarity, over-generality, etc., as well as quality.
  • 3. CrowdTruth for medical relation extraction. •  Goals: collecting a relation extraction gold standard; improve the performance of a relation extraction classifier. •  Approach: crowdsource 900 medical sentences; measure disagreement with CrowdTruth metrics; train & evaluate classifier with CrowdTruth score.
  • 4. RelEx TASK in CrowdFlower. Patients with ACUTE FEVER and nausea could be suffering from INFLUENZA AH1N1. Is ACUTE FEVER – related to → INFLUENZA AH1N1?
  • 5. Worker Vector: 1 1 1
  • 6. Sentence Vector. Worker marks (grid flattened in extraction): 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1. Sentence vector: 0 1 1 0 0 4 3 0 0 5 1 0
  • 7. Annotation Quality of Expert vs. Crowd Annotations. Plot labels: 0.907, p = 0.007; 0.844.
  • 8. Annotation Quality of Expert vs. Crowd Annotations. Plot labels: 0.907, p = 0.007; 0.844. [0.6 - 0.8]: crowd significantly outperforms expert, with max in 0.907 F1 @ 0.7 threshold.
  • 9. RelEx CAUSE Classifier F1 for Crowd vs. Expert Annotations. Plot labels: 0.642, p = 0.016; 0.638.
  • 10. RelEx CAUSE Classifier F1 for Crowd vs. Expert Annotations. Plot labels: 0.642, p = 0.016; 0.638. Crowd provides training data that is at least as good if not better than experts.
  • 11. Learning Curves (crowd with pos./neg. threshold at 0.5)
  • 12. Learning Curves (crowd with pos./neg. threshold at 0.5). Above 400 sent.: crowd consistently over baseline & single. Above 600 sent.: crowd outperforms experts.
  • 13. Learning Curves Extended (crowd with pos./neg. threshold at 0.5)
  • 14. Learning Curves Extended (crowd with pos./neg. threshold at 0.5). Crowd consistently performs better than baseline.
  • 15. # of Workers: Impact on Sentence-Relation Score
  • 16. # of Workers: Impact on Annotation Quality. Only 54 sent. had 15 or more workers.
  • 17. Experts vs. Crowd in Human Annotation: Overall Comparison. •  91% of expert annotations covered by the crowd •  expert annotators reach agreement only in 30% •  most popular crowd vote covers 95% of this expert annotation agreement
  • 18. Expert vs. Crowd in Human Annotation: Cost Comparison. CrowdTruth: F1 0.642, cost per sentence $0.66. Expert Annotator: F1 0.638, cost per sentence $2.00. Single Annotator: F1 0.492, cost per sentence $0.08.
  • 19. Experiments proved that: •  crowd performs just as well as medical experts •  crowd is also cheaper •  crowd is always available •  using only a few annotators for ground truth is faulty •  min 10 workers/sentence are needed for highest quality annotations •  CrowdTruth = a solution to Clinical NLP Challenge: lack of ground truth for training & benchmarking
  • 20. #CrowdTruth @anouk_anca @laroyo @cawelty #BDM2I #ISWC2015 CrowdTruth.org http://data.CrowdTruth.org/medical-relex
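
The worker vector, sentence vector, and threshold steps from slides 5 to 8 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the slides do not define the sentence-relation score, so the cosine form used here follows the commonly published CrowdTruth description. The function names, the `>=` comparison at the threshold, and the toy scores and gold labels are all made up for the example. Only the 12-element sentence vector is taken from slide 6.

```python
import math


def sentence_vector(worker_vectors):
    """Sum binary worker vectors element-wise into one sentence vector."""
    return [sum(column) for column in zip(*worker_vectors)]


def sentence_relation_score(sent_vec, relation_index):
    """Cosine similarity between the sentence vector and the unit vector
    of one relation (assumed definition, not stated on the slides)."""
    norm = math.sqrt(sum(v * v for v in sent_vec))
    return sent_vec[relation_index] / norm if norm else 0.0


def f1_at_threshold(scores, gold, threshold):
    """F1 of the labels obtained by treating score >= threshold as positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Sentence vector shown on slide 6 (one count per answer option).
slide6_vector = [0, 1, 1, 0, 0, 4, 3, 0, 0, 5, 1, 0]
top_score = sentence_relation_score(slide6_vector, 9)  # option picked 5 times

# Hypothetical scores and gold labels, only to exercise the threshold step.
toy_scores = [0.9, 0.4, 0.8, 0.6]
toy_gold = [True, False, False, True]
toy_f1 = f1_at_threshold(toy_scores, toy_gold, 0.7)
```

Sweeping `threshold` over a range such as 0.1 to 0.9 and recording `f1_at_threshold` at each step would give a curve of the kind summarized on slide 8, where the crowd F1 peaks at the 0.7 threshold.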