SlideShare a Scribd company logo
Evaluating Entity Linking: An Analysis of Current
Benchmark Datasets and a Roadmap for Doing a
Better Job
Marieke van Erp, Pablo Mendes, Heiko Paulheim, Filip Ilievski, Julien Plu, Giuseppe
Rizzo and Joerg Waitelonis
https://github.com/dbpedia-spotlight/evaluation-datasets
Take home message
• Existing entity linking datasets:
• are not interoperable
• do not cover many different domains
• skew towards popular and frequent entities
• We need to:
• Document & Standardise
• Diversify to cover different domains and the long tail
https://github.com/dbpedia-spotlight/evaluation-datasets/ 2
Why
• Named entity linking approaches achieve F1 scores of ~.80
on various benchmark datasets
• Are we really testing our approaches on all aspects of the
entity linking task?
It’s not just us:
Maud Ehrmann and Damien Nouvel and
Sophie Rosset. Named Entity Resources -
Overview and Outlook. LREC 2016
https://github.com/dbpedia-spotlight/evaluation-datasets/ 3
This work
• Analysis of 7 entity linking benchmark datasets
• Dataset characteristics (document type, domain, license etc)
• Entity, surface form & mention characterisation (overlap
between datasets, confusability, prominence, dominance,
types, etc)
• Annotation characteristics (nested entities, redundancy, IAA,
offsets)
+ Roadmap: how can we do better
https://github.com/dbpedia-spotlight/evaluation-datasets/ 4
Entity Overlap
• Number of entities present in one dataset that are also
present in other datasets
AIDA-YAGO2 (5,596)
NEEL2014 (2,380)
NEEL2015 (2,800)
OKE2015 (531)
RSS500 (849)
WES2015 (7,309)
Wikinews (279)
https://github.com/dbpedia-spotlight/evaluation-datasets/ 5
Datasets
Dataset Type Domain Doc length Format Encoding License
AIDA-YAGO2 news general medium TSV ASCII Agreement
2014/2015
NEEL
tweets general short TSV ASCII Open
OKE2015 encyclopaedia general long NIF/RDF UTF8 Open
RSS-500 news general medium NIF/RDF UTF8 Open
WES2015 blog science long NIF/RDF UTF8 Open
WikiNews news general medium XML UTF8 Open
https://github.com/dbpedia-spotlight/evaluation-datasets/ 6
Entity Overlap
• Number of entities present in one dataset that are also
present in other datasets
AIDA-YAGO2 NEEL2014 NEEL2015 OKE2015 RSS500 WES2015 Wikinews
AIDA-YAGO2 (5,596) 5.87% 8.06% 0.00% 1.26% 4.80% 1.16%
NEEL2014 (2,380) 13.73% 68.49% 2.39% 2.56% 12.35% 2.82%
NEEL2015 (2,800) 16.11% 58.21% 2.00% 2.54% 7.93% 2.57%
OKE2015 (531) 0.00% 10.73% 10.55% 2.44% 28.06% 3.95%
RSS500 (849) 8.24% 7.18% 8.36% 1.53% 3.18% 1.88%
WES2015 (7,309) 3.68% 4.02% 3.04% 2.04% 0.16% 0.66%
Wikinews (279) 23.30% 24.01% 25.81% 7.53% 5.73% 17.20%
https://github.com/dbpedia-spotlight/evaluation-datasets/ 7
Entity Overlap
• Number of entities present in one dataset that are also
present in other datasets
AIDA-YAGO2 NEEL2014 NEEL2015 OKE2015 RSS500 WES2015 Wikinews
AIDA-YAGO2 (5,596) 5.87% 8.06% 0.00% 1.26% 4.80% 1.16%
NEEL2014 (2,380) 13.73% 68.49% 2.39% 2.56% 12.35% 2.82%
NEEL2015 (2,800) 16.11% 58.21% 2.00% 2.54% 7.93% 2.57%
OKE2015 (531) 0.00% 10.73% 10.55% 2.44% 28.06% 3.95%
RSS500 (849) 8.24% 7.18% 8.36% 1.53% 3.18% 1.88%
WES2015 (7,309) 3.68% 4.02% 3.04% 2.04% 0.16% 0.66%
Wikinews (279) 23.30% 24.01% 25.81% 7.53% 5.73% 17.20%
https://github.com/dbpedia-spotlight/evaluation-datasets/ 8
Entity Overlap
• Number of entities present in one dataset that are also
present in other datasets
AIDA-YAGO2 NEEL2014 NEEL2015 OKE2015 RSS500 WES2015 Wikinews
AIDA-YAGO2 (5,596) 5.87% 8.06% 0.00% 1.26% 4.80% 1.16%
NEEL2014 (2,380) 13.73% 68.49% 2.39% 2.56% 12.35% 2.82%
NEEL2015 (2,800) 16.11% 58.21% 2.00% 2.54% 7.93% 2.57%
OKE2015 (531) 0.00% 10.73% 10.55% 2.44% 28.06% 3.95%
RSS500 (849) 8.24% 7.18% 8.36% 1.53% 3.18% 1.88%
WES2015 (7,309) 3.68% 4.02% 3.04% 2.04% 0.16% 0.66%
Wikinews (279) 23.30% 24.01% 25.81% 7.53% 5.73% 17.20%
https://github.com/dbpedia-spotlight/evaluation-datasets/ 9
Entity Overlap
• Number of entities present in one dataset that are also
present in other datasets
AIDA-YAGO2 NEEL2014 NEEL2015 OKE2015 RSS500 WES2015 Wikinews
AIDA-YAGO2 (5,596) 5.87% 8.06% 0.00% 1.26% 4.80% 1.16%
NEEL2014 (2,380) 13.73% 68.49% 2.39% 2.56% 12.35% 2.82%
NEEL2015 (2,800) 16.11% 58.21% 2.00% 2.54% 7.93% 2.57%
OKE2015 (531) 0.00% 10.73% 10.55% 2.44% 28.06% 3.95%
RSS500 (849) 8.24% 7.18% 8.36% 1.53% 3.18% 1.88%
WES2015 (7,309) 3.68% 4.02% 3.04% 2.04% 0.16% 0.66%
Wikinews (279) 23.30% 24.01% 25.81% 7.53% 5.73% 17.20%
https://github.com/dbpedia-spotlight/evaluation-datasets/ 10
Confusability
• The number of meanings a surface form (mention) can have
11
Confusability
Corpus Average Min Max
AIDA-YAGO2 1.08 1 13 0.37
2014 NEEL 1.02 1 3 0.16
2015 NEEL 1.05 1 4 0.25
OKE2015 1.11 1 25 1.22
RSS500 1.02 1 3 0.16
WES2015 1.06 1 6 0.30
Wikinews 1.09 1 29 1.03
https://github.com/dbpedia-spotlight/evaluation-datasets/ 12
Dominance
Corpus Dominance Min Max
AIDA-YAGO2 .98 1 452 0.08
2014 NEEL .99 1 47 0.06
2015 NEEL .98 1 88 0.09
OKE2015 .98 1 1 0.11
RSS500 .99 1 1 0.07
WES2015 .97 1 1 0.12
Wikinews .99 1 72 0.09
https://github.com/dbpedia-spotlight/evaluation-datasets/ 13
Entity Types
https://github.com/dbpedia-spotlight/evaluation-datasets/ 14
Entity Types
15
Entity Prominance
https://github.com/dbpedia-spotlight/evaluation-datasets/ 16
DBpedia PageRank datasets:
http://people.aifb.kit.edu/ath/
How can we do better?
• Document your dataset!
• Use a standardised format
• Diversify both in domains and in entity distribution
https://github.com/dbpedia-spotlight/evaluation-datasets/ 17
Work in Progress & Future work
• Analyse more datasets
• Evaluate the temporal dimension of datasets (current work
by Filip Ilievski & Marten Postma)
• Integrate and build better datasets
https://github.com/dbpedia-spotlight/evaluation-datasets/ 18
Want to help?
Scripts and data used here can be found at:
Contact marieke.van.erp@vu.nl if you want to collaborate
https://github.com/dbpedia-spotlight/evaluation-datasets/
19
Shameless Advertising
NLP&DBpedia 2016
Workshop at ISWC2016
Submission deadline: 1 July
https://nlpdbpedia2016.wordpress.com/
20
Acknowledgements
https://github.com/dbpedia-spotlight/evaluation-datasets/

More Related Content

Similar to Evaluating entity linking an analysis of current benchmark datasets and a roadmap for doing a better job (3)

Pay-for-Performance and Distributional Effects in Tanzania: A Supply-side Ass...
Pay-for-Performance and Distributional Effects in Tanzania: A Supply-side Ass...Pay-for-Performance and Distributional Effects in Tanzania: A Supply-side Ass...
Pay-for-Performance and Distributional Effects in Tanzania: A Supply-side Ass...
resyst
 
Evaluasi Sebagai Dasar Perencanaan
Evaluasi Sebagai Dasar PerencanaanEvaluasi Sebagai Dasar Perencanaan
Evaluasi Sebagai Dasar Perencanaan
Siti Sahati
 
柏瑞週報 20200214
柏瑞週報 20200214柏瑞週報 20200214
柏瑞週報 20200214
Pinebridge
 
BSC Financial Overview Q2 2015
BSC Financial Overview Q2 2015BSC Financial Overview Q2 2015
BSC Financial Overview Q2 2015
Gypsychick
 
Berkeley Shambhala Financial Overview Q2 2015
Berkeley Shambhala Financial Overview Q2 2015Berkeley Shambhala Financial Overview Q2 2015
Berkeley Shambhala Financial Overview Q2 2015
BuckDina
 
Forecasting enterprenuership 2311
Forecasting enterprenuership 2311Forecasting enterprenuership 2311
Forecasting enterprenuership 2311
sainath balasani
 
When there is no Vendor: Statistics for Free Clickthroughs via the Online Cat...
When there is no Vendor: Statistics for Free Clickthroughs via the Online Cat...When there is no Vendor: Statistics for Free Clickthroughs via the Online Cat...
When there is no Vendor: Statistics for Free Clickthroughs via the Online Cat...
Christopher Brown
 
Frag Flow: Automated Fragment Detection in Scientific Workflows
Frag Flow: Automated Fragment Detection in Scientific WorkflowsFrag Flow: Automated Fragment Detection in Scientific Workflows
Frag Flow: Automated Fragment Detection in Scientific Workflows
dgarijo
 
Cassidy Sugimoto - Open Access Mandates: Compliance by funders
Cassidy Sugimoto - Open Access Mandates: Compliance by fundersCassidy Sugimoto - Open Access Mandates: Compliance by funders
Cassidy Sugimoto - Open Access Mandates: Compliance by funders
SciELO - Scientific Electronic Library Online
 
柏瑞週報 20200605
柏瑞週報 20200605柏瑞週報 20200605
柏瑞週報 20200605
Pinebridge
 
柏瑞週報 20200424
柏瑞週報 20200424柏瑞週報 20200424
柏瑞週報 20200424
Pinebridge
 
Creating a Big data Strategy with Tactics for Quick Implementation
Creating a Big data Strategy with Tactics for Quick ImplementationCreating a Big data Strategy with Tactics for Quick Implementation
Creating a Big data Strategy with Tactics for Quick Implementation
Lewandog, Inc,
 
Monitoring & evaluating the usage of your Open Access Journal
Monitoring & evaluating the usage of your Open Access JournalMonitoring & evaluating the usage of your Open Access Journal
Monitoring & evaluating the usage of your Open Access Journal
Ina Smith
 
柏瑞週報 20200508
柏瑞週報 20200508柏瑞週報 20200508
柏瑞週報 20200508
Pinebridge
 
Smart Beta Strategies for Global REITs Presentation ARES 2015
Smart Beta Strategies for Global REITs  Presentation ARES 2015Smart Beta Strategies for Global REITs  Presentation ARES 2015
Smart Beta Strategies for Global REITs Presentation ARES 2015
Consiliacapital
 
Portfolio mean and variance analysis
Portfolio mean and variance analysisPortfolio mean and variance analysis
Portfolio mean and variance analysis
Prashant S. Keswani
 
柏瑞週報 20200227
柏瑞週報 20200227柏瑞週報 20200227
柏瑞週報 20200227
Pinebridge
 
BILS 2015 Christoph Herwig
BILS 2015 Christoph HerwigBILS 2015 Christoph Herwig
BILS 2015 Christoph Herwig
GBX Events
 

Similar to Evaluating entity linking an analysis of current benchmark datasets and a roadmap for doing a better job (3) (20)

Pay-for-Performance and Distributional Effects in Tanzania: A Supply-side Ass...
Pay-for-Performance and Distributional Effects in Tanzania: A Supply-side Ass...Pay-for-Performance and Distributional Effects in Tanzania: A Supply-side Ass...
Pay-for-Performance and Distributional Effects in Tanzania: A Supply-side Ass...
 
Evaluasi Sebagai Dasar Perencanaan
Evaluasi Sebagai Dasar PerencanaanEvaluasi Sebagai Dasar Perencanaan
Evaluasi Sebagai Dasar Perencanaan
 
柏瑞週報 20200214
柏瑞週報 20200214柏瑞週報 20200214
柏瑞週報 20200214
 
BSC Financial Overview Q2 2015
BSC Financial Overview Q2 2015BSC Financial Overview Q2 2015
BSC Financial Overview Q2 2015
 
Berkeley Shambhala Financial Overview Q2 2015
Berkeley Shambhala Financial Overview Q2 2015Berkeley Shambhala Financial Overview Q2 2015
Berkeley Shambhala Financial Overview Q2 2015
 
Forecasting enterprenuership 2311
Forecasting enterprenuership 2311Forecasting enterprenuership 2311
Forecasting enterprenuership 2311
 
When there is no Vendor: Statistics for Free Clickthroughs via the Online Cat...
When there is no Vendor: Statistics for Free Clickthroughs via the Online Cat...When there is no Vendor: Statistics for Free Clickthroughs via the Online Cat...
When there is no Vendor: Statistics for Free Clickthroughs via the Online Cat...
 
Frag Flow: Automated Fragment Detection in Scientific Workflows
Frag Flow: Automated Fragment Detection in Scientific WorkflowsFrag Flow: Automated Fragment Detection in Scientific Workflows
Frag Flow: Automated Fragment Detection in Scientific Workflows
 
Cassidy Sugimoto - Open Access Mandates: Compliance by funders
Cassidy Sugimoto - Open Access Mandates: Compliance by fundersCassidy Sugimoto - Open Access Mandates: Compliance by funders
Cassidy Sugimoto - Open Access Mandates: Compliance by funders
 
柏瑞週報 20200605
柏瑞週報 20200605柏瑞週報 20200605
柏瑞週報 20200605
 
柏瑞週報 20200424
柏瑞週報 20200424柏瑞週報 20200424
柏瑞週報 20200424
 
1Q07 Results
1Q07 Results1Q07 Results
1Q07 Results
 
Creating a Big data Strategy with Tactics for Quick Implementation
Creating a Big data Strategy with Tactics for Quick ImplementationCreating a Big data Strategy with Tactics for Quick Implementation
Creating a Big data Strategy with Tactics for Quick Implementation
 
Monitoring & evaluating the usage of your Open Access Journal
Monitoring & evaluating the usage of your Open Access JournalMonitoring & evaluating the usage of your Open Access Journal
Monitoring & evaluating the usage of your Open Access Journal
 
柏瑞週報 20200508
柏瑞週報 20200508柏瑞週報 20200508
柏瑞週報 20200508
 
SCM PROJECT
SCM PROJECTSCM PROJECT
SCM PROJECT
 
Smart Beta Strategies for Global REITs Presentation ARES 2015
Smart Beta Strategies for Global REITs  Presentation ARES 2015Smart Beta Strategies for Global REITs  Presentation ARES 2015
Smart Beta Strategies for Global REITs Presentation ARES 2015
 
Portfolio mean and variance analysis
Portfolio mean and variance analysisPortfolio mean and variance analysis
Portfolio mean and variance analysis
 
柏瑞週報 20200227
柏瑞週報 20200227柏瑞週報 20200227
柏瑞週報 20200227
 
BILS 2015 Christoph Herwig
BILS 2015 Christoph HerwigBILS 2015 Christoph Herwig
BILS 2015 Christoph Herwig
 

More from Marieke van Erp

Towards Culturally Aware AI Systems - TSDH Symposium
Towards Culturally Aware AI Systems - TSDH SymposiumTowards Culturally Aware AI Systems - TSDH Symposium
Towards Culturally Aware AI Systems - TSDH Symposium
Marieke van Erp
 
A Polyvocal and Contextualised Semantic Web
A Polyvocal and Contextualised Semantic WebA Polyvocal and Contextualised Semantic Web
A Polyvocal and Contextualised Semantic Web
Marieke van Erp
 
AI x Digital Humanities = > Inclusiviteit
AI x Digital Humanities = > Inclusiviteit AI x Digital Humanities = > Inclusiviteit
AI x Digital Humanities = > Inclusiviteit
Marieke van Erp
 
Computationally Tracing Concepts Through Time and Space
Computationally Tracing Concepts Through Time and SpaceComputationally Tracing Concepts Through Time and Space
Computationally Tracing Concepts Through Time and Space
Marieke van Erp
 
The Hitchhiker's Guide to the Future of Digital Humanities
The Hitchhiker's Guide to the Future of Digital HumanitiesThe Hitchhiker's Guide to the Future of Digital Humanities
The Hitchhiker's Guide to the Future of Digital Humanities
Marieke van Erp
 
Why language technology can’t handle Game of Thrones (yet)
Why language technology can’t handle Game of Thrones (yet)Why language technology can’t handle Game of Thrones (yet)
Why language technology can’t handle Game of Thrones (yet)
Marieke van Erp
 
(Beyond) Combining Text and Tables for qualitative and quantitative research
(Beyond) Combining Text and Tables for qualitative and quantitative research (Beyond) Combining Text and Tables for qualitative and quantitative research
(Beyond) Combining Text and Tables for qualitative and quantitative research
Marieke van Erp
 
Finding common ground between text, maps, and tables for quantitative and qua...
Finding common ground between text, maps, and tables for quantitative and qua...Finding common ground between text, maps, and tables for quantitative and qua...
Finding common ground between text, maps, and tables for quantitative and qua...
Marieke van Erp
 
Slicing and Dicing a Newspaper Corpus for Historical Ecology Research
Slicing and Dicing a Newspaper Corpus for Historical Ecology ResearchSlicing and Dicing a Newspaper Corpus for Historical Ecology Research
Slicing and Dicing a Newspaper Corpus for Historical Ecology Research
Marieke van Erp
 
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
Marieke van Erp
 
Good Lynx, bad Lynx: Document enrichment for historical ecologists
Good Lynx, bad Lynx: Document enrichment for historical ecologistsGood Lynx, bad Lynx: Document enrichment for historical ecologists
Good Lynx, bad Lynx: Document enrichment for historical ecologists
Marieke van Erp
 
Towards Semantic Enrichment of Newspapers: a historical ecology use case
Towards Semantic Enrichment of Newspapers: a historical ecology use case Towards Semantic Enrichment of Newspapers: a historical ecology use case
Towards Semantic Enrichment of Newspapers: a historical ecology use case
Marieke van Erp
 
Natural Language Processing en Named Entity Recognition
Natural Language Processing en Named Entity Recognition Natural Language Processing en Named Entity Recognition
Natural Language Processing en Named Entity Recognition
Marieke van Erp
 
HuC lecture - Digital and Humanities: Continuing the Conversation
HuC lecture - Digital and Humanities: Continuing the ConversationHuC lecture - Digital and Humanities: Continuing the Conversation
HuC lecture - Digital and Humanities: Continuing the Conversation
Marieke van Erp
 
Multilingual Fine-grained Entity Typing
Multilingual Fine-grained Entity Typing Multilingual Fine-grained Entity Typing
Multilingual Fine-grained Entity Typing
Marieke van Erp
 
Entity Typing Using Distributional Semantics and DBpedia
Entity Typing Using Distributional Semantics and DBpedia Entity Typing Using Distributional Semantics and DBpedia
Entity Typing Using Distributional Semantics and DBpedia
Marieke van Erp
 
Entity Typing and Event Extraction
Entity Typing and Event Extraction Entity Typing and Event Extraction
Entity Typing and Event Extraction
Marieke van Erp
 
Finding Stories in 1,784,532 Events: Scaling up computational models of narr...
Finding Stories in 1,784,532 Events:  Scaling up computational models of narr...Finding Stories in 1,784,532 Events:  Scaling up computational models of narr...
Finding Stories in 1,784,532 Events: Scaling up computational models of narr...
Marieke van Erp
 
Evaluating Named Entity Recognition and Disambiguation in News and Tweets
Evaluating Named Entity Recognition and Disambiguation in News and TweetsEvaluating Named Entity Recognition and Disambiguation in News and Tweets
Evaluating Named Entity Recognition and Disambiguation in News and Tweets
Marieke van Erp
 
Orientation EBC 2013: Digitising Natural History
Orientation EBC 2013: Digitising Natural HistoryOrientation EBC 2013: Digitising Natural History
Orientation EBC 2013: Digitising Natural History
Marieke van Erp
 

More from Marieke van Erp (20)

Towards Culturally Aware AI Systems - TSDH Symposium
Towards Culturally Aware AI Systems - TSDH SymposiumTowards Culturally Aware AI Systems - TSDH Symposium
Towards Culturally Aware AI Systems - TSDH Symposium
 
A Polyvocal and Contextualised Semantic Web
A Polyvocal and Contextualised Semantic WebA Polyvocal and Contextualised Semantic Web
A Polyvocal and Contextualised Semantic Web
 
AI x Digital Humanities = > Inclusiviteit
AI x Digital Humanities = > Inclusiviteit AI x Digital Humanities = > Inclusiviteit
AI x Digital Humanities = > Inclusiviteit
 
Computationally Tracing Concepts Through Time and Space
Computationally Tracing Concepts Through Time and SpaceComputationally Tracing Concepts Through Time and Space
Computationally Tracing Concepts Through Time and Space
 
The Hitchhiker's Guide to the Future of Digital Humanities
The Hitchhiker's Guide to the Future of Digital HumanitiesThe Hitchhiker's Guide to the Future of Digital Humanities
The Hitchhiker's Guide to the Future of Digital Humanities
 
Why language technology can’t handle Game of Thrones (yet)
Why language technology can’t handle Game of Thrones (yet)Why language technology can’t handle Game of Thrones (yet)
Why language technology can’t handle Game of Thrones (yet)
 
(Beyond) Combining Text and Tables for qualitative and quantitative research
(Beyond) Combining Text and Tables for qualitative and quantitative research (Beyond) Combining Text and Tables for qualitative and quantitative research
(Beyond) Combining Text and Tables for qualitative and quantitative research
 
Finding common ground between text, maps, and tables for quantitative and qua...
Finding common ground between text, maps, and tables for quantitative and qua...Finding common ground between text, maps, and tables for quantitative and qua...
Finding common ground between text, maps, and tables for quantitative and qua...
 
Slicing and Dicing a Newspaper Corpus for Historical Ecology Research
Slicing and Dicing a Newspaper Corpus for Historical Ecology ResearchSlicing and Dicing a Newspaper Corpus for Historical Ecology Research
Slicing and Dicing a Newspaper Corpus for Historical Ecology Research
 
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
 
Good Lynx, bad Lynx: Document enrichment for historical ecologists
Good Lynx, bad Lynx: Document enrichment for historical ecologistsGood Lynx, bad Lynx: Document enrichment for historical ecologists
Good Lynx, bad Lynx: Document enrichment for historical ecologists
 
Towards Semantic Enrichment of Newspapers: a historical ecology use case
Towards Semantic Enrichment of Newspapers: a historical ecology use case Towards Semantic Enrichment of Newspapers: a historical ecology use case
Towards Semantic Enrichment of Newspapers: a historical ecology use case
 
Natural Language Processing en Named Entity Recognition
Natural Language Processing en Named Entity Recognition Natural Language Processing en Named Entity Recognition
Natural Language Processing en Named Entity Recognition
 
HuC lecture - Digital and Humanities: Continuing the Conversation
HuC lecture - Digital and Humanities: Continuing the ConversationHuC lecture - Digital and Humanities: Continuing the Conversation
HuC lecture - Digital and Humanities: Continuing the Conversation
 
Multilingual Fine-grained Entity Typing
Multilingual Fine-grained Entity Typing Multilingual Fine-grained Entity Typing
Multilingual Fine-grained Entity Typing
 
Entity Typing Using Distributional Semantics and DBpedia
Entity Typing Using Distributional Semantics and DBpedia Entity Typing Using Distributional Semantics and DBpedia
Entity Typing Using Distributional Semantics and DBpedia
 
Entity Typing and Event Extraction
Entity Typing and Event Extraction Entity Typing and Event Extraction
Entity Typing and Event Extraction
 
Finding Stories in 1,784,532 Events: Scaling up computational models of narr...
Finding Stories in 1,784,532 Events:  Scaling up computational models of narr...Finding Stories in 1,784,532 Events:  Scaling up computational models of narr...
Finding Stories in 1,784,532 Events: Scaling up computational models of narr...
 
Evaluating Named Entity Recognition and Disambiguation in News and Tweets
Evaluating Named Entity Recognition and Disambiguation in News and TweetsEvaluating Named Entity Recognition and Disambiguation in News and Tweets
Evaluating Named Entity Recognition and Disambiguation in News and Tweets
 
Orientation EBC 2013: Digitising Natural History
Orientation EBC 2013: Digitising Natural HistoryOrientation EBC 2013: Digitising Natural History
Orientation EBC 2013: Digitising Natural History
 

Recently uploaded

extra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdfextra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdf
DiyaBiswas10
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
Lokesh Patil
 
ESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptxESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptx
muralinath2
 
Lateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensiveLateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensive
silvermistyshot
 
general properties of oerganologametal.ppt
general properties of oerganologametal.pptgeneral properties of oerganologametal.ppt
general properties of oerganologametal.ppt
IqrimaNabilatulhusni
 
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Sérgio Sacani
 
Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.
Nistarini College, Purulia (W.B) India
 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
University of Maribor
 
EY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptxEY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptx
AlguinaldoKong
 
insect taxonomy importance systematics and classification
insect taxonomy importance systematics and classificationinsect taxonomy importance systematics and classification
insect taxonomy importance systematics and classification
anitaento25
 
role of pramana in research.pptx in science
role of pramana in research.pptx in sciencerole of pramana in research.pptx in science
role of pramana in research.pptx in science
sonaliswain16
 
in vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptxin vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptx
yusufzako14
 
erythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptxerythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptx
muralinath2
 
Hemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptxHemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptx
muralinath2
 
Mammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsMammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also Functions
YOGESH DOGRA
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
moosaasad1975
 
Comparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebratesComparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebrates
sachin783648
 
Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...
Sérgio Sacani
 
GBSN - Microbiology (Lab 4) Culture Media
GBSN - Microbiology (Lab 4) Culture MediaGBSN - Microbiology (Lab 4) Culture Media
GBSN - Microbiology (Lab 4) Culture Media
Areesha Ahmad
 
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
muralinath2
 

Recently uploaded (20)

extra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdfextra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdf
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
 
ESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptxESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptx
 
Lateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensiveLateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensive
 
general properties of oerganologametal.ppt
general properties of oerganologametal.pptgeneral properties of oerganologametal.ppt
general properties of oerganologametal.ppt
 
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
 
Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.
 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
 
EY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptxEY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptx
 
insect taxonomy importance systematics and classification
insect taxonomy importance systematics and classificationinsect taxonomy importance systematics and classification
insect taxonomy importance systematics and classification
 
role of pramana in research.pptx in science
role of pramana in research.pptx in sciencerole of pramana in research.pptx in science
role of pramana in research.pptx in science
 
in vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptxin vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptx
 
erythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptxerythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptx
 
Hemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptxHemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptx
 
Mammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsMammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also Functions
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
 
Comparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebratesComparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebrates
 
Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...
 
GBSN - Microbiology (Lab 4) Culture Media
GBSN - Microbiology (Lab 4) Culture MediaGBSN - Microbiology (Lab 4) Culture Media
GBSN - Microbiology (Lab 4) Culture Media
 
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
 

Evaluating entity linking an analysis of current benchmark datasets and a roadmap for doing a better job (3)

  • 1. Evaluating Entity Linking: An Analysis of Current Benchmark Datasets and a Roadmap for Doing a Better Job Marieke van Erp, Pablo Mendes, Heiko Paulheim, Filip Ilievski, Julien Plu, Giuseppe Rizzo and Joerg Waitelonis https://github.com/dbpedia-spotlight/evaluation-datasets
  • 2. Take home message • Existing entity linking datasets: • are not interoperable • do not cover many different domains • skew towards popular and frequent entities • We need to: • Document & Standardise • Diversify to cover different domains and the long tail https://github.com/dbpedia-spotlight/evaluation-datasets/ 2
  • 3. Why • Named entity linking approaches achieve F1 scores of ~.80 on various benchmark datasets • Are we really testing our approaches on all aspects of the entity linking task? It’s not just us: Maud Ehrmann and Damien Nouvel and Sophie Rosset. Named Entity Resources - Overview and Outlook. LREC 2016 https://github.com/dbpedia-spotlight/evaluation-datasets/ 3
  • 4. This work • Analysis of 7 entity linking benchmark datasets • Dataset characteristics (document type, domain, license etc) • Entity, surface form & mention characterisation (overlap between datasets, confusability, prominence, dominance, types, etc) • Annotation characteristics (nested entities, redundancy, IAA, offsets) + Roadmap: how can we do better https://github.com/dbpedia-spotlight/evaluation-datasets/ 4
  • 5. Entity Overlap • Number of entities present in one dataset that are also present in other datasets AIDA-YAGO2 (5,596) NEEL2014 (2,380) NEEL2015 (2,800) OKE2015 (531) RSS500 (849) WES2015 (7,309) Wikinews (279) https://github.com/dbpedia-spotlight/evaluation-datasets/ 5
  • 6. Datasets Dataset Type Domain Doc length Format Encoding License AIDA-YAGO2 news general medium TSV ASCII Agreement 2014/2015 NEEL tweets general short TSV ASCII Open OKE2015 encyclopaedia general long NIF/RDF UTF8 Open RSS-500 news general medium NIF/RDF UTF8 Open WES2015 blog science long NIF/RDF UTF8 Open WikiNews news general medium XML UTF8 Open https://github.com/dbpedia-spotlight/evaluation-datasets/ 6
  • 7. Entity Overlap • Number of entities present in one dataset that are also present in other datasets AIDA-YAGO2 NEEL2014 NEEL2015 OKE2015 RSS500 WES2015 Wikinews AIDA-YAGO2 (5,596) 5.87% 8.06% 0.00% 1.26% 4.80% 1.16% NEEL2014 (2,380) 13.73% 68.49% 2.39% 2.56% 12.35% 2.82% NEEL2015 (2,800) 16.11% 58.21% 2.00% 2.54% 7.93% 2.57% OKE2015 (531) 0.00% 10.73% 10.55% 2.44% 28.06% 3.95% RSS500 (849) 8.24% 7.18% 8.36% 1.53% 3.18% 1.88% WES2015 (7,309) 3.68% 4.02% 3.04% 2.04% 0.16% 0.66% Wikinews (279) 23.30% 24.01% 25.81% 7.53% 5.73% 17.20% https://github.com/dbpedia-spotlight/evaluation-datasets/ 7
  • 8. Entity Overlap • Number of entities present in one dataset that are also present in other datasets AIDA-YAGO2 NEEL2014 NEEL2015 OKE2015 RSS500 WES2015 Wikinews AIDA-YAGO2 (5,596) 5.87% 8.06% 0.00% 1.26% 4.80% 1.16% NEEL2014 (2,380) 13.73% 68.49% 2.39% 2.56% 12.35% 2.82% NEEL2015 (2,800) 16.11% 58.21% 2.00% 2.54% 7.93% 2.57% OKE2015 (531) 0.00% 10.73% 10.55% 2.44% 28.06% 3.95% RSS500 (849) 8.24% 7.18% 8.36% 1.53% 3.18% 1.88% WES2015 (7,309) 3.68% 4.02% 3.04% 2.04% 0.16% 0.66% Wikinews (279) 23.30% 24.01% 25.81% 7.53% 5.73% 17.20% https://github.com/dbpedia-spotlight/evaluation-datasets/ 8
  • 9. Entity Overlap • Number of entities present in one dataset that are also present in other datasets AIDA-YAGO2 NEEL2014 NEEL2015 OKE2015 RSS500 WES2015 Wikinews AIDA-YAGO2 (5,596) 5.87% 8.06% 0.00% 1.26% 4.80% 1.16% NEEL2014 (2,380) 13.73% 68.49% 2.39% 2.56% 12.35% 2.82% NEEL2015 (2,800) 16.11% 58.21% 2.00% 2.54% 7.93% 2.57% OKE2015 (531) 0.00% 10.73% 10.55% 2.44% 28.06% 3.95% RSS500 (849) 8.24% 7.18% 8.36% 1.53% 3.18% 1.88% WES2015 (7,309) 3.68% 4.02% 3.04% 2.04% 0.16% 0.66% Wikinews (279) 23.30% 24.01% 25.81% 7.53% 5.73% 17.20% https://github.com/dbpedia-spotlight/evaluation-datasets/ 9
  • 10. Entity Overlap • Number of entities present in one dataset that are also present in other datasets AIDA-YAGO2 NEEL2014 NEEL2015 OKE2015 RSS500 WES2015 Wikinews AIDA-YAGO2 (5,596) 5.87% 8.06% 0.00% 1.26% 4.80% 1.16% NEEL2014 (2,380) 13.73% 68.49% 2.39% 2.56% 12.35% 2.82% NEEL2015 (2,800) 16.11% 58.21% 2.00% 2.54% 7.93% 2.57% OKE2015 (531) 0.00% 10.73% 10.55% 2.44% 28.06% 3.95% RSS500 (849) 8.24% 7.18% 8.36% 1.53% 3.18% 1.88% WES2015 (7,309) 3.68% 4.02% 3.04% 2.04% 0.16% 0.66% Wikinews (279) 23.30% 24.01% 25.81% 7.53% 5.73% 17.20% https://github.com/dbpedia-spotlight/evaluation-datasets/ 10
  • 11. Confusability • The number of meanings a surface form (mention) can have 11
  • 12. Confusability Corpus Average Min Max AIDA-YAGO2 1.08 1 13 0.37 2014 NEEL 1.02 1 3 0.16 2015 NEEL 1.05 1 4 0.25 OKE2015 1.11 1 25 1.22 RSS500 1.02 1 3 0.16 WES2015 1.06 1 6 0.30 Wikinews 1.09 1 29 1.03 https://github.com/dbpedia-spotlight/evaluation-datasets/ 12
  • 13. Dominance Corpus Dominance Min Max AIDA-YAGO2 .98 1 452 0.08 2014 NEEL .99 1 47 0.06 2015 NEEL .98 1 88 0.09 OKE2015 .98 1 1 0.11 RSS500 .99 1 1 0.07 WES2015 .97 1 1 0.12 Wikinews .99 1 72 0.09 https://github.com/dbpedia-spotlight/evaluation-datasets/ 13
  • 17. How can we do better? • Document your dataset! • Use a standardised format • Diversify both in domains and in entity distribution https://github.com/dbpedia-spotlight/evaluation-datasets/ 17
  • 18. Work in Progress & Future work • Analyse more datasets • Evaluate the temporal dimension of datasets (current work by Filip Ilievski & Marten Postma) • Integrate and build better datasets https://github.com/dbpedia-spotlight/evaluation-datasets/ 18
  • 19. Want to help? Scripts and data used here can be found at: Contact marieke.van.erp@vu.nl if you want to collaborate https://github.com/dbpedia-spotlight/evaluation-datasets/ 19
  • 20. Shameless Advertising NLP&DBpedia 2016 Workshop at ISWC2016 Submission deadline: 1 July https://nlpdbpedia2016.wordpress.com/ 20