Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge Series

Giuseppe Rizzo

Bianca Pereira

Andrea Varga

Marieke van Erp

Amparo Elizabeth Cano Basave

NEEL Challenge Overview
• Microposts are challenging because of:
  • brevity (140 characters)
  • (domain-specific) abbreviations and typos
  • ‘grammar-free’ writing
• The NEEL challenge aims to foster research into novel, more accurate entity recognition and linking approaches tailored to Microposts
• NEEL ran from 2013 to 2016

NEEL Evolution
• 2013: Information Extraction
  • named entity recognition (4 types)
• 2014: Named Entity Extraction and Linking (NEEL)
  • named entity linking to DBpedia 3.9
• 2015: Named Entity rEcognition and Linking (NEEL)
  • named entity recognition (7 types) and linking to DBpedia 2014
• 2016: Named Entity rEcognition and Linking (NEEL)
  • named entity recognition (7 types), linking to DBpedia 2015-04, and NIL clustering

Cross-domain task
• Named Entity and Event Linking is a shared task in NLP and Semantic Web
• Machine Learning approaches need data
• Data curation is expensive and hard
• Knowledge bases can relieve part of the data bottleneck
• This results in hybrid approaches

Typical Entity Linking Workflow

Evaluating Entity Linking
• end-to-end: evaluates a system on the aggregated output of all steps
  • error propagation harms results
• step-by-step: robust benchmark that evaluates each step of the process individually
  • time-consuming to set up
  • penalises systems that do not follow the standard workflow
• partial end-to-end: evaluates particular steps in the process individually, e.g. NER, NIL & Linking
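
The difference between end-to-end and partial end-to-end evaluation can be illustrated with a minimal Python sketch. It assumes each annotation is a (start, end, type, link) tuple and that only exact matches count. The example data, the helper names `prf` and `project`, and the exact-match criterion are illustrative assumptions, not the official NEEL scorer.

```python
def prf(gold, system):
    """Precision, recall and F1 over two sets of annotations (exact match)."""
    tp = len(gold & system)
    precision = tp / len(system) if system else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def project(annotations, fields):
    """Keep only the given tuple positions, so one step can be scored alone."""
    return {tuple(a[i] for i in fields) for a in annotations}


# Hypothetical annotations for one tweet: (start, end, entity type, KB link).
gold = {
    (0, 5, "Person", "dbpedia:Barack_Obama"),
    (10, 15, "Location", "dbpedia:Paris"),
}
system = {
    (0, 5, "Person", "dbpedia:Barack_Obama"),
    (10, 15, "Location", "dbpedia:Paris,_Texas"),  # right span and type, wrong link
}

# End-to-end: span, type and link must all be correct at once.
end_to_end = prf(gold, system)

# Partial end-to-end: score tagging and linking separately.
tagging = prf(project(gold, (0, 1, 2)), project(system, (0, 1, 2)))
linking = prf(project(gold, (0, 1, 3)), project(system, (0, 1, 3)))

print("end-to-end:", end_to_end)
print("tagging:   ", tagging)
print("linking:   ", linking)
```

In this toy case the single linking error halves the end-to-end score, while the separate tagging score shows that recognition and typing were correct.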

Named Entity Recognition and Linking challenges since 2013
Characteristic | TAC-KBP | ERD | SemEval | W-NUT | NEEL
Editions | 2014, 2015, 2016 | 2014 | 2015 | 2015, 2016, 2017 | 2013, 2014, 2015, 2016
Text | newswire, web sites, discussion forum posts | web sites, search queries | technical manuals, reports, formal discussion | tweets; tweets, Reddit, YouTube, StackExchange | tweets
Knowledge base | Wikipedia; Freebase | Freebase | BabelNet | none | none; DBpedia
Entity | given by Type | given by KB | given by KB | given by Type | given by Type
Evaluation (submission) | file | API | file | file | file; API; file
Evaluation (scope) | partial end-to-end | end-to-end | end-to-end | end-to-end | end-to-end; partial end-to-end
Target conference | TAC | SIGIR | NAACL-HLT | ACL-IJCNLP; COLING; EMNLP | WWW
Values separated by semicolons changed between editions and are listed in edition order.

NEEL Datasets
• 2013: 4,265 tweets from the end of 2010 and the start of 2011. No explicit hashtag search. 66% train, 33% test.
• 2014: 3,505 tweets from 15 July 2011 to 15 August 2011. A First Story Detection algorithm was used to identify tweet clusters representing events. 70% train, 30% test.
• 2015: 6,025 tweets, an extension of the 2014 dataset including tweets from 2013 and November 2014. Train: 2014 dataset, 8% development, 34% test.
• 2016: 9,289 tweets, an extension of the 2014 & 2015 datasets via selection of hashtags. 65% train (2015 dataset), 1% development and 34% test.

NEEL Datasets (ctd)
• Entity types are not distributed equally
• It is difficult to balance entity types over different dataset slices
• Confusability: a measure of the number of surface forms an entity can have (i.e. how many different ‘terms’ can refer to the same entity)
• Dominance: a measure of the number of resources that can be associated with a single surface form (i.e. how many entities share the same ‘name’)
[Figure: Confusability and Dominance, 2013 and 2016]
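
The two measures can be sketched in Python by counting distinct values over (surface form, entity) pairs. This follows the wording of the definitions on this slide; the annotation pairs are made up for illustration, and counting distinct values is only one simple reading of "a measure of the number of", not necessarily the formula used in the challenge analysis.

```python
from collections import defaultdict

# Hypothetical (surface form, entity) annotation pairs.
annotations = [
    ("NYC", "dbpedia:New_York_City"),
    ("New York", "dbpedia:New_York_City"),
    ("Big Apple", "dbpedia:New_York_City"),
    ("New York", "dbpedia:New_York_(state)"),
    ("Paris", "dbpedia:Paris"),
    ("Paris", "dbpedia:Paris_Hilton"),
    ("Paris", "dbpedia:Paris"),  # a repeated pair is only counted once
]

forms_per_entity = defaultdict(set)
entities_per_form = defaultdict(set)
for surface_form, entity in annotations:
    forms_per_entity[entity].add(surface_form)
    entities_per_form[surface_form].add(entity)

# Confusability (as defined above): distinct surface forms per entity.
confusability = {e: len(forms) for e, forms in forms_per_entity.items()}

# Dominance (as defined above): distinct entities per surface form.
dominance = {s: len(ents) for s, ents in entities_per_form.items()}

print(confusability)
print(dominance)
```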

Results
• The NEEL Challenge became more difficult every year (from 4 entity types to 7 + linking + NIL clustering)
• Systems became more complex every year
• The 2016 task was more difficult, probably due to the domain specificity of the test dataset (US Primary Elections and Star Wars)

Year | Precision | Recall | F1
2013 | 0.764 | 0.604 | 0.67
2014 | 0.771 | 0.642 | 0.701

Year | Tagging | Clustering | Linking | Overall
2015 | 0.807 | 0.84 | 0.762 | 0.8067
2016 | 0.473 | 0.641 | 0.501 | 0.5486
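
The figures above can be cross-checked with a short Python sketch. F1 is the standard harmonic mean of precision and recall. The weights used for the Overall score (0.3 tagging, 0.4 clustering, 0.3 linking) are an assumption: they are not stated on the slides, but they reproduce both reported Overall values.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def overall(tagging, clustering, linking, weights=(0.3, 0.4, 0.3)):
    """Weighted sum of the three sub-scores; the weights are an assumption."""
    w_tag, w_clu, w_link = weights
    return w_tag * tagging + w_clu * clustering + w_link * linking


f1_2013 = f1(0.764, 0.604)
f1_2014 = f1(0.771, 0.642)
overall_2015 = overall(0.807, 0.84, 0.762)
overall_2016 = overall(0.473, 0.641, 0.501)

print(round(f1_2013, 2), round(f1_2014, 3))
print(round(overall_2015, 4), round(overall_2016, 4))
```

Under this weighting, the drop in the 2016 Overall score comes from all three sub-scores, with tagging falling the most.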

Emerging Trends
• Tweet normalisation is common
• Use of KBs for mention detection and typing
• End-to-end systems and pruning for candidate selection
• Hierarchical clustering for aggregating mentions of the same entity/event
• Decrease in the use of off-the-shelf systems (which were popular in the first editions)

Lessons Learnt
• Creating balanced challenge datasets is hard!
• You are invited to expand and improve our datasets!
• The datasets are available for evaluation of new systems: http://microposts2016.seas.upenn.edu/challenge.html
• NEEL provides an opportunity to compare results against other systems
• Multilingual or other language challenges? (2016 also had an Italian variant)
• New popular micropost platforms require different analyses


Are you a Master’s or PhD student?
Do you want to learn how to do this type of research yourself?
Join us in Italy next summer!
http://semanticwebsummerschool.org
