EVALUATING NAMED ENTITY RECOGNITION AND DISAMBIGUATION IN NEWS AND TWEETS

Giuseppe Rizzo, Università degli studi di Torino
Marieke van Erp, VU University Amsterdam
Raphaël Troncy, EURECOM
EVALUATING NER & NED
• NER is typically an NLP task (MUC, CoNLL, ACE)
• NED took flight with the availability of large structured resources (Wikipedia, DBpedia, Freebase)
• Tools for NER and NED have started popping up outside regular research outlets (TextRazor, DBpedia Spotlight, AlchemyAPI)
• It is unclear how well these tools perform
THIS WORK
• Evaluation & comparison of 10 out-of-the-box NER and NED tools through the NERD API, as well as a combination of the tools in NERD-ML
• Two types of data: newswire & tweets

NERD
• http://nerd.eurecom.fr
• Ontology, REST API & UI (a client sketch follows below)
• Uniform access to 12 different extractors/linkers: AlchemyAPI, DBpedia Spotlight, Extractiv, Lupedia, OpenCalais, Saplo, SemiTags, TextRazor, THD, Wikimeta, Yahoo! Content Analysis, Zemanta
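Below is a minimal sketch of how a client could call a NERD-style REST annotation service. The route, parameter names and response shape are assumptions for illustration only, not the documented NERD API; see http://nerd.eurecom.fr for the actual interface.

# Hypothetical client for a NERD-style REST annotation service (Python).
# The route, parameters and API-key handling below are placeholders.
import requests

API_BASE = "http://nerd.eurecom.fr/api"   # assumption: base path of the service
API_KEY = "YOUR_KEY"                      # assumption: the service issues API keys

def annotate(text, extractor="combined"):
    """Send text to the service and return its entity annotations as JSON."""
    resp = requests.post(
        f"{API_BASE}/annotate",            # placeholder route name
        data={"key": API_KEY, "extractor": extractor, "text": text},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    for entity in annotate("Barack Obama visited Amsterdam."):
        print(entity)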
NERD-ML
• The aim of NERD-ML is to combine the knowledge of the different extractors into a better named entity recogniser
• Uses NERD predictions, Stanford NER & extra features
• Classifiers: Naive Bayes, k-NN, SMO (see the sketch below)
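A simplified, illustrative sketch of this ensemble idea: the class predicted by each extractor becomes a feature for a token-level classifier. The feature names, encoding and scikit-learn classifiers here are assumptions for illustration, not the actual NERD-ML implementation (that code is in the repository linked at the end).

# Illustrative ensemble sketch (not the NERD-ML code itself): each extractor's
# predicted class for a token is turned into a feature, and a classifier
# learns the final label from those features.
from sklearn.feature_extraction import DictVectorizer
from sklearn.naive_bayes import BernoulliNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC  # an SVM; SMO is the classic training algorithm for it

# One dict per token; extractor names and the extra surface feature are
# assumptions for illustration.
train_tokens = [
    {"alchemy": "PER", "spotlight": "PER", "stanford": "PER", "capitalised": True},
    {"alchemy": "O",   "spotlight": "LOC", "stanford": "LOC", "capitalised": True},
    {"alchemy": "O",   "spotlight": "O",   "stanford": "O",   "capitalised": False},
]
train_labels = ["PER", "LOC", "O"]

vec = DictVectorizer(sparse=False)   # one-hot encodes the string-valued features
X = vec.fit_transform(train_tokens)

for clf in (BernoulliNB(), KNeighborsClassifier(n_neighbors=1), SVC(kernel="linear")):
    clf.fit(X, train_labels)
    print(type(clf).__name__, clf.predict(X))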
DATA
• CoNLL 2003 English NER with AIDA CoNLL-YAGO links to Wikipedia (5,648 NEs / 4,485 links in the test set)
• Making Sense of Microposts 2013 (MSM’13) for NER in the Twitter domain + 62 randomly selected tweets from Ritter et al.’s corpus with links to DBpedia resources (MSM: 1,538 NEs / Ritter: 177 links in the test set)
RESULTS NER NEWSWIRE
[Figure: per-class precision, recall and F1 (PER, LOC, ORG, MISC, overall) on the newswire test set for AlchemyAPI, DBpedia Spotlight, Extractiv, Lupedia, OpenCalais, Saplo, Textrazor, Yahoo, Wikimeta, Zemanta, Stanford NER, NERD-ML Runs 01–03 and the upper limit.]
RESULTS NER MSM
[Figure: per-class precision, recall and F1 (PER, LOC, ORG, MISC, overall) on the MSM’13 tweet test set for AlchemyAPI, DBpedia Spotlight, Extractiv, Lupedia, OpenCalais, Saplo, Textrazor, Wikimeta, Zemanta, Ritter et al., Stanford NER, NERD-ML Runs 01–03 and the upper limit.]
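The figures above report standard per-class precision, recall and F1. For reference, a minimal sketch of those definitions; the gold/predicted label lists in the example are purely illustrative.

# Per-class precision, recall and F1 from gold vs. predicted labels
# (standard definitions; the example labels are illustrative only).
def per_class_prf(gold, pred, classes=("PER", "LOC", "ORG", "MISC")):
    scores = {}
    for c in classes:
        tp = sum(g == c and p == c for g, p in zip(gold, pred))
        fp = sum(g != c and p == c for g, p in zip(gold, pred))
        fn = sum(g == c and p != c for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores[c] = (precision, recall, f1)
    return scores

print(per_class_prf(gold=["PER", "LOC", "ORG"], pred=["PER", "ORG", "ORG"]))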
RESULTS NED

             AlchemyAPI   DBpedia Spotlight   Extractiv   Lupedia   Textrazor   Yahoo   Zemanta
AIDA/YAGO    70.63        26.93               51.31       57.98     49.21       0.00    35.58
TWEETS       53.85        25.13               74.07       65.38     58.14       76.00   48.57
DISCUSSION
• Still some way to go, but for certain classes NER is getting close to really good results
• The MISC class is (and probably always will be?) hard
• Bigger datasets are needed (for tweets and NED)
• The NED task could benefit from standardisation
THANK YOU FOR LISTENING
• Try out our code at: https://github.com/giusepperizzo/nerdml
ACKNOWLEDGEMENTS
This research is funded through the LinkedTV and NewsReader projects, supported by the European Union’s 7th Framework Programme (grants GA 287911 and ICT-316404).
