Building a Graph of Names and Contextual Patterns for Named Entity Classification

César de Pablo Sánchez and Paloma Martínez
Computer Science Department
Universidad Carlos III de Madrid
31st European Conference on Information Retrieval
Toulouse, 6-9 April

Motivation and Objectives
NERC for multilingual applications: annotation is the bottleneck
Bootstrap name lists and indicative patterns for classification
Large document collection
NE classes of interest (PERSON, LOCATION, ORG, ...)
Name seeds for every class
(Towards) language independence (regexp, stopwords)

Assumptions
Dual bootstrapping: entities → patterns → entities
One sense per entity type (name)
Counter-training: learn several classes at once
Query-based exploration of the indexed collection
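
The assumptions above can be sketched as a small loop over a bipartite graph of names and contextual patterns. This is only an illustrative sketch, not the authors' implementation: the function name `dual_bootstrap`, the `rounds` and `top_k` parameters, the vote-counting score and the toy Spanish data are all assumptions, and an in-memory dictionary stands in for queries against the indexed collection.

```python
from collections import defaultdict


def dual_bootstrap(occurrences, seeds, rounds=2, top_k=2):
    """Hypothetical sketch: occurrences is a list of (pattern, name) pairs,
    seeds maps each class to a set of seed names."""
    # Bipartite graph: which names fill a pattern, which patterns surround a name.
    names_by_pattern = defaultdict(set)
    patterns_by_name = defaultdict(set)
    for pattern, name in occurrences:
        names_by_pattern[pattern].add(name)
        patterns_by_name[name].add(pattern)

    names = {cls: set(s) for cls, s in seeds.items()}
    patterns = {cls: set() for cls in seeds}

    def expand(known, index, accepted):
        # Items already assigned to any class are never reassigned
        # (one sense per name or pattern).
        taken = set().union(*accepted.values())
        proposals = {}
        for cls in known:
            scores = defaultdict(int)
            for other, items in known.items():
                # Counter-training: evidence from rival classes counts against.
                sign = 1 if other == cls else -1
                for item in items:
                    for cand in index[item]:
                        scores[cand] += sign
            ranked = sorted(
                (c for c in scores if scores[c] > 0 and c not in taken),
                key=lambda c: (-scores[c], c),
            )
            proposals[cls] = ranked[:top_k]
        for cls, cands in proposals.items():
            accepted[cls].update(cands)

    for _ in range(rounds):
        expand(names, patterns_by_name, patterns)   # entities -> patterns
        expand(patterns, names_by_pattern, names)   # patterns -> entities
    return names, patterns


# Toy data (invented for illustration); "X" marks the name slot.
occurrences = [
    ("el presidente X", "Felipe González"),
    ("el presidente X", "Bill Clinton"),
    ("la ciudad de X", "Madrid"),
    ("la ciudad de X", "Sevilla"),
    ("de X", "Madrid"),
    ("de X", "Felipe González"),
]
seeds = {"PERSON": {"Felipe González"}, "LOCATION": {"Madrid"}}
names, patterns = dual_bootstrap(occurrences, seeds)
```

In this toy run the ambiguous pattern "de X" receives one vote for and one against in each class, so it is never accepted, while "Bill Clinton" and "Sevilla" are added through the class-specific patterns.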

Pattern expansion

Entity expansion

Results: Direct Evaluation
Language: Spanish
Collection: EFE 1994-1995 newswire, 1 GB (CLEF)
NE Classes: PLO (PERSON, LOCATION, ORG), +MISC, +TEAM
Seeds: fewer than 40 per class, less than 1 h of work per person
Evaluation: sample acquired name lists

Results: Direct Evaluation
Model   PER    LOC    ORG    MISC/TEAM   Mean
PLO     94.8   52.7   67.1   –           71.5
PLOM    93.0   44.8   79.3   75.0        73.0
PLOT    94.8   87.4   81.1   40.9        76.0

Results: Name Classification
Indirect Evaluation
Evaluation: CoNLL 2002 Shared Task, Spanish EFE 2000

Setting             Model   P       R       F       Acc
baseline            CONLL   26.27   56.48   35.86   –
baseline            ORG     –       –       –       39.34
entities            PLO     77.33   54.34   63.83   64.04
entities            PLOM    78.85   51.53   62.36   66.24
entities            PLOT    78.72   41.58   54.42   62.18
entities+patterns   PLO     66.12   57.97   61.78   63.17
entities+patterns   PLOM    73.65   61.73   67.17   71.29
entities+patterns   PLOT    66.35   56.62   61.10   62.50
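
The F column can be read as the balanced F-measure (harmonic mean of P and R) used in the CoNLL 2002 shared task. A minimal check under that assumption; the function name `f_measure` and the `beta` parameter are illustrative and do not come from the slides:

```python
def f_measure(precision, recall, beta=1.0):
    """F-measure of precision and recall; beta=1 gives the balanced F1."""
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# P and R of the CONLL baseline row in the table above.
baseline_f = f_measure(26.27, 56.48)
print(round(baseline_f, 2))  # 35.86, matching the reported F
```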

Conclusions and Future Work
Efficient bootstrapping from large indexed collections with fewer seeds
Already useful for NERC, although performance is lower than with supervised machine learning
More classes improve precision, but not always recall
Future work:
Other languages and domains
Complex semantic models
Language independence and NE Recognition
