SANAPHOR: Ontology-Based Coreference Resolution
Roman Prokofyev, Alberto Tonon, Michael Luggen, Loic Vouilloz,
Djellel Difallah and Philippe Cudré-Mauroux
eXascale Infolab
University of Fribourg, Switzerland
October 14th, ISWC’15
Bethlehem PA, USA
Motivations and Task Overview
Task: identify groups (clusters) of co-referring mentions.
Example:
“Xi Jinping was due to arrive in Washington for a dinner with
Barack Obama on Thursday night, in which he will aim to
reassure the US president about a rising China. The Chinese
president said he favors a “new model of major country
relationship" built on understanding, rather than suspicion.”
http://www.telegraph.co.uk/
Benefits:
• identify the specific type of an otherwise unknown entity
• extract more relationships between named entities
State of the Art in Coreference Resolution
The best approaches use a generic multi-step algorithm (see the toy sketch after this list):
1. Pre-processing (POS tagging, parsing, NER)
2. Identification of referring expressions (e.g., pronouns)
3. Anaphoricity determination (“it rains” vs “he took it”)
4. Generation of antecedent candidates
5. Searching/Clustering of candidates
Lee et al., Stanford's multi-pass sieve coreference resolution system at the CoNLL-2011 shared task
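To make these steps concrete, here is a minimal, self-contained toy version of the generic pipeline. Every component is a crude stand-in (regexes and a greedy heuristic), not the Stanford sieve system's actual code; all names are ours, and the naive mention detection even fragments multi-word names.

```python
import re

PRONOUNS = {"he", "she", "it", "they", "him", "her", "them"}

def preprocess(text):
    """Step 1 (toy): bare tokenization in place of POS tagging, parsing, NER."""
    return re.findall(r"[A-Za-z']+", text)

def find_mentions(tokens):
    """Step 2 (toy): referring expressions = capitalized tokens and pronouns."""
    return [(i, t) for i, t in enumerate(tokens)
            if t[0].isupper() or t.lower() in PRONOUNS]

def is_anaphoric(mention, tokens):
    """Step 3 (toy): treat 'it rains' as non-referential, 'he took it' as referential."""
    i, tok = mention
    if tok.lower() != "it":
        return True
    return not (i + 1 < len(tokens) and tokens[i + 1].lower() == "rains")

def resolve(tokens):
    """Steps 4-5 (toy): earlier mentions are the candidate antecedents;
    a pronoun greedily joins the cluster of the closest preceding mention."""
    mentions = [m for m in find_mentions(tokens) if is_anaphoric(m, tokens)]
    clusters = []
    for m in mentions:
        candidates = [c for c in clusters if c[-1][0] < m[0]]
        if m[1].lower() in PRONOUNS and candidates:
            max(candidates, key=lambda c: c[-1][0]).append(m)
        else:
            clusters.append([m])
    return clusters

tokens = preprocess("Barack Obama arrived on Thursday. He met Xi Jinping.")
print([[t for _, t in c] for c in resolve(tokens)])
# [['Barack'], ['Obama'], ['Thursday', 'He'], ['Xi'], ['Jinping']]
```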
Motivations for a rich semantic layer
http://www.telegraph.co.uk/
“Xi Jinping was due to arrive in Washington for a dinner
with Barack Obama on Thursday night, in which he will
aim to reassure the US president about a rising China.
The Chinese president said he favors a “new model of
major country relationship" built on understanding, rather
than suspicion.”
Purely syntactic approaches cannot differentiate between the names of the city and the province.
Semantic layer on top of an existing system
Documents → Stanford Coref (Deterministic Coreference Resolution) → clusters such as:
[US President] [Barack Obama]
[Australia] [Quintex Australia] [Quintex ltd.]
Generic overview of the approach
Key technique: split and merge clusters based on their semantics.
Pipeline: Clusters produced by Stanford Coref → Entity/Type Linking → Split clusters → Merge clusters → SANAPHOR output
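Read left to right, the pipeline amounts to a simple function composition. Below is a sketch with placeholder stubs for each stage; the function names are illustrative, not the API of the actual SANAPHOR code base, and the concrete split/merge logic is sketched on the following slides.

```python
def stanford_coref(documents):
    # baseline clusters, as produced by the deterministic sieve system
    return [["US President", "Barack Obama"],
            ["Australia", "Quintex Australia", "Quintex ltd."]]

def link_entities_and_types(clusters):
    return clusters  # attach YAGO/DBpedia entities and types to mentions

def split_clusters(clusters):
    return clusters  # split semantically unrelated clusters

def merge_clusters(clusters):
    return clusters  # merge clusters sharing entities/types

def sanaphor(documents):
    return merge_clusters(split_clusters(link_entities_and_types(
        stanford_coref(documents))))

print(sanaphor(["... input documents ..."]))
```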
Pre-Processing: Entity Linking
Entity Linking maps mentions to knowledge-base entities:
Before: US President | Barack Obama | Australia | Quintex Australia | Quintex ltd.
After: US President (not linked) | e1: Barack Obama | e2: Australia | e3: Quintex Australia | e3: Quintex ltd.
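A minimal sketch of this step, assuming a hard-coded lookup table in place of a real entity linker; mentions absent from the table stay unlinked.

```python
# Toy entity-linking step: a dictionary stands in for a real linker;
# unknown mentions get entity None.
KNOWLEDGE_BASE = {
    "Barack Obama": "e1",
    "Australia": "e2",
    "Quintex Australia": "e3",
    "Quintex ltd.": "e3",
}

def link_entities(mentions):
    return [(m, KNOWLEDGE_BASE.get(m)) for m in mentions]

print(link_entities(["US President", "Barack Obama", "Australia",
                     "Quintex Australia", "Quintex ltd."]))
# [('US President', None), ('Barack Obama', 'e1'), ('Australia', 'e2'),
#  ('Quintex Australia', 'e3'), ('Quintex ltd.', 'e3')]
```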
Pre-Processing: Semantic Typing
Semantic Typing: recognized entities are typed via the knowledge base; other mentions are typed by string similarity against a YAGO index.
Before: US President | e1: Barack Obama
After: t1: US President | e1: Barack Obama
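A toy sketch of the typing step: linked entities take their type from the knowledge base, while unlinked mentions fall back to string similarity. Python's difflib stands in for the real similarity search against YAGO, and the two tiny dictionaries are illustrative assumptions.

```python
import difflib

# Toy stand-ins: the real system queries the full YAGO type hierarchy.
TYPE_INDEX = {"united states president": "t1"}   # type label -> type id
ENTITY_TYPES = {"e1": "t1"}                      # e1 (Barack Obama) has type t1

def assign_type(mention, entity):
    if entity in ENTITY_TYPES:                   # recognized entity: KB type
        return ENTITY_TYPES[entity]
    match = difflib.get_close_matches(           # otherwise: string similarity
        mention.lower(), TYPE_INDEX, n=1, cutoff=0.5)
    return TYPE_INDEX[match[0]] if match else None

print(assign_type("Barack Obama", "e1"))  # t1, via the linked entity
print(assign_type("US President", None))  # t1, via string similarity
```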
Cluster splits
Entity- and type-based splitting of clusters:
(e2: Australia) (e3: Quintex Australia) (e3: Quintex ltd.)
⇒ {e2: Australia} + {e3: Quintex Australia, e3: Quintex ltd.}
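A minimal sketch of entity-based splitting, assuming each mention already carries its linked entity ID (or None); grouping by a type ID would work the same way.

```python
from collections import defaultdict

# Mentions of one Stanford cluster are regrouped by their linked entity ID;
# mentions with no entity are handled by the heuristics on the next slide.
def split_cluster(cluster):
    groups = defaultdict(list)
    for mention, entity in cluster:
        groups[entity].append(mention)
    return list(groups.values())

cluster = [("Australia", "e2"), ("Quintex Australia", "e3"), ("Quintex ltd.", "e3")]
print(split_cluster(cluster))
# [['Australia'], ['Quintex Australia', 'Quintex ltd.']]
```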
Cluster splits: heuristics
1. Non-identified mention assignment, based on words exclusive to each cluster:
Obama ⇒ Barack Obama
Jinping ⇒ Xi Jinping
2. Ignore mentions whose tokens form a complete subset of another identified mention:
✕ Aspen (“Aspen Airways”)
✕ Obama (“Barack Obama”)
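A hedged sketch of both heuristics; the whitespace tokenization and the function names are our simplifications.

```python
# Heuristic 1: assign a non-identified mention to the split that shares one of
# its exclusive words (a word appearing in no other split).
def assign(mention, splits):
    vocab = [set(w for m in s for w in m.split()) for s in splits]
    for i, words in enumerate(vocab):
        others = set().union(*(v for j, v in enumerate(vocab) if j != i))
        if set(mention.split()) & (words - others):
            return i
    return None

# Heuristic 2: skip a mention whose tokens are a strict subset of another
# identified mention ("Obama" vs. "Barack Obama", "Aspen" vs. "Aspen Airways").
def is_strict_subset(mention, identified):
    return any(set(mention.split()) < set(m.split()) for m in identified)

splits = [["Barack Obama"], ["Xi Jinping"]]
print(assign("Obama", splits), assign("Jinping", splits))  # 0 1
print(is_strict_subset("Obama", ["Barack Obama"]))         # True
```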
Cluster merges
Merge different clusters that contain the same types/entities:
(e1: Barack Obama) + (t1: US President) ⇒ {t1: US President, e1: Barack Obama}
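A minimal sketch of the merge step, assuming each cluster carries a set of entity/type identifiers: "Barack Obama" is linked to e1, whose type is t1, so it shares the key t1 with the "US President" cluster.

```python
# Clusters are (keys, mentions) pairs, where keys holds entity/type
# identifiers; clusters sharing a key are unioned.
def merge_clusters(clusters):
    merged = []
    for keys, mentions in clusters:
        for target in merged:
            if target[0] & keys:            # shared entity/type identifier
                target[0].update(keys)
                target[1].extend(mentions)
                break
        else:
            merged.append([set(keys), list(mentions)])
    return merged

clusters = [({"e1", "t1"}, ["Barack Obama"]),
            ({"t1"}, ["US President"])]
print(merge_clusters(clusters))
# [[{'e1', 't1'}, ['Barack Obama', 'US President']]]
```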
Evaluation
CoNLL-2012 Shared Task on Coreference Resolution:
• over 1M words
• three parts: development, training, and test
We design our methods on the development set and evaluate on the test set.
Metrics:
• Precision/Recall/F1 over clustering decisions (pairwise; see the Metrics slide at the end)
• noun-only clusters evaluated separately (no pronouns)
Cluster linking statistics
Total clusters (Stanford Coref): 5078

                     0 entities   1 entity   2 entities   3 entities
All Clusters               4175        849           49            5
Noun-Only Clusters         1208        502           33            2

                     To be merged   To be split
All Clusters                  270           118
Noun-Only Clusters             77            52
Cluster optimization results
• Our system improves on Stanford Coref in both the split and the merge task.
• The improvement is larger on the split task for noun-only clusters, since we do not re-assign pronouns.
Conclusions
• Leveraging semantic information improves coreference resolution on top of existing NLP systems.
• Performance improves further as entity and type linking improve.
• Complete evaluation code available at:
https://github.com/xi-lab/sanaphor
Roman Prokofyev (@rprokofyev)
eXascale Infolab (exascale.info), University of Fribourg, Switzerland
http://www.slideshare.net/eXascaleInfolab/
Metrics
• True positive (TP): two co-referring mentions assigned to the same cluster.
• True negative (TN): two non-co-referring mentions assigned to different clusters.
• False positive (FP): two non-co-referring mentions assigned to the same cluster.
• False negative (FN): two co-referring mentions assigned to different clusters.
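These pairwise definitions translate directly into code; a minimal sketch with our own helper names and toy data:

```python
from itertools import combinations

# Every pair of mentions inside a cluster is a positive link; P, R, and F1
# follow from comparing the system's links against the gold links.
def links(clusters):
    return {frozenset(p) for c in clusters for p in combinations(c, 2)}

def pairwise_prf1(gold_clusters, system_clusters):
    gold, system = links(gold_clusters), links(system_clusters)
    tp = len(gold & system)                      # pairs both agree on
    precision = tp / len(system) if system else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gold = [["Barack Obama", "Obama", "he"], ["Xi Jinping"]]
system = [["Barack Obama", "Obama"], ["he", "Xi Jinping"]]
print(pairwise_prf1(gold, system))  # (0.5, 0.3333..., 0.4)
```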
Editor's Notes
Welcome everyone, my name is Roman Prokofyev, and I’m a PhD student at the eXascale Infolab, at the University of Fribourg, Switzerland.
And today I will present our joint work on ontology-based coreference resolution.
I'll start with an overview of the task we are solving here.
So, currently, the standard way to resolve coreferences is by means of a multi-step approach which was developed over the years.
However, this NLP-based approach fails to determine the correct coreference clusters when the referring phrases are ambiguous.
In our work we introduce a semantic layer on top of an existing system that allows us to rearrange coreference clusters based on their semantics.
Stanford produces so-called clusters…
Thus, we have designed the following pipeline for our system.
Let’s see how each box operates in detail.
First step of our pipeline, …
Spotlight: a decent technology for entity linking.
Beyond EL, the next pre-processing step is semantic typing…
Now, after we completed the necessary pre-processing steps, we start re-arranging the coreference clusters.
The first step is to split semantically unrelated clusters, i.e., clusters that contain either different entities or types from different branches of the hierarchy.
We identified the following problems
The second step is cluster merging, that is, merging clusters that either contain the same entities, or exactly the same types, or, in case there is a mix of types and entities, …
OntoNotes 5: available from the LDC for free; 1M words from newswire, magazine articles, and web data.
First, we evaluate the quality of our entity linking step
We notice that the absolute increase in F1 score for the split task is greater for the Noun-Only case (+10.54% vs. +2.94%). This results from the fact that All Clusters also contain non-noun mentions, such as pronouns, which we do not directly tackle in this work but which nevertheless have to be assigned to one of the splits. Our approach in that context is to keep the non-noun mentions with the first noun mention in the cluster, which seems to be suboptimal in this case.
For the merge task, the difference between All and Noun-Only clusters is much smaller (+27.03% for All Clusters vs. +18.96% for the Noun-Only case). Here, non-noun words do not have any effect, since we merge whole clusters and thus include all of their mentions.