We study the problem of finding sentences that explain the relationship between a named entity and an ad-hoc query, which we refer to as entity support sentences. This is an important sub-problem of entity ranking which, to the best of our knowledge, has not been addressed before. In this paper we give the first formalization of the problem, describe how it can be evaluated, and present a full evaluation dataset. We propose several methods to rank these sentences, namely retrieval-based, entity-ranking-based, and position-based. We found that traditional bag-of-words models perform relatively well when there is a match between an entity and a query in a given sentence, but they fail to find a support sentence for a substantial portion of entities. This can be improved by incorporating small windows of context sentences and ranking them appropriately.
7. Entity Ranking
• Given a topic, find relevant entities
• Evaluated in TREC and INEX campaigns
• Most well-known: people and expert search
• Many other applications: dates, events,
locations, companies, ...
8. Support Sentences for Entities
• We introduce the task of explaining the
relationship between a query and an entity
• Applications for entity retrieval, expert
finding, object ranking, etc.
• We don’t focus on entity ranking, just on
the explanations
• Support sentence score: H_{q,e}(s) ≈ p(R | q, e, s) (sketched below)
• What makes a good sentence depends on how
general the entity and the query are, and on
their relationship
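
The slide's relation only fixes the interface: H_{q,e}(s) is any scoring function approximating p(R | q, e, s). A minimal sketch of that ranking interface follows; all function and parameter names are illustrative, not taken from the paper.

```python
from typing import Callable, List, Tuple

# (query, entity, sentence) -> score approximating p(R | q, e, s)
Scorer = Callable[[str, str, str], float]

def rank_support_sentences(query: str,
                           entity: str,
                           sentences: List[str],
                           score: Scorer,
                           k: int = 10) -> List[Tuple[str, float]]:
    """Return the top-k candidate sentences for (query, entity), best first."""
    scored = [(s, score(query, entity, s)) for s in sentences]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```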
11. Examples
• Query: Picasso and Peace
• Entity: 1944
“In 1944 Picasso joined the French Communist
Party, attended an international peace
conference in Poland, and in 1950 received the
Stalin Peace Prize from the Soviet government.”
12. Examples
• Query: Picasso and Peace
• Entity: Northern Spain
“Although it was not conceived by the
author as a representation of the disasters of
war, but the Nazi bombing of Guernica (a town
in Northern Spain), it is now considered an
iconic representation of the disasters of war.”
13. Context
• Top-k sentence re-ranking (we don’t issue
any subsequent queries)
• Vocabulary mismatch problem (support
sentences that do not contain any query
term)
• The supported entity must appear in the sentence
• Introduce small windows of context
sentences (sketched below)
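
A minimal sketch of the context-window idea, assuming documents arrive pre-split into sentences. Only sentences mentioning the entity become candidates, and the ±w neighbouring sentences supply extra terms when the query words are missing from the candidate itself (the vocabulary mismatch case). The window size w and all names are illustrative.

```python
from typing import List, Tuple

def candidates_with_context(sentences: List[str],
                            entity: str,
                            w: int = 1) -> List[Tuple[str, List[str]]]:
    """Pair each entity-bearing sentence with its +/- w neighbouring sentences."""
    candidates = []
    for i, sentence in enumerate(sentences):
        if entity.lower() in sentence.lower():   # the supported entity must be present
            lo, hi = max(0, i - w), min(len(sentences), i + w + 1)
            context = sentences[lo:i] + sentences[i + 1:hi]
            candidates.append((sentence, context))
    return candidates
```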
15. Features for Ranking
• Top-k sentences
• Augmented
• Entity-candidate set
• Using sentence scores (sketch below):
• Sentence score for [query, sentence]: BM25
• Sentence score for [query, sentence + context]: BM25F
• Position-based features
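
A hedged sketch of the two sentence-score features: plain BM25 over [query, sentence] and a simplified BM25F-style score that weights the sentence and its context window as two fields. Parameter values, field weights, and the omission of per-field length normalisation are all assumptions here, not the paper's settings.

```python
import math
from collections import Counter
from typing import List

def _idf(term: str, corpus: List[List[str]]) -> float:
    """Robertson-style IDF over a background collection of tokenised sentences."""
    df = sum(1 for doc in corpus if term in doc)
    return math.log((len(corpus) - df + 0.5) / (df + 0.5) + 1.0)

def bm25(query: List[str], sent: List[str], corpus: List[List[str]],
         k1: float = 1.2, b: float = 0.75) -> float:
    """BM25 score of a single sentence against the query terms."""
    avg_len = sum(len(d) for d in corpus) / len(corpus)
    tf = Counter(sent)
    score = 0.0
    for t in query:
        if tf[t] == 0:
            continue
        norm = tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(sent) / avg_len))
        score += _idf(t, corpus) * norm
    return score

def bm25f_style(query: List[str], sent: List[str], context: List[str],
                corpus: List[List[str]],
                w_sent: float = 2.0, w_ctx: float = 1.0, k1: float = 1.2) -> float:
    """Simplified BM25F: term frequencies from the sentence and its context
    window are combined with per-field weights before saturation."""
    tf_sent, tf_ctx = Counter(sent), Counter(context)
    score = 0.0
    for t in query:
        tf = w_sent * tf_sent[t] + w_ctx * tf_ctx[t]
        if tf > 0:
            score += _idf(t, corpus) * tf * (k1 + 1) / (tf + k1)
    return score
```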
16. • Aggregation of entity scores (sketched below)
• sum, max, min, average, ...
• Options for the entity ranker score E(q,e)
• Frequency
• Rarity
• Combination
• KLD
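
How the aggregation step could look in code: the sum/max/min/average aggregators come from the slide, while the defaults and names are ours. The frequency, rarity, combination, and KLD variants of the entity ranker score E(q, e) are only named in the slides, so they are not spelled out here.

```python
from statistics import mean
from typing import Callable, Dict, List

# Aggregators named on the slide; defaults and function names are illustrative.
AGGREGATORS: Dict[str, Callable[[List[float]], float]] = {
    "sum": sum,
    "max": max,
    "min": min,
    "avg": mean,
}

def entity_score(sentence_scores: List[float], how: str = "sum") -> float:
    """Aggregate the scores of the top-k sentences in which the entity occurs
    into a single entity score E(q, e)."""
    if not sentence_scores:
        return 0.0
    return AGGREGATORS[how](sentence_scores)

# Example: entity_score([2.1, 0.7, 1.4], how="max") -> 2.1
```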
17. Evaluation Framework
• Semantically Annotated Snapshot of the
English Wikipedia (sentences +
annotations)
• 12 types from WSJ tag-set
• Judges produce a set of queries and remove
non-relevant entities
• Evaluate a set of sentences using a 4-grade
relevance scale (sketch below)
• 226 (entity, query) pairs with 45 unique queries
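
The transcript mentions the 4-grade judgment scale but not the exact metric; a graded metric such as NDCG is a natural fit and is sketched here purely for illustration.

```python
import math
from typing import List

def dcg(grades: List[int]) -> float:
    """Discounted cumulative gain with the usual 2^grade - 1 gain function."""
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(grades))

def ndcg(grades_in_rank_order: List[int], k: int = 10) -> float:
    """NDCG@k for a ranked list of support sentences judged on a 0-3 scale."""
    ranked = grades_in_rank_order[:k]
    ideal = sorted(grades_in_rank_order, reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0
```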
21. Conclusions
• We introduced the task of finding support
sentences for entities (aka “entity
snippets”)
• We engineered several features based on
scores of sentences and entities
• We developed an evaluation dataset
• http://barcelona.research.yahoo.net/dokuwiki
• Evaluated the task and the role of context
sentences