
Entity Linking in Queries: Tasks and Evaluation

Dec. 5, 2015

  1. Entity Linking in Queries: Tasks and Evaluation ICTIR conference, September 2015 Faegheh Hasibi, Krisztian Balog, and Svein Erik Bratsberg
  2. 2 Entity linking Definition from Wikipedia: https://en.wikipedia.org/wiki/Natural_language_processing https://en.wikipedia.org/wiki/Statistical_classification
  3. 3 Why entity linking in queries? • ~70% of queries contain entities • To exploit the semantic representation of queries Improves: • Ad-hoc document retrieval • Entity retrieval • Query understanding • Understanding users’ tasks (Tasks track, TREC) J. Pound, P. Mika, and H. Zaragoza. Ad-hoc object retrieval in the web of data. In Proc. of WWW '10. (Figure: the query “tom cruise movies” mapped to a semantic representation with an entity and a <relation>.)
  4. 4 It is different … Different from conventional entity linking: • Limited or even no context • A mention may be linked to more than one entity. Example: the query “france world cup 98” can be interpreted as {France, FIFA World Cup} or {France national football team, FIFA World Cup}
  5. 5 In this talk How should entity linking be performed for queries? ➤ Task: “Semantic Mapping” or “Interpretation Finding”? ➤ Evaluation metrics ➤ Test collections ➤ Methods
  6. 6 In this talk How should entity linking be performed for queries? ➤ Task: “Semantic Mapping” or “Interpretation Finding”? ➤ Evaluation metrics ➤ Test collections ➤ Methods
  7. 7 Entity linking • Output is a set of entities • Each mention is linked to a single entity • Mentions do not overlap • Entities are explicitly mentioned. Examples: “obama mother” → {Barack Obama}; “the music man” → {The Music Man}; “new york pizza manhattan” → {New York City, Manhattan}
  8. 8 Semantic mapping • Output is a ranked list of entities • Mentions can overlap and be linked to multiple entities • Entities may not be explicitly mentioned • Entities do not need to form semantically compatible sets • False positives are not penalized. Examples: “obama mother” → Ann Dunham, Barack Obama, …; “the music man” → The Music Man, The Music Man (1962 film), The Music Man (2003 film), …; “new york pizza manhattan” → New York City, New York-style pizza, Manhattan, Manhattan pizza, …
  9. 9 Interpretation finding • Output is set(s) of semantically related entity sets • Each entity set is an interpretation of the query • Mentions do not overlap within a set. Examples: “obama mother” → { {Barack Obama} }; “the music man” → { {The Music Man}, {The Music Man (1962 film)}, {The Music Man (2003 film)} }; “new york pizza manhattan” → { {New York City, Manhattan}, {New York-style pizza, Manhattan} } D. Carmel, M.-W. Chang, E. Gabrilovich, B. J. P. Hsu, and K. Wang. ERD: Entity recognition and disambiguation challenge, 2014.
  10. 10 Tasks summary

  | | Entity Linking | Semantic Mapping | Interpretation Finding |
  |---|---|---|---|
  | Entities explicitly mentioned? | Yes | No | Yes |
  | Mentions can overlap? | No | Yes | No* |
  | Results format | Set | Ranked list | Sets of sets |
  | Evaluation criteria | Mentioned entities found | Relevant entities found | Interpretations found |
  | Evaluation metrics | Set-based | Rank-based | Set-based |

  * Not within the same interpretation
  ✓ Entity linking requirements are relaxed in semantic mapping.
  11. 11 In this talk How should entity linking be performed for queries? ➤ Task: “Semantic Mapping” or “Interpretation Finding”? ➤ Evaluation metrics ➤ Test collections ➤ Methods
  12. 12 Evaluation • Macro-averaged metrics (precision, recall, F-measure) • Matching condition: interpretation sets should exactly match the ground truth. (Figure: a system's query interpretations vs. the ground truth query interpretations; what if the system or the ground truth returns no interpretations?) D. Carmel, M.-W. Chang, E. Gabrilovich, B. J. P. Hsu, and K. Wang. ERD: Entity recognition and disambiguation challenge, 2014.
  13. 13 Evaluation (revisited) Solution: redefine precision and recall so they are also defined when the system output or the ground truth contains no interpretations; a system interpretation either exactly matches a ground truth interpretation or it does not. ☞ This evaluation is methodologically correct, but strict.
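To make the strict matching condition concrete, here is a minimal sketch of interpretation-based precision, recall, and F-measure for a single query. The representation (interpretations as frozensets of entity identifiers), the function name, and the exact handling of empty interpretation sets are illustrative assumptions in the spirit of the revisited definition, not the paper's formulation verbatim.

```python
# A minimal sketch of strict interpretation-based P/R/F for one query.
# An interpretation counts as correct only if it exactly matches a
# ground-truth interpretation.

def interpretation_prf(system, ground_truth):
    system = {frozenset(i) for i in system}
    ground_truth = {frozenset(i) for i in ground_truth}

    # Handle queries with no interpretations so precision and recall
    # stay defined (the case the original definition leaves undefined).
    if not system and not ground_truth:
        return 1.0, 1.0, 1.0
    correct = system & ground_truth
    p = len(correct) / len(system) if system else 0.0
    r = len(correct) / len(ground_truth) if ground_truth else 0.0
    f = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return p, r, f

# Macro-averaging: compute p, r, f per query and average over all queries.
```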
  14. 14 Lean evaluation • Partial matches are not rewarded in interpretation-based evaluation • E.g. {{New York City, Manhattan}} ≠ {{New York City}, {Manhattan}} Solution: combine them with entity-based metrics.
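A companion sketch of the entity-based (lean) metrics: interpretations are flattened to a plain set of entities, so the partial match in the example above is rewarded. The representation and function name are again assumptions for illustration.

```python
# A minimal sketch of entity-based (lean) P/R/F for one query:
# flatten interpretations to entities so partial matches get credit.

def entity_prf(system, ground_truth):
    sys_entities = {e for interp in system for e in interp}
    gt_entities = {e for interp in ground_truth for e in interp}
    if not sys_entities and not gt_entities:
        return 1.0, 1.0, 1.0
    correct = sys_entities & gt_entities
    p = len(correct) / len(sys_entities) if sys_entities else 0.0
    r = len(correct) / len(gt_entities) if gt_entities else 0.0
    f = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return p, r, f

# Example from the slide: the strict interpretation-based metric scores
# this pair 0, while the entity-based metric gives full credit.
sys_out = [{"New York City", "Manhattan"}]
gt = [{"New York City"}, {"Manhattan"}]
print(entity_prf(sys_out, gt))  # -> (1.0, 1.0, 1.0)
```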
  15. 15 In this talk How should entity linking be performed for queries? ➤ Task: “Semantic Mapping” or “Interpretation Finding”? ➤ Evaluation metrics ➤ Test collections ➤ Methods
  16. 16 Test collections - ERD The ERD challenge introduced two test collections: • ERD-dev (91 queries) • ERD-test (500 queries) – Unavailable for traditional offline evaluation Annotation rules: • The longest mention is used for entities • Only proper noun entities are annotated (e.g., companies, locations) • Overlapping mentions are not allowed within a single interpretation ☞ ERD-dev is not suitable for training purposes (it is small) http://web-ngram.research.microsoft.com/erd2014/Datasets.aspx
  17. 17 Test collections - YSQLE Yahoo Search Query Log to Entities (YSQLE) • 2398 queries, manually annotated with Wikipedia entities • Designed for training and testing entity linking systems for queries Issues: • Not possible to automatically form interpretation sets – E.g. Query “france world cup 1998” • Linked entities are not necessarily mentioned explicitly – E.g. Query “charlie sheen lohan” is annotated with Anger Management (TV series) • Annotations are not always complete – E.g. Query “louisville courier journal” is not annotated with Louisville, Kentucky Yahoo! Webscope L24 dataset - Yahoo! search query log to entities, v1.0. http://webscope.sandbox.yahoo.com/ ☞ YSQLE is meant for the semantic mapping task
  18. 18 Test collections - Y-ERD Y-ERD is manually re-annotated based on: • YSQLE annotations • ERD rules Additional rules: • Site search queries are not linked – E.g. Query “facebook obama slur” is only linked to Barack Obama • Clear policy about misspelled mentions – Two versions of Y-ERD are made available ☞ Y-ERD is made publicly available http://bit.ly/ictir2015-elq
  19. 19 In this talk How should entity linking be performed for queries? ➤ Task: “Semantic Mapping” or “Interpretation Finding”? ➤ Evaluation metrics ➤ Test collections ➤ Methods
  20. 20 Methods Pipeline architecture for the two tasks: query → Mention detection → set of mentions → Candidate entity ranking → ranked list of entities (semantic mapping) → Interpretation finding → interpretations (the goal of entity linking in queries)
  21. 21 Mention detection Entity name variants are gathered from: • KB: a manually curated knowledge base (DBpedia) • WEB: Freebase Annotations of the ClueWeb Corpora (FACC) E. Gabrilovich et al. FACC1: Freebase annotation of ClueWeb corpora, 2013. (Figure: recall of the mention detection step.)
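As a rough illustration of how such name variants can be used, the sketch below performs dictionary-based mention detection over query n-grams. The surface_forms dictionary (built offline from DBpedia name variants and FACC anchors) and all names here are hypothetical; this is not the paper's exact procedure.

```python
# A minimal sketch of dictionary-based mention detection, assuming a
# surface-form dictionary mapping name variants to candidate entities.

def detect_mentions(query, surface_forms, max_ngram=6):
    """Return (span, ngram, candidate entities) for every query n-gram
    that matches a known entity name variant."""
    terms = query.lower().split()
    mentions = []
    for length in range(min(max_ngram, len(terms)), 0, -1):
        for start in range(len(terms) - length + 1):
            ngram = " ".join(terms[start:start + length])
            if ngram in surface_forms:
                mentions.append(((start, start + length), ngram,
                                 surface_forms[ngram]))
    return mentions

# Hypothetical usage:
surface_forms = {"new york": {"New_York_City"},
                 "new york pizza": {"New_York-style_pizza"},
                 "manhattan": {"Manhattan"}}
print(detect_mentions("new york pizza manhattan", surface_forms))
```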
  22. 22 Methods Pipeline architecture for the two tasks: query → Mention detection → set of mentions → Candidate entity ranking → ranked list of entities (semantic mapping) → Interpretation finding → interpretations (the goal of entity linking in queries)
  23. 23 Candidate entity ranking Ranking using a Mixture of Language Models (MLM): entities are ranked by query likelihood. Query length normalization: scores should be comparable across queries, so P(q) should be taken into account. - P. Ogilvie and J. Callan. Combining document representations for known-item search. In Proc. of SIGIR '03, 2003. - W. Kraaij and M. Spitters. Language models for topic tracking. In Language Modeling for Information Retrieval, 2003.
  24. 24 Candidate entity ranking Combining MLM and commonness: commonness is the probability of entity e being the link target of mention m; it is combined with the query length normalized MLM score.
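A minimal sketch of such a score, assuming a multiplicative combination of commonness and a query length normalized MLM likelihood. The field weights, Jelinek-Mercer smoothing, and all data structures here are illustrative assumptions rather than the exact formulation used in the paper.

```python
import math

def commonness(entity, mention, link_counts):
    """P(e | m): fraction of times mention m links to entity e,
    estimated from anchor/link statistics such as FACC."""
    counts = link_counts.get(mention, {})
    total = sum(counts.values())
    return counts.get(entity, 0) / total if total else 0.0

def mlm_log_likelihood(query_terms, entity_fields, field_weights,
                       collection_prob, lam=0.1):
    """log P(q | theta_e) under a Mixture of Language Models with
    Jelinek-Mercer smoothing over the entity's fields."""
    score = 0.0
    for t in query_terms:
        p_t = 0.0
        for field, w in field_weights.items():
            tokens = entity_fields.get(field, [])
            p_field = tokens.count(t) / len(tokens) if tokens else 0.0
            p_t += w * ((1 - lam) * p_field + lam * collection_prob(field, t))
        score += math.log(max(p_t, 1e-12))
    return score

def candidate_score(query_terms, entity, mention, link_counts,
                    entity_fields, field_weights, collection_prob,
                    log_p_query):
    """Combine commonness with the length-normalized MLM score;
    subtracting log P(q) keeps scores comparable across queries."""
    mlm = mlm_log_likelihood(query_terms, entity_fields,
                             field_weights, collection_prob)
    return commonness(entity, mention, link_counts) * math.exp(mlm - log_p_query)
```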
  25. 25 Candidate entity ranking (Table: semantic mapping results on YSQLE.) TAGME is an entity linking system; it should not be evaluated using rank-based metrics, nor compared with semantic mapping results. P. Ferragina and U. Scaiella. TAGME: On-the-fly annotation of short text fragments. In Proc. of CIKM '10, 2010.
  26. 26 Methods Pipeline architecture for the two tasks: query → Mention detection → set of mentions → Candidate entity ranking → ranked list of entities (semantic mapping) → Interpretation finding → interpretations (the goal of entity linking in queries)
  27. 27 Interpretation finding Greedy Interpretation Finding (GIF). Example query: “jacksonville fl riverside”

  | Mention | Entity | Score |
  |---|---|---|
  | “jacksonville fl” | Jacksonville Florida | 0.9 |
  | “jacksonville” | Jacksonville Florida | 0.8 |
  | “riverside” | Riverside Park (Jacksonville) | 0.6 |
  | “jacksonville fl” | Naval Station Jacksonville | 0.2 |
  | “riverside” | Riverside (band) | 0.1 |

  Step 1: Pruning based on a score threshold (0.3)
  Step 2: Pruning containment mentions
  Step 3: Forming interpretation sets → { {Jacksonville Florida, Riverside Park (Jacksonville)} }
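The sketch below walks through these three steps on the example above, simplified to build a single greedy interpretation. The (start, end) span representation, the containment rule, and the greedy ordering are assumptions for illustration; queries with multiple valid interpretations would need more than this single pass.

```python
def greedy_interpretation_finding(candidates, threshold=0.3):
    """candidates: list of (span, entity, score) tuples, where span is a
    (start, end) term offset pair in the query."""
    # Step 1: prune candidates scoring below the threshold.
    kept = [c for c in candidates if c[2] >= threshold]

    # Step 2: prune containment mentions -- drop a span strictly contained
    # in another kept span with an equal or higher score.
    def contained(inner, outer):
        return (outer[0] <= inner[0] and inner[1] <= outer[1]
                and inner != outer)

    pruned = [(m, e, s) for (m, e, s) in kept
              if not any(contained(m, m2) and s2 >= s
                         for (m2, _, s2) in kept)]

    # Step 3: greedily add non-overlapping mentions, highest score first.
    def overlaps(a, b):
        return not (a[1] <= b[0] or b[1] <= a[0])

    interpretation, used_spans = set(), []
    for span, entity, _ in sorted(pruned, key=lambda c: -c[2]):
        if all(not overlaps(span, u) for u in used_spans):
            interpretation.add(entity)
            used_spans.append(span)
    return interpretation

# The example from the slide: "jacksonville fl riverside"
cands = [((0, 2), "Jacksonville Florida", 0.9),
         ((0, 1), "Jacksonville Florida", 0.8),
         ((2, 3), "Riverside Park (Jacksonville)", 0.6),
         ((0, 2), "Naval Station Jacksonville", 0.2),
         ((2, 3), "Riverside (band)", 0.1)]
print(greedy_interpretation_finding(cands))
# -> {'Jacksonville Florida', 'Riverside Park (Jacksonville)'}
```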
  28. 28 Interpretation finding (Figure: interpretation finding results on ERD-dev and Y-ERD.)
  29. 29 Take home messages • Entity linking in queries is different from entity linking in documents • Different flavors come with different evaluation criteria: – Interpretation finding (yes) – Semantic mapping (no) • The ultimate goal should be interpretation finding • SM and EL should not be compared to each other • Resources are available at http://bit.ly/ictir2015-elq
  30. 30 Thanks!

Editor's Notes

  1. Entity linking is a key enabling component for semantic search. It is so important that it has its own Wikipedia article, referring to the task and the research that has been done in this area. This text is the definition of EL from Wikipedia. In fact, this piece of text represents entity linking quite well, not because of the text itself, but because of the links it contains. Of course, this linking has been done by a human; the aim of entity linking is to do it automatically.
  2. Evaluate a system's understanding of the tasks users aim to achieve.
  3. This is the beginning of the story.
  4. Both approaches have their place, but there is an important distinction to be made as they are designed to accomplish different tasks.
  5. No difference is made between system A, which does not return anything, and system B, which returns meaningless or nonsense suggestions.
  6. Mentions can overlap. Entities do not need to form semantically compatible sets. False positives are not penalized in semantic mapping.
  7. According to this definition, if the query does not have any interpretations in the ground truth (Î = ∅), then precision is undefined.
  8. We correct for this behavior by redefining precision and recall for interpretation-based evaluation. It captures the extent to which the interpretations of the query are identified.
  9. As with any problem in information retrieval, the availability of public datasets is of key importance.
  10. Terms that are meant to restrict the search to a certain site (such as Facebook or IMDB) should not be linked.
  11. to rank items (here: entities) based on query likelihood:
  12. It ranks entities based on the highest scoring mention, i.e., ranking is dependent not only on the query but on the specific mention as well:
  13. Though we include results for TAGME, we note that this comparison, despite having been done in prior work (e.g., in [6]), is an unfair one. TAGME is not a baseline for the semantic mapping task.
  14. Recall what Y-ERD and ERD-dev are.
  15. The semantic mapping and entity linking tasks should not be compared to each other. Next: exploiting multiple query interpretations in entity ranking.
  16. Some questions:
  Q1: Why multiple interpretations? ERD-dev is a better representation of Web search queries; Y-ERD is limited to annotations from YSQLE, which has its own limitations. Multiple interpretations are crucial; this is not just our idea, but also the ERD organizers' idea.
  Q2: Why use both KB and WEB for mention detection? Using the KB is an offline process and it comes for free. Not all FACC entities are used for entity ranking (it is not efficient); in this case, the KB gives high-quality name variants and improves recall.
  Q3: Are these evaluation metrics biased towards a system that tends to return fewer entities? The naive baseline for these datasets is an F-measure of 0.5, and a system should of course be better than this baseline. Referring to the results table, the top-ranked results are low.
  Q4: Why interpretation finding? It is closer to how humans think, and it is part of understanding the ambiguity of queries.