On Type-Aware Entity Retrieval
Darío Garigliotti and Krisztian Balog
University of Stavanger
3rd ACM International Conference
on the Theory of Information Retrieval
Amsterdam, The Netherlands - October 2, 2017
We thank SIGIR for the Students Travel Grant
Type-Aware Entity Retrieval
Dimensions of Type Information
Results and Analysis
Outline:
1 Type-Aware Entity Retrieval
2 Dimensions of Type Information
3 Results and Analysis
4 Conclusions and Future Work
Entity Types
Target Types
Entity Types
A characteristic property of entities is that they are typed
Types are organized in hierarchies (or taxonomies)
[Figure: type hierarchy Agent > Person > Scientist, with the entity Enrico Fermi typed as Scientist]
Query Target Types
Target types: types of entities sought by the query
[Figure: the query "italian nobel prize winners" with target types Scientist, Artist, and Writer, shown in the hierarchy under Person and Agent]
Target Types
Target types occur in many queries
countries where one can pay with the euro
art museums in Amsterdam
italian nobel prize winners
Types help to narrow the search space
E.g. Buying a book
Type Taxonomies
Type Representations
Retrieval Models
Dimensions of Type Information
Type information has been shown to improve entity retrieval
INEX Entity Ranking track
We systematically identify and compare all combinations of
three dimensions of type information
Type taxonomies
Type representations
Retrieval models
Dimension: Type Taxonomies
Which type taxonomy to use?
DBpedia Ontology (7 levels, 600 types)
Freebase Types (2 levels, 2K types)
Wikipedia Categories (34 levels, 600K types)
YAGO Taxonomy (19 levels, 500K types)
These vary a lot in terms of hierarchical structure and in how
entity-type assignments are recorded
Dimension: Type Representations
How to represent the hierarchical information?
[Figure: taxonomy tree over types t0 to t12 with an entity e. Highlighted: type(s) along path to top]
[Figure: taxonomy tree over types t0 to t12 with an entity e. Highlighted: top-level type(s)]
[Figure: taxonomy tree over types t0 to t12 with an entity e. Highlighted: most specific type(s)]
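The three representations (types along the path to the top, top-level types, most specific types) can be illustrated with a minimal sketch. The toy taxonomy, the root name `Thing`, the function names, and the choice to exclude the root from every representation are assumptions made for illustration only, not details taken from the slides.

```python
# Hypothetical toy taxonomy: child type -> parent type. "Thing" is an assumed root.
ROOT = "Thing"
PARENT = {
    "Agent": "Thing",
    "Person": "Agent",
    "Scientist": "Person",
    "Artist": "Person",
    "Place": "Thing",
}


def ancestors(t):
    """Types strictly above t, nearest first, excluding the root."""
    out = []
    p = PARENT.get(t)
    while p is not None and p != ROOT:
        out.append(p)
        p = PARENT.get(p)
    return out


def along_path_to_top(assigned):
    """Assigned types plus all of their ancestors (root excluded)."""
    result = set()
    for t in assigned:
        result.add(t)
        result.update(ancestors(t))
    return result


def top_level(assigned):
    """Only the types directly below the root that lie on some path to the top."""
    return {t for t in along_path_to_top(assigned) if PARENT.get(t) == ROOT}


def most_specific(assigned):
    """Assigned types that are not an ancestor of another assigned type."""
    covered = set()
    for t in assigned:
        covered.update(ancestors(t))
    return {t for t in assigned if t not in covered}
```

For an entity assigned `{"Person", "Scientist"}`, this sketch keeps only `Scientist` as the most specific type and `Agent` as the top-level type.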
Dimension: Retrieval Models
How to add type information into entity retrieval?
Retrieval task defined in a generative probabilistic framework: P(q | e)
[Figure: a query (example: "Olympic games") with its target types is matched against an entity (example: Rio de Janeiro) with its entity types, through a term-based similarity and a type-based similarity]
Both query and entity considered in the term space as well as
in the type space
(Strict) Filtering model
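\[ P(q \mid e) = P(\theta_q^{T} \mid \theta_e^{T}) \cdot \chi[\mathrm{types}(q) \cap \mathrm{types}(e) \neq \emptyset] \]
Here \theta^{T} denotes a term-based model and \chi is the indicator function: an entity is kept only if it shares at least one type with the query.
[Figure: Venn diagram of types(q) and types(e)]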
(Soft) Filtering model
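\[ P(q \mid e) = P(\theta_q^{T} \mid \theta_e^{T}) \cdot P(\theta_q^{\mathcal{T}} \mid \theta_e^{\mathcal{T}}) \]
Here \theta^{T} denotes a term-based model and \theta^{\mathcal{T}} a type-based model.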
Interpolation model
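\[ P(q \mid e) = (1 - \lambda) \cdot P(\theta_q^{T} \mid \theta_e^{T}) + \lambda \cdot P(\theta_q^{\mathcal{T}} \mid \theta_e^{\mathcal{T}}) \]
Here \theta^{T} denotes a term-based model, \theta^{\mathcal{T}} a type-based model, and \lambda the weight of the type-based component.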
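The three ways of combining term-based and type-based evidence (strict filtering, soft filtering, interpolation) can be sketched as small scoring functions. This is only an illustration: the names `p_term`, `p_type`, and `lam` are hypothetical, and both similarities are assumed to be computed elsewhere and passed in as plain numbers.

```python
def strict_filter(p_term, query_types, entity_types):
    """Keep the term-based score only if query and entity share at least one type."""
    shares_type = bool(set(query_types) & set(entity_types))
    return p_term if shares_type else 0.0


def soft_filter(p_term, p_type):
    """Multiply the term-based score by the type-based score."""
    return p_term * p_type


def interpolate(p_term, p_type, lam):
    """Linear mixture; lam is the weight given to the type-based score."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * p_term + lam * p_type
```

One design consequence visible in the sketch: an entity with no type assignments gets a score of zero under strict filtering, while interpolation still lets its term-based score contribute with weight `1 - lam`.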
Experimental Setup
Research Questions
Results and Analysis
Experimental Setup
Test collection of DBpedia entities [1]
Baseline: Mixture of Language Models (title and content)
Idealized assumption of a target types oracle
Settings for type assignments
[1] Krisztian Balog and Robert Neumayer. 2013. A Test Collection for Entity Search in DBpedia. In Proc. of SIGIR. 737–740.
Experimental Setup: Target Types Oracle
An oracle provides us with the (distribution of) correct
target types for a given query
Construction: given a query, take the union of all types of all its relevant entities
The probability of each type is proportional to the number of relevant entities that have it
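The oracle construction can be sketched as follows. The function name, the input layout (a mapping from entity to its set of types), and the normalization by the total count so that the values sum to one are assumptions for illustration; the slide only states that the probability is proportional to the number of relevant entities having the type.

```python
from collections import Counter


def oracle_target_types(relevant_entities, entity_types):
    """Distribution over target types for one query.

    relevant_entities: iterable of entity ids judged relevant for the query.
    entity_types: dict mapping entity id -> iterable of its types.
    """
    counts = Counter()
    for e in relevant_entities:
        # set() so that each entity counts at most once per type
        counts.update(set(entity_types.get(e, ())))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {t: c / total for t, c in counts.items()}
```

The keys of the returned dictionary form the union of all types of the relevant entities, and each value grows with the number of relevant entities carrying that type.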
Experimental Setup: Type Assignments
Two settings to deal with missing type assignments
4TT: Only entities with types in all four type taxonomies
E.g. types for the entity Enrico Fermi
In DBpedia: Scientist
In Freebase: award.award_winner,
people.deceased_person, education.academic, ...
In Wikipedia: Nobel laureates in Physics,
University of Pisa alumni, ...
In YAGO: ItalianPhysicists,
NobelLaureatesInPhysics,
AmericanPeopleOfItalianDescent, ...
ALL: All available entities are allowed
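The difference between the two settings can be sketched as a selection over entities. The data layout (taxonomy name mapped to per-entity type sets), the function names, and the second entity in the example are hypothetical; only the Enrico Fermi types come from the slide.

```python
TAXONOMIES = ("DBpedia", "Freebase", "Wikipedia", "YAGO")


def entities_4tt(types_by_taxonomy):
    """Entities that have at least one type in every one of the four taxonomies."""
    keep = None
    for name in TAXONOMIES:
        typed = {e for e, ts in types_by_taxonomy.get(name, {}).items() if ts}
        keep = typed if keep is None else keep & typed
    return keep or set()


def entities_all(all_entities):
    """ALL setting: no entity is dropped, typed or not."""
    return set(all_entities)


# Toy assignments; "Hypothetical Entity" is made up and lacks Freebase types.
EXAMPLE = {
    "DBpedia": {"Enrico Fermi": {"Scientist"}, "Hypothetical Entity": {"Person"}},
    "Freebase": {"Enrico Fermi": {"award.award_winner"}},
    "Wikipedia": {
        "Enrico Fermi": {"Nobel laureates in Physics"},
        "Hypothetical Entity": {"Living people"},
    },
    "YAGO": {
        "Enrico Fermi": {"ItalianPhysicists"},
        "Hypothetical Entity": {"Person"},
    },
}
```

Under these toy assignments, `entities_4tt(EXAMPLE)` keeps only Enrico Fermi, while the ALL setting would keep both entities.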
Research Questions
RQ1 What is the impact of the particular choice of type
taxonomy on entity retrieval performance?
RQ2 How to represent hierarchical entity type information
for entity retrieval?
RQ3 How to combine term-based and type-based
information?
Results
RQ1 What is the impact of the particular choice of type
taxonomy on entity retrieval performance?
Wikipedia, in combination with the most specific type
representation, performs best (for both 4TT and ALL)
Highly significant improvements for all models in 4TT
RQ2 How to represent hierarchical entity type information for
entity retrieval?
Using the most specific types in the hierarchy provides the
best performance
No evidence that hierarchical relationships from ancestor
types would benefit retrieval effectiveness
RQ3 How to combine term-based and type-based information?
In the 4TT setting, strict filtering is the best retrieval model
Only the interpolation model can deal in a robust manner
with the loss of type assignments in the ALL setting
Summary of Findings
Using the most specific types is the most effective way to
represent hierarchical entity type information
Wikipedia performs best among the type taxonomies in
most cases
All models for combining term- and type-based information
suffer from missing type information, but interpolation
appears to be the most robust
An Instance of Query-level Analysis
Query: italian nobel prize winners
Baseline. MAP: 0.1607
DBpedia, most specific, soft filter. MAP: 0.1829
Target types: Artist, Scientist, Writer.
Wikipedia, most specific, interpolation (0.95). MAP: 0.8518
Target types: Italian Nobel laureates, Nobel laureates in Literature, ...
What is in a Target Type?
What portion of relevant entities can target types capture?
Top-K types   Type taxonomy   P        R        F1
K = 1         DBpedia         0.0027   0.5863   0.0046
K = 1         Freebase        0.0060   0.7254   0.0076
K = 1         Wikipedia       0.1147   0.4798   0.1287
K = 1         YAGO            0.0418   0.6303   0.0488
K = 3         DBpedia         0.0006   0.7199   0.0012
K = 3         Freebase        0.0004   0.7805   0.0008
K = 3         Wikipedia       0.0402   0.5847   0.0614
K = 3         YAGO            0.0036   0.7025   0.0062
Fine-grained types in the Wikipedia category graph capture
a subset of the relevant entities with the highest P and F1
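One plausible reading of the question "what portion of relevant entities can target types capture" is a set-based comparison for a single query: take all entities that carry any of the top-K target types and compare that set with the relevant entities. The sketch below follows that reading. The function names and data layout are hypothetical, and the slides do not state how the reported P, R, and F1 were actually computed or averaged.

```python
def entities_with_types(target_types, entity_types):
    """All entities that carry at least one of the given target types."""
    target_types = set(target_types)
    return {e for e, ts in entity_types.items() if target_types & set(ts)}


def set_prf(retrieved, relevant):
    """Set-based precision, recall, and F1 for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    p = hits / len(retrieved) if retrieved else 0.0
    r = hits / len(relevant) if relevant else 0.0
    f1 = 2 * p * r / (p + r) if (p + r) > 0 else 0.0
    return p, r, f1
```

Under this reading, a coarse type that covers very many entities yields high recall but very low precision, which is the pattern in the DBpedia and Freebase rows of the table.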
Conclusions and Future Work
In this work:
We identify and systematically compare three dimensions of
type information in type-aware entity retrieval
We observe that type information proves most useful when
larger, deeper type taxonomies provide very specific types.
In future work:
We plan to report further query-level analyses
We wish to re-assess the experiments using automatically
identified target types [2]
[2] Darío Garigliotti, Faegheh Hasibi, and Krisztian Balog. 2017. Target Type Identification for Entity-Bearing Queries. In Proc. of SIGIR. 845–848.
Appendices
Appendix: Retrieval Model
Interpolation model
For DBpedia and Freebase, a higher weight on type-based
information is increasingly harmful
For Wikipedia and YAGO, performance increases with a higher
contribution of type information when using the most specific types
Figure 1: Interpolation performance (MAP) for different type weights λt (4TT), for DBpedia, Freebase, Wikipedia, and YAGO. Panels: (a) path-to-top types, (b) top-level types, (c) most specific types.
Appendix: Revisited Target Types Oracle
The target types distribution of the default oracle includes
all types associated with known relevant entities
Alternatively, we assess the configurations using a filtered
oracle of target types that satisfy a threshold of coverage of
relevant entities
[Figure: MAP per configuration for the default and filtered target types oracles, with the strict filtering, soft filtering, and interpolation models, across DBpedia, Freebase, Wikipedia, and YAGO. Panels: (a) path-to-top types, (b) top-level types, (c) most specific types.]
The filtered oracle leads to considerable drops in performance
for configurations that use the most specific types
It is important to consider all possible target types