Date: October 7, 2016
Venue: Stavanger, Norway. Technical talk at UiS TN-IDE
Please cite, link to or credit this presentation when using it or part of it in your work.
1. Entities, Properties, and Knowledge Bases
Types and Entity Retrieval
Dimensions of Type Information
Type Information in Entity Retrieval
Darío Garigliotti
University of Stavanger
October 7th, 2016
Darío Garigliotti: Type Information in Entity Retrieval
Outline:
1 Entities, Properties, and Knowledge Bases
2 Types and Entity Retrieval
3 Dimensions of Type Information
Type taxonomies
Type representations
Retrieval models
Entity Retrieval - An example: Henrik Ibsen
An example: Henrik Ibsen (in Wikipedia)
Entities and properties
An entity is an individual or thing, uniquely identified
We describe its properties using triples
Attributes
Henrik Ibsen, birthdate, 20 March 1828
Types
Henrik Ibsen, is a, writer
Relations
Henrik Ibsen, child, Sigurd Ibsen
Henrik Ibsen, work, A Doll’s House
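As a rough illustration, the example statements above can be written down as (subject, predicate, object) triples and grouped by property kind. This is only a sketch: the `Triple` class, the `PROPERTY_KIND` mapping and the function name are hypothetical and not part of the talk.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One (subject, predicate, object) statement about an entity."""
    subject: str
    predicate: str
    obj: str


# The four example statements from the slide, as plain strings.
TRIPLES = [
    Triple("Henrik Ibsen", "birthdate", "20 March 1828"),
    Triple("Henrik Ibsen", "is a", "writer"),
    Triple("Henrik Ibsen", "child", "Sigurd Ibsen"),
    Triple("Henrik Ibsen", "work", "A Doll's House"),
]

# Hypothetical mapping from predicate to the kind of property it expresses.
PROPERTY_KIND = {
    "birthdate": "attribute",
    "is a": "type",
    "child": "relation",
    "work": "relation",
}


def properties_by_kind(triples):
    """Group triples into attributes, types and relations."""
    grouped = {}
    for triple in triples:
        kind = PROPERTY_KIND[triple.predicate]
        grouped.setdefault(kind, []).append(triple)
    return grouped


if __name__ == "__main__":
    for kind, items in properties_by_kind(TRIPLES).items():
        print(kind, [(t.predicate, t.obj) for t in items])
```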
RDF and knowledge bases
RDF (Resource Description Framework)
A family of specifications to describe Web resources
A way to represent structured knowledge
A knowledge base is a set of triples
For example, our entity Henrik Ibsen in DBpedia
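To make "a knowledge base is a set of triples" concrete, here is a minimal sketch of a toy knowledge base with wildcard lookup. Plain strings stand in for the URIs a real knowledge base such as DBpedia would use; `KB`, `match` and `objects` are hypothetical names chosen for illustration.

```python
# A toy knowledge base: a set of (subject, predicate, object) triples.
# Plain strings are an assumption; real knowledge bases identify things by URI.
KB = {
    ("Henrik Ibsen", "birthdate", "20 March 1828"),
    ("Henrik Ibsen", "is a", "writer"),
    ("Henrik Ibsen", "child", "Sigurd Ibsen"),
    ("Henrik Ibsen", "work", "A Doll's House"),
}


def match(kb, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every field that is not None."""
    return {
        (s, p, o)
        for (s, p, o) in kb
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    }


def objects(kb, subject, predicate):
    """Sorted object values for a given subject and predicate."""
    return sorted(o for (_, _, o) in match(kb, subject=subject, predicate=predicate))


if __name__ == "__main__":
    print(objects(KB, "Henrik Ibsen", "is a"))
    print(match(KB, predicate="work"))
```

A `None` field acts as a wildcard, so the same function answers "everything about this entity" and "every entity of this type".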
E.g. Henrik Ibsen in DBpedia
E.g. Henrik Ibsen in DBpedia (continued)
There are many knowledge bases
Domain-specific, e.g. GeoNames, DOI, BBCMusic
Cross-domain, e.g. DBpedia, YAGO, Freebase, Google Knowledge Graph
Knowledge bases as knowledge graphs
They are interconnected as Linked Open Data
Linked Open Data
Entity types
A typical property of an entity is the type(s)
Henrik Ibsen, is a, writer
Henrik Ibsen, is a, Norwegian writer
Henrik Ibsen, is a, person
E.g. Henrik Ibsen types in DBpedia
E.g. Henrik Ibsen types in Wikipedia
Types are organized in hierarchies (or taxonomies, or ontologies)
E.g. DBpedia Ontology
Types group similar information
They help to reduce the search space
E.g. Buying a book on Amazon
Type information in Entity Retrieval
Types are useful for entity retrieval
They naturally appear in many queries
countries where one can pay with the euro
art museums in Amsterdam
Queries could (somehow) be assigned target types
Query types
Dimensions of type information
We analyse 3 dimensions
Type taxonomies
Type representations
Retrieval models
Type taxonomies
Which type taxonomy to use?
DBpedia Ontology (7 levels, 600 types)
Freebase Types (2 levels, 2K types)
Wikipedia Categories (34 levels, 600K types)
YAGO Taxonomy (19 levels, 500K types)
These vary a lot in terms of hierarchical structure and in how entity-type assignments are recorded
Type representations
How to represent the hierarchical information?
[Figure: type hierarchy over types t0 to t12 with an entity e, highlighting the type(s) along the path to the top]
[Figure: type hierarchy over types t0 to t12 with an entity e, highlighting the top-level type(s)]
[Figure: type hierarchy over types t0 to t12 with an entity e, highlighting the most specific type(s)]
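The three representations can be sketched over a toy taxonomy stored as a child-to-parent map. Everything here is an assumption made for illustration: the type names, the artificial `ROOT` node, excluding the root from the path, and reading "top-level" as the types directly below the root. It is not taken from any of the taxonomies in the talk.

```python
ROOT = "root"

# Hypothetical toy taxonomy: each type points to its parent.
PARENT = {
    "agent": ROOT,
    "person": "agent",
    "writer": "person",
    "work": ROOT,
}


def path_to_top(type_name):
    """Types from type_name upwards, excluding the artificial root."""
    path = []
    while type_name != ROOT:
        path.append(type_name)
        type_name = PARENT[type_name]
    return path


def types_along_path_to_top(assigned):
    """Every type on the path from each assigned type to the top."""
    return {ancestor for t in assigned for ancestor in path_to_top(t)}


def top_level_types(assigned):
    """Only the types directly below the root."""
    return {path_to_top(t)[-1] for t in assigned}


def most_specific_types(assigned):
    """Assigned types that are not a proper ancestor of another assigned type."""
    ancestors = {a for t in assigned for a in path_to_top(t)[1:]}
    return set(assigned) - ancestors


if __name__ == "__main__":
    entity_types = {"person", "writer"}
    print(types_along_path_to_top(entity_types))
    print(top_level_types(entity_types))
    print(most_specific_types(entity_types))
```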
Retrieval models
How to add type information into entity retrieval?
Retrieval task defined in a generative probabilistic framework: P(q | e)
[Figure: query and entity, with the labels "Olympic games", "Rio de Janeiro", target types, entity types, term-based similarity and type-based similarity]
Both query and entity considered in the term space as well as in the type space
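(Strict) Filtering model
$P(q \mid e) = P(\theta^{T}_{q} \mid \theta^{T}_{e}) \cdot \chi[\mathrm{types}(q) \cap \mathrm{types}(e) \neq \emptyset]$
$\chi$ is a binary indicator: entities that share no type with the query are filtered out
[Figure: the type sets types(q) and types(e)]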
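A minimal sketch of strict filtering, reading the indicator as "keep the entity only if it shares at least one type with the query". The term-based score is taken as a given number; the function name and the example values are hypothetical.

```python
def strict_filter_score(term_score, query_types, entity_types):
    """Term-based score if the two type sets overlap, otherwise 0.

    term_score stands for the term-based component P(theta_q | theta_e),
    assumed to be computed elsewhere.
    """
    overlap = bool(set(query_types) & set(entity_types))
    return term_score if overlap else 0.0


if __name__ == "__main__":
    # Hypothetical target types for a query and two candidate entities.
    query_types = {"city"}
    print(strict_filter_score(0.4, query_types, {"city", "place"}))
    print(strict_filter_score(0.9, query_types, {"person"}))
```

Note that the second entity is dropped despite its higher term-based score, which is what makes this variant sensitive to missing or wrong type assignments.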
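(Soft) Filtering model
$P(q \mid e) = P(\theta^{T}_{q} \mid \theta^{T}_{e}) \cdot P(\theta^{\mathcal{T}}_{q} \mid \theta^{\mathcal{T}}_{e})$
Here $T$ marks the term-based and $\mathcal{T}$ the type-based representations of query and entity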
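Soft filtering multiplies the term-based score by a graded type-based score instead of a 0/1 indicator. The slide does not say how the type-based component is estimated, so the sketch below substitutes a deliberately simple stand-in: the share of the query's type probability mass that falls on the entity's types, floored at a small `eps`. This estimator, the names and the numbers are all assumptions.

```python
def type_score(query_type_probs, entity_types, eps=1e-3):
    """Assumed type-based component.

    query_type_probs maps each target type to a probability; the score is
    the mass covered by the entity's types, never lower than eps.
    """
    covered = sum(p for t, p in query_type_probs.items() if t in entity_types)
    return max(covered, eps)


def soft_filter_score(term_score, query_type_probs, entity_types):
    """Product of the term-based and the (assumed) type-based component."""
    return term_score * type_score(query_type_probs, entity_types)


if __name__ == "__main__":
    query_type_probs = {"city": 0.75, "country": 0.25}
    print(soft_filter_score(0.5, query_type_probs, {"city"}))
    print(soft_filter_score(0.5, query_type_probs, {"person"}))
```

Unlike strict filtering, an entity with no matching type is heavily down-weighted rather than removed.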
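Interpolation model
$P(q \mid e) = (1 - \lambda) \cdot P(\theta^{T}_{q} \mid \theta^{T}_{e}) + \lambda \cdot P(\theta^{\mathcal{T}}_{q} \mid \theta^{\mathcal{T}}_{e})$
with $T$ the term space and $\mathcal{T}$ the type space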
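The interpolation is a weighted sum of the two components. A small sketch, with both component scores taken as given numbers; the function name and the example values are hypothetical.

```python
def interpolation_score(term_score, type_score, lam):
    """(1 - lam) * term-based score + lam * type-based score."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * term_score + lam * type_score


if __name__ == "__main__":
    # An entity with a good term match but no type match at all.
    for lam in (0.0, 0.5, 1.0):
        print(lam, interpolation_score(0.5, 0.0, lam))
```

With `lam` below 1, an entity whose type-based score is zero still keeps part of its term-based score, whereas a multiplicative combination would reduce it to zero.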
What did we do?
We systematically identified and compared all combinations of those dimensions
4 type taxonomies: DBpedia Ontology (3.9), Freebase Types (2015-03-31), Wikipedia Categories (for DBpedia 3.9) and YAGO Taxonomy (3.0.2)
3 type representations: path-to-top, top-level, most specific
3 models: strict and soft filtering, interpolation
3 models: strict and soft filtering, interpolation
Environment: from idealized to realistic
entities fully typed in all the taxonomies
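Enumerating all combinations of the three dimensions amounts to a Cartesian product. The sketch below only lists the configurations named above; the variable names are hypothetical and no experiment is run.

```python
from itertools import product

TAXONOMIES = [
    "DBpedia Ontology",
    "Freebase Types",
    "Wikipedia Categories",
    "YAGO Taxonomy",
]
REPRESENTATIONS = ["path-to-top", "top-level", "most specific"]
MODELS = ["strict filtering", "soft filtering", "interpolation"]

# One tuple (taxonomy, representation, model) per configuration.
CONFIGURATIONS = list(product(TAXONOMIES, REPRESENTATIONS, MODELS))

if __name__ == "__main__":
    print(len(CONFIGURATIONS))
    for taxonomy, representation, model in CONFIGURATIONS[:3]:
        print(taxonomy, "|", representation, "|", model)
```

Four taxonomies, three representations and three models give 4 x 3 x 3 = 36 configurations.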
What did we do? Results
Lessons learned
Summary of insights: Type information proves most useful when larger, deeper type taxonomies provide very specific types.
How to represent hierarchical entity type information? Using the most specific types is the most effective way
What (kind of) type taxonomies to use? Wikipedia performs best in most of the cases
What combination model to choose? All models suffer from missing type information, but interpolation appears to be the most robust
Future work
Identify the queries suitable for type-aware entity retrieval
Move the environment from idealized to realistic
We used a query type oracle
Next: automatic detection of query target types
Thanks!
Questions?