A Knowledge Base
of Entity-Oriented Search Intents
Darío Garigliotti

February 2, 2018
A brief introduction to...
- Entities

- Entity types

- RDF tuples

- Knowledge bases

- Search intents
Recalling from the past
Entities
- An entity is a uniquely identified individual or thing

- For example: Henrik Ibsen, Stavanger
- Also: Pythagorean Theorem, UEFA Champions League
Entity types
- A typical property of an entity is its type(s)

- An entity type is a semantic class that groups multiple entities

(Henrik Ibsen, is a, writer)
(Henrik Ibsen, is a, Norwegian writer)
(Henrik Ibsen, is a, person)
Tuples
- We describe entity properties using triples 

- Attributes
(Henrik Ibsen, birthdate, 20 March 1828)
- Types
(Henrik Ibsen, is a, writer)
- Relations
(Henrik Ibsen, work, A Doll’s House)
- RDF (Resource Description Framework) 

- A way to represent structured knowledge
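
As a minimal sketch of the triple representation above, the three kinds of properties can be held as plain (subject, predicate, object) tuples. The predicate strings are the informal labels from the slide, not actual RDF vocabulary terms, and the `Triple` class and `objects_of` helper are hypothetical names introduced only for illustration.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """A (subject, predicate, object) statement about an entity."""
    subject: str
    predicate: str
    obj: str


# Triples taken from the slide; predicates are informal labels, not real RDF URIs.
TRIPLES = [
    Triple("Henrik Ibsen", "birthdate", "20 March 1828"),  # attribute
    Triple("Henrik Ibsen", "is a", "writer"),              # type
    Triple("Henrik Ibsen", "work", "A Doll’s House"),      # relation
]


def objects_of(triples: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in triples if t.subject == subject and t.predicate == predicate]


print(objects_of(TRIPLES, "Henrik Ibsen", "is a"))
```

A real RDF store would identify subjects and predicates by URI; plain strings are used here only to keep the sketch self-contained.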
Knowledge bases
- A knowledge base (KB) is a set of tuples

- There are many knowledge bases

- Domain-specific, e.g. GeoNames, DOI, BBCMusic
- Cross-domain, e.g. DBpedia, YAGO, Freebase,
Google Knowledge Graph
- Yes, there are many
Cloud of (linked) KBs
Search intents and refiners
- Intent: the underlying user need in a search
query

- For example, the intent of booking a hotel room
- Entity-oriented queries

- Refiner: a way to express an intent in an entity-oriented query

- For example, for booking a hotel room:
"booking", "book", "reservation", "rooms"
Towards an understanding
of search intents
- A large proportion of search queries are
entity-oriented

- Interest in understanding what those queries ask
for, and how they can be fulfilled
A KB of entity-oriented
search intents
1. Intents searched for a type of entities

paris map, sydney map => [city] map
- (intent ID, searchedForType, entity type, conf.)
2. Categories assigned to refiners

messi instagram => Website
scandic rooms => Service
henrik ibsen child => Property
- (intent ID, ofCategory, intent category, conf.)
3. Multiple refiners expressing an intent

"booking", "book", "make a reservation", "rooms"
- (intent ID, expressedBy, refiner, conf.)
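
The three quadruple patterns above could be modelled roughly as follows. This is only a sketch: the `Quadruple` class is a hypothetical name, the confidence value `0.9` is a placeholder rather than a figure from the KB, and the `[0, 1]` range check is an assumption based on the confidence intervals reported in the results.

```python
from dataclasses import dataclass

# The three predicates named on the slide.
PREDICATES = {"searchedForType", "ofCategory", "expressedBy"}


@dataclass(frozen=True)
class Quadruple:
    """One KB fact: (intent ID, predicate, object, confidence)."""
    intent_id: str
    predicate: str
    obj: str
    confidence: float

    def __post_init__(self):
        if self.predicate not in PREDICATES:
            raise ValueError(f"unknown predicate: {self.predicate}")
        # Assumption: confidence values lie in [0, 1].
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")


# Placeholder confidence; the slides only name it symbolically ("conf.").
q = Quadruple("Hotel_Booking", "expressedBy", "booking", 0.9)
print(q)
```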
Our pipeline approach
Entities:

clarion hotel

casa 400

...
Our pipeline approach
Refiners acquisition
- Queries per entity:
clarion hotel: clarion hotel airport, clarion hotel spa, clarion hotel booking
casa 400: casa 400 rooms, casa 400 address, casa 400 deals
...
- Queries sharing a refiner:
clarion hotel airport, casa 400 airport, scandic airport, ...
- Type-level refiners:
[hotel] airport
[hotel] spa
[hotel] booking
...
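
One way to picture the refiners acquisition step is the sketch below. It assumes that a refiner is simply whatever follows the entity name at the start of a query, and that refiners are aggregated per type by counting the entities they occur with; the slides do not state the actual extraction or aggregation method, and both function names are hypothetical.

```python
from collections import Counter


def extract_refiner(query, entity):
    """Return the part of the query after a leading entity mention, or None."""
    q, e = query.lower().strip(), entity.lower().strip()
    if q.startswith(e + " "):
        return q[len(e):].strip()
    return None


def type_level_refiners(queries_by_entity):
    """Count, for one entity type, how many entities each refiner occurs with."""
    counts = Counter()
    for entity, queries in queries_by_entity.items():
        refiners = {extract_refiner(q, entity) for q in queries}
        refiners.discard(None)  # drop queries that do not start with the entity
        counts.update(refiners)
    return counts


# Example queries from the slides, for entities of type [hotel].
hotel_queries = {
    "clarion hotel": ["clarion hotel airport", "clarion hotel spa", "clarion hotel booking"],
    "casa 400": ["casa 400 rooms", "casa 400 address", "casa 400 deals", "casa 400 airport"],
    "scandic": ["scandic airport"],
}
counts = type_level_refiners(hotel_queries)
print(counts.most_common(1))  # [('airport', 3)]
```

Here "airport" occurs with all three hotels, which corresponds to the type-level refiner [hotel] airport.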
Our pipeline approach
Refiners acquisition
[hotel] airport
[hotel] spa
[hotel] booking
...

Refiners categorization
[hotel] airport: Service
[hotel] address: Property
[hotel] expedia: Website
...

Intents discovery
taxi, arrive: Hotel_Arriving
booking, make a reservation: Hotel_Booking
address: Hotel_Address

KB construction
Intent profile:
Intent ID      Predicate        Object                Confidence
Hotel_Booking  searchedForType  [hotel]               c1
Hotel_Booking  ofCategory       Service               c2
Hotel_Booking  expressedBy      "booking"             c3
Hotel_Booking  expressedBy      "make a reservation"  c4
Hotel_Booking  expressedBy      "rooms"               c5
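
The intent profile for Hotel_Booking can be read as all quadruples sharing one intent ID. The sketch below groups quadruples that way; the function name `build_profiles` is hypothetical, and the confidences are kept as the symbolic placeholders c1 to c5 because the slide gives no numeric values.

```python
from collections import defaultdict

# Rows from the slide; confidences stay symbolic ("c1".."c5").
rows = [
    ("Hotel_Booking", "searchedForType", "[hotel]", "c1"),
    ("Hotel_Booking", "ofCategory", "Service", "c2"),
    ("Hotel_Booking", "expressedBy", "booking", "c3"),
    ("Hotel_Booking", "expressedBy", "make a reservation", "c4"),
    ("Hotel_Booking", "expressedBy", "rooms", "c5"),
]


def build_profiles(quadruples):
    """Group quadruples into profiles: intent ID -> predicate -> [(object, confidence)]."""
    profiles = defaultdict(lambda: defaultdict(list))
    for intent_id, predicate, obj, conf in quadruples:
        profiles[intent_id][predicate].append((obj, conf))
    return profiles


profiles = build_profiles(rows)
for predicate, values in profiles["Hotel_Booking"].items():
    print(predicate, values)
```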
Evaluation
- Component-level evaluation

- Cross-validation using the human annotations of intent
categories and refiner clusters, for a representative
sample of 50 types
- End-to-end evaluation

- Human judgments about KB facts, for a sample of
additional types, defined w.r.t. confidence intervals
Knowledge base
construction
- Application of the pipeline to extract all
quadruples from 581 unseen types

- 155K quadruples, 31K intent profiles

Excerpt of the KB, for intent ID
<aviation.airline-65-customer_service>
Results
[Figure: proportion of triples (0% to 100%) that are Correct, Incorrect due to OFCATEGORY, or Incorrect due to EXPRESSEDBY, per confidence interval according to the splitting percentiles: [0, 0.8652), [0.8652, 0.8837), [0.8837, 0.9043), [0.9043, 0.9319), [0.9319, 1]]
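
To illustrate how a KB fact falls into one of the five confidence intervals listed in the results, the sketch below looks up a confidence value against the reported interval boundaries. The function name `interval_of` is hypothetical; only the boundaries come from the slide.

```python
from bisect import bisect_right

# Interval boundaries as reported on the Results slide.
BOUNDS = [0.8652, 0.8837, 0.9043, 0.9319]
LABELS = [
    "[0, 0.8652)",
    "[0.8652, 0.8837)",
    "[0.8837, 0.9043)",
    "[0.9043, 0.9319)",
    "[0.9319, 1]",
]


def interval_of(confidence):
    """Return the label of the confidence interval containing the given value."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    # bisect_right places a value equal to a boundary in the interval that
    # starts at that boundary, matching the left-closed intervals.
    return LABELS[bisect_right(BOUNDS, confidence)]


print(interval_of(0.9))
```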
Application scenarios
- Leveraging knowledge with levels of confidence

- Identification of search intents in unseen queries

- Design and functionality of entity cards
Thank you!
