Using DBpedia for Spotting and Disambiguating Entities
1. Julien Plu, Giuseppe Rizzo, Raphaël Troncy
{firstname.lastname}@eurecom.fr,
@julienplu, @giusepperizzo, @rtroncy
2. Agenda
Entity Linking task
Why use DBpedia?
Workflow
How is the index created?
Experiments on tweets
Future work
09/02/2015 - 3rd DBpedia Community Meeting – Dublin, Ireland - 2
3. Entity Linking Task
The goal is to link entity mentions found in text to their corresponding entries in a knowledge base.
Example:
Last year I went to Paris to see the Eiffel Tower with some friends.
Paris: http://dbpedia.org/resource/Paris
Eiffel Tower: http://dbpedia.org/resource/Eiffel_Tower
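The example above can be written down as data. The following is a minimal sketch, not the authors' system: the `Annotation` type, the hand-written `GAZETTEER`, and the `annotate` function are all hypothetical names introduced here, and the matching is a plain substring search used only to show what the output of an entity linker looks like (a mention, its character offsets, and a DBpedia URI).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One linked mention: surface form, character offsets, and target URI."""
    mention: str
    start: int
    end: int
    uri: str


TEXT = "Last year I went to Paris to see the Eiffel Tower with some friends."

# Hypothetical hand-written gazetteer; a real system would query a knowledge base.
GAZETTEER = {
    "Paris": "http://dbpedia.org/resource/Paris",
    "Eiffel Tower": "http://dbpedia.org/resource/Eiffel_Tower",
}


def annotate(text, gazetteer):
    """Return one annotation per gazetteer entry found in the text (first hit only)."""
    annotations = []
    for surface, uri in gazetteer.items():
        start = text.find(surface)
        if start != -1:
            annotations.append(Annotation(surface, start, start + len(surface), uri))
    # Order annotations by their position in the text.
    return sorted(annotations, key=lambda a: a.start)


annotations = annotate(TEXT, GAZETTEER)
for a in annotations:
    print(a.mention, a.start, a.end, a.uri)
```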
4. Why use DBpedia?
No legacy problems compared with Freebase
Knowledge base is constantly evolving
Available in many languages, which are interlinked
Most of the resources have a type
All the resources have semantic relations with other resources
Possibility to get the popularity of a resource for each language
5. Workflow
Text
→ POS tagging / n-gram analysis to extract the entities
→ Lookup in the index to get candidates for each entity
→ Linking each entity by choosing the right one among the candidates
• Not domain-dependent
• The lookup and linking processes run on top of an index created with DBpedia
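The three workflow stages can be sketched as follows. This is an illustration under stated assumptions rather than the presented system: POS tagging is omitted and only the n-gram part of the spotting step is shown, `TOY_INDEX` is a hypothetical in-memory stand-in for the DBpedia index, and `choose` is a placeholder that simply returns the first candidate, since the slides do not specify how the right candidate is selected.

```python
import re

# Hypothetical toy index: lowercased surface form -> candidate URIs (illustrative only).
TOY_INDEX = {
    "paris": [
        "http://dbpedia.org/resource/Paris",
        "http://dbpedia.org/resource/Paris,_Texas",
    ],
    "eiffel tower": ["http://dbpedia.org/resource/Eiffel_Tower"],
}


def spot(text, index, max_n=3):
    """Stage 1 (n-grams only): return mentions found in the index, longest match first."""
    tokens = re.findall(r"\w+", text)
    used = set()       # token positions already covered by a longer mention
    mentions = []
    for n in range(max_n, 0, -1):
        for i in range(len(tokens) - n + 1):
            span = range(i, i + n)
            surface = " ".join(tokens[i:i + n])
            if surface.lower() in index and not used.intersection(span):
                used.update(span)
                mentions.append((i, surface))
    # Restore text order.
    return [surface for _, surface in sorted(mentions)]


def lookup(mention, index):
    """Stage 2: fetch the candidate URIs for a mention."""
    return index.get(mention.lower(), [])


def choose(mention, candidates):
    """Stage 3 placeholder (assumption): pick the first candidate, if any."""
    return candidates[0] if candidates else None


def link(text, index):
    """Run the three stages and map each mention to its chosen URI."""
    return {m: choose(m, lookup(m, index)) for m in spot(text, index)}


result = link("Last year I went to Paris to see the Eiffel Tower with some friends.", TOY_INDEX)
print(result)
```

Matching longer n-grams before shorter ones keeps "Eiffel Tower" as a single mention instead of two separate tokens; a real system would replace `choose` with an actual ranking step.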
6. How is the index created?
Three DBpedia datasets are used:
Titles
Redirects
Disambiguation links
Structure of the index:
First column is the label of the entity
Second column is the URI of the entity
Third column lists all the labels of the redirect pages linked to the entity
Fourth column is the label of the disambiguation page of the entity
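The four-column structure described above can be sketched by merging three small inputs. Everything here is an assumption made for illustration: the toy `titles`, `redirects`, and `disambiguations` values, the `build_index` and `to_tsv` helpers, the tab-separated serialization, and the `|` separator for redirect labels are not taken from the slides, and real DBpedia dumps would need to be parsed first.

```python
import csv
import io
from collections import defaultdict

PARIS = "http://dbpedia.org/resource/Paris"
PARIS_TX = "http://dbpedia.org/resource/Paris,_Texas"

# Toy stand-ins for the three datasets (illustrative values only).
titles = {PARIS: "Paris", PARIS_TX: "Paris, Texas"}                 # URI -> label
redirects = [("Paris, France", PARIS), ("City of Paris", PARIS)]    # (redirect label, target URI)
disambiguations = {"Paris (disambiguation)": [PARIS, PARIS_TX]}     # page label -> listed URIs


def build_index(titles, redirects, disambiguations):
    """Return rows of (label, URI, redirect labels, disambiguation page label)."""
    redirect_labels = defaultdict(list)
    for label, target in redirects:
        redirect_labels[target].append(label)

    # Keep the first disambiguation page seen for each entity.
    disambiguation_label = {}
    for page_label, uris in disambiguations.items():
        for uri in uris:
            disambiguation_label.setdefault(uri, page_label)

    return [
        (label, uri, sorted(redirect_labels[uri]), disambiguation_label.get(uri, ""))
        for uri, label in titles.items()
    ]


def to_tsv(rows, sep="|"):
    """Serialize the rows as tab-separated text, joining redirect labels with `sep`."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for label, uri, redirect_list, disambiguation in rows:
        writer.writerow([label, uri, sep.join(redirect_list), disambiguation])
    return buffer.getvalue()


rows = build_index(titles, redirects, disambiguations)
print(to_tsv(rows))
```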
8. Future work
Making deeper use of DBpedia:
Relations among the entities
Computing the popularity of an entity (i.e. PageRank per language)
Relations between different languages for the same entity
Using the types of each entity
Using a better algorithm to rank candidates after the lookup