PhD Day: Entity Linking using Generic Linked Data Datasets
1. Copyright 2009 Digital Enterprise Research Institute. All rights reserved.
Digital Enterprise Research Institute www.deri.ie
Entity Linking Using Generic Linked Data Datasets
PhD Day – April/2013
Bianca Pereira
2. Agenda
Motivation
Problem
Related Work
Research Questions
Next Steps
Challenges
3. Motivation
Most of the content available on the web is
unstructured natural language text.
How can natural language text be structured so that it is
easier to process?
4. Motivation
There are three possible solutions to this problem:
Extract knowledge from text according to a given structure
(ontology population from text).
Extract knowledge from text without using a previous structure
(ontology learning from text).
Link mentions in text to entities in a structured knowledge
base (entity linking).
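The third option, entity linking, can be illustrated with a minimal sketch. It is not AELA or any of the tools named in these slides: it is a plain dictionary lookup over a toy knowledge base, and every name in it (`Entity`, `KNOWLEDGE_BASE`, `build_label_index`, `link_mentions`, the `example.org` URIs) is a hypothetical assumption made for illustration only.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    uri: str        # identifier of the entity in the knowledge base
    labels: tuple   # surface forms under which the entity may be mentioned


# Hypothetical toy knowledge base; the URIs are placeholders, not real data.
KNOWLEDGE_BASE = [
    Entity("http://example.org/entity/Galway", ("Galway", "Galway City")),
    Entity("http://example.org/entity/DERI",
           ("DERI", "Digital Enterprise Research Institute")),
]


def build_label_index(kb):
    """Map each lower-cased label to the URIs of the entities that carry it."""
    index = {}
    for entity in kb:
        for label in entity.labels:
            index.setdefault(label.lower(), []).append(entity.uri)
    return index


def link_mentions(text, index):
    """Return (offset, surface form, candidate URIs) for each mention found.

    Longer labels are tried first, and a span that overlaps an already
    linked span is skipped, so "Galway City" wins over "Galway".
    """
    links = []
    taken = []  # character spans that are already linked
    for label in sorted(index, key=len, reverse=True):
        pattern = r"\b" + re.escape(label) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            start, end = match.span()
            if any(start < t_end and t_start < end for t_start, t_end in taken):
                continue
            taken.append((start, end))
            links.append((start, match.group(), index[label]))
    return sorted(links, key=lambda link: link[0])


index = build_label_index(KNOWLEDGE_BASE)
links = link_mentions("DERI is based in Galway.", index)
```

A label shared by several entities yields several candidate URIs; choosing among them (disambiguation) is the hard part of entity linking and is deliberately left out of this sketch.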
5. Motivation
Entity Linking..
.. enables reusing knowledge already published on the web.
.. can be used as the first step for ontology learning and
population algorithms.
6. Motivation
Many datasets have been used for Entity Linking:
Relational datasets
Wikipedia
DBpedia, YAGO, MusicBrainz, Freebase, …
7. Motivation
Linked Data datasets are promising because..
.. many of them are public.
.. they are already structured.
.. they are interlinked.
.. they are available under diverse ownership.
.. they provide knowledge in diverse domains.
.. the LOD cloud is growing.
8. Motivation
There are already some Entity Linking solutions using
Linked Data datasets.
9. Problem
Current Entity Linking approaches work only with a small,
fixed number of Linked Data datasets.
AIDA (YAGO)
Alchemy API (CIA Factbook, CrunchBase, Freebase,
GeoNames, MusicBrainz, OpenCyc, UMBEL, US Census,
YAGO)
DBpedia Spotlight (DBpedia)
Open Calais (Calais)
10. Problem
Current tools work well with generic knowledge and
public datasets. But what do we do if we want to..
.. link an enterprise text with a private dataset?
.. identify domain specific entities?
11. Problem
AELA (Adaptive Entity Linking Approach) was developed
to solve this problem..
12. Problem
What AELA does not solve is..
.. the recognition of generalized entities/topics (such as genes
and diseases).
.. the recognition of individuals with the same name as their
classes (such as ambulance, coffee machine and airplane).
13. Related Work
Which topics are related to Entity Linking?
Entity Resolution, coreference resolution, merge-purge, data
deduplication, object identification, mention matching, tuple
matching, record linkage, entity disambiguation, anaphora
resolution, instance identification, database hardening, entity
identification, identity resolution, reference reconciliation, record
matching, name matching, identity uncertainty, duplicate
detection, entity matching, instance matching, entity
consolidation, entity reconciliation, object consolidation, topic
consolidation, reference disambiguation, instance fusion, data
fusion.
15. Research Questions
Which methods developed over the last five decades can be used to
improve AELA's results?
How can AELA adapt itself to a given domain?
What are the use cases in which AELA can be applied?
Is it better than previous approaches?
Can AELA be language independent?
16. Research Questions
The most important question..
What is an entity!?
Object?
Concept?
Topic?
17. Next Steps
Survey the methods used in related areas.
Evaluate these methods within the AELA architecture.
Develop a method to select a Linked Data dataset
given the domain of the text.
Apply AELA to the news domain.
Evaluate AELA using datasets in other languages.
19. Challenges
Large body of previous work
Big Data issues
Linked Data issues (standards and data quality)
Evaluation issues