Using BabelNet in Bridging the Gap
Between Natural Language Queries and
Linked Data Concepts
Khadija Elbedweihy, Stuart N. Wrigley, Fabio Ciravegna and Ziqi Zhang
OAK Research Group,
Department of Computer Science,
University of Sheffield, UK
Outline
• Motivation and Problem Statement
• Natural Language Query Approach
• Approach Steps
• Evaluation
• Results and Discussion
Motivation – Semantic Search
• Wikipedia states that Semantic Search:
“seeks to improve search accuracy by understanding
searcher intent and the contextual meaning of terms as
they appear in the searchable dataspace, whether on the
Web or within a closed system, to generate more
relevant results”
• Semantic search evaluations reported user preference for
free natural language as a query approach (simple, fast &
flexible) as opposed to controlled or view-based inputs.
Problem Statement
• Complete freedom increases difficulty of matching query
terms with the underlying data and ontologies.
• Word sense disambiguation (WSD) is core to the solution.
Example question: “How tall is ...?” should map to the property height.
– tall is polysemous and should first be disambiguated:
– great in vertical dimension: tall people, tall buildings, etc.
– too improbable to admit of belief: a tall story, …
• Another difficulty: Named Entity (NE) recognition and
disambiguation.
Approach
• Free-NL semantic search approach, matching user query
terms with the underlying ontology using:

1) An extended-Lesk WSD approach.
2) A NE recogniser.
3) A set of advanced string similarity algorithms and
ontology-based heuristics to match disambiguated
query terms to ontology concepts and properties.
Extended-Lesk WSD approach
• WordNet is the predominant sense inventory; however, its fine
granularity makes high WSD performance hard to achieve.

• BabelNet is a very large multilingual ontology with wide coverage, obtained from both WordNet and Wikipedia.
• For disambiguation, bags are extended with senses’
glosses and different lexical and semantic relations.
• These include synonyms, hyponyms, hypernyms, attribute, see
also and similar to relations.
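The bag-overlap idea behind Lesk-style disambiguation can be sketched as follows. This is a minimal illustration, assuming each candidate sense already has a bag of words built from its gloss and related terms. The sense identifiers, the toy bags and the helper `disambiguate` are hypothetical and are not taken from BabelNet or from the authors' implementation.

```python
def tokenize(text):
    """Lower-case, split on whitespace and strip simple punctuation."""
    return {tok.strip(".,;:?!\"'()").lower() for tok in text.split()} - {""}


def disambiguate(context, sense_bags):
    """Return the sense whose bag shares the most words with the context.

    sense_bags maps a sense id to a set of words (gloss, synonyms,
    hyponyms, hypernyms, ...). Ties go to the first sense listed.
    """
    context_bag = tokenize(context)
    best_sense, best_overlap = None, -1
    for sense, bag in sense_bags.items():
        overlap = len(context_bag & bag)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


# Toy bags for two senses of "tall" (illustrative data only).
TALL_SENSES = {
    "tall#height": {"great", "vertical", "dimension", "people",
                    "building", "height", "high"},
    "tall#improbable": {"improbable", "belief", "story", "tale"},
}

print(disambiguate("How tall is the highest building in the city?", TALL_SENSES))
```

Extending a bag with more relations gives the overlap count more chances to separate the senses, which is the motivation for the extended bags.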
Extended-Lesk WSD approach
• Information added from a Wikipedia page (W), mapped
to a WordNet synset includes:
1. labels; page “Play (theatre)” → add play and theatre
2. set of pages redirecting to W; Playlet redirects to Play
3. set of pages linked from W; links in the page Play (theatre)
include literature, comedy, etc.

• Synonyms of synset S, associated with Wikipedia page W:
WordNet synonyms of S in addition to lemmas of the
Wikipedia information of W.
Extended-Lesk WSD approach
Feature                                     P      R      F1
Baseline                                    58.09  57.98  58.03
Synonyms                                    59.14  59.03  59.09
Syn + hypo                                  62.16  62.07  62.12
Syn + gloss examples (WN)                   61.97  61.86  61.92
Syn + gloss examples (Wiki)                 61.14  61.02  61.08
Syn + gloss examples (WN + Wiki)            60.21  60.10  60.16
Syn + hyper                                 60.36  60.26  60.31
Syn + semRel                                59.65  59.54  59.59
Syn + hypo + gloss(WN)                      64.92  64.81  64.86
Syn + hypo + gloss(WN) + hyper              65.28  65.18  65.23
Syn + hypo + gloss(WN) + hyper + semRel     65.45  65.33  65.39
Syn+hypo+gloss(WN)+hyper+semRel+relGlosses  69.76  69.66  69.71

• Sentences with fewer than seven words: F-measure of 81.34%
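F1 is the harmonic mean of precision and recall. The short check below recomputes it for three rows of the table; the helper name `f1` is illustrative. Because the published P and R values are already rounded, a recomputed F1 can differ from the reported one in the last digit for some other rows.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R) pairs copied from the feature table.
rows = {
    "Syn + gloss examples (Wiki)": (61.14, 61.02),
    "Syn + hyper": (60.36, 60.26),
    "Syn+hypo+gloss(WN)+hyper+semRel+relGlosses": (69.76, 69.66),
}

for feature, (p, r) in rows.items():
    print(f"{feature}: F1 = {f1(p, r):.2f}")
```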
Approach – Steps
1. Recognition and disambiguation of Named Entities.
2. Parsing and Disambiguation of the NL query.
3. Matching query terms with ontology concepts and
properties.
4. Generation of candidate triples.
5. Integration of triples and generation of SPARQL queries.
1. Recognition and disambiguation of Named Entities
• Named entities recognised using AlchemyAPI.
• AlchemyAPI had the best recognition performance in the
NERD evaluation of state-of-the-art NE recognizers.
• However, AlchemyAPI exhibits poor disambiguation performance.
• Each NE is disambiguated using our BabelNet-based WSD
approach.
1. Recognition and disambiguation of Named Entities
• Example: “In which country does the Nile start?”
• Matches of Nile in BabelNet include:
– http://dbpedia.org/resource/Nile (singer)
– http://dbpedia.org/resource/Nile (TV series)
– http://dbpedia.org/resource/Nile (band)
– http://dbpedia.org/resource/Nile

• Match selected (Nile: river): this sense shares more
overlapping terms with the query (geography, area, culture,
continent) than the other senses do.
2. Parsing and Disambiguation of the NL query
• Stanford Parser used to gather lemmas and POS tags.
• Proper nouns identified by the parser and not recognized
by AlchemyAPI are disambiguated and added to the
recognized entities.

• Example: “In which country does the Nile start?”
– The algorithm does not miss the entity Nile, although it
was not recognized by AlchemyAPI.
2. Parsing and Disambiguation of the NL query
• Example: “Which software has been developed by
organizations founded in California?”
Output:
Word           Lemma       POS  Position
software       software    NP   1
developed      develop     VBN  2
organizations  organize    NNS  3
founded        find        VBN  4
California     California  NP   5

• Equivalent output generated using keywords or phrases.
3. Matching Query Terms with Ontology Concepts & Properties
• Noun phrases, nouns and adjectives are matched with
concepts and properties.

• Verbs are matched only with properties.
• Candidate ontology matches ordered using: Jaro-Winkler
and Double Metaphone string similarity algorithms.
• Jaro-Winkler threshold to accept a match is set to 0.791,
shown in literature to be the best threshold value.
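A minimal sketch of the Jaro-Winkler acceptance test follows. The prefix scale of 0.1 and the maximum prefix length of 4 are the usual defaults and are an assumption here, since the slides only give the threshold. `is_match` is a hypothetical helper, and Double Metaphone is not shown.

```python
def jaro(s1, s2):
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1, max_prefix=4):
    """Jaro similarity boosted for a shared prefix (assumed default parameters)."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:max_prefix], s2[:max_prefix]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


JW_THRESHOLD = 0.791  # acceptance threshold quoted on the slide


def is_match(query_term, label, threshold=JW_THRESHOLD):
    """Accept a candidate label if its similarity reaches the threshold."""
    return jaro_winkler(query_term.lower(), label.lower()) >= threshold


print(is_match("created", "creator"))
```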
3. Matching Query Terms with Ontology Concepts & Properties
• Matching process uses the following in order:
1. query term (e.g., created)
2. lemma (e.g., create)
3. derivationally related forms (creator)

• If no matches, disambiguate query term and use
expansion terms in order:
1. synonyms
2. hyponyms
3. hypernyms
4. semantic relations (e.g., height as an attribute for tall)
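The ordered fallback above can be sketched as a simple cascade. The sketch assumes the term has already been disambiguated, so the lexicon entry holds the expansion terms of the selected sense. It also uses exact label lookup as a stand-in for the string-similarity matching. `match_term`, `LEXICON` and `LABELS` are hypothetical names with toy data.

```python
def match_term(term, ontology_labels, lexicon):
    """Try progressively looser forms of a query term against ontology labels.

    Returns (stage, matches) for the first stage that yields a match,
    or (None, []) if no stage matches.
    """
    entry = lexicon.get(term, {})
    stages = [
        ("term", [term]),
        ("lemma", entry.get("lemma", [])),
        ("derived", entry.get("derived", [])),
        # Expansion terms, tried only if the direct forms fail.
        ("synonyms", entry.get("synonyms", [])),
        ("hyponyms", entry.get("hyponyms", [])),
        ("hypernyms", entry.get("hypernyms", [])),
        ("semantic_relations", entry.get("semantic_relations", [])),
    ]
    for stage, candidates in stages:
        matches = [c for c in candidates if c in ontology_labels]
        if matches:
            return stage, matches
    return None, []


# Toy lexicon and ontology labels (illustrative data only).
LEXICON = {
    "created": {"lemma": ["create"], "derived": ["creator"]},
    "tall": {"lemma": ["tall"], "semantic_relations": ["height"]},
}
LABELS = {"creator", "height", "elevation"}

print(match_term("created", LABELS, LEXICON))
print(match_term("tall", LABELS, LEXICON))
```

Stopping at the first successful stage keeps noisier expansion terms out of the candidate set whenever a closer form already matches.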
4. Generation of Candidate Query Triples
• Structure of the ontology (taxonomy of classes and domain
and range of properties) used to link matched concepts and
properties and recognized entities to generate query triples.

Three-Terms Rule
• Every three consecutive terms are matched against a set of templates.

E.g., “Which television shows were created by Walt Disney?”
• Template (concept-property-instance) generates triples:
?television_show <dbo:creator> <res:Walt_Disney>
?television_show <dbp:creator> <res:Walt_Disney>
?television_show <dbo:creativeDirector> <res:Walt_Disney>
Three-Terms Rule
Examples of templates used in three-terms rule:
• concept-property-instance
– airports located in California
– actors born in Germany
• instance-property-instance
– Was Natalie Portman born in the United States?
• property-concept-instance
– birthdays of actors of television show Charmed
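A rough sketch of the three-terms rule: slide a window of three terms over the query and look up the sequence of term kinds in a template table. The `Term` type, the builder functions and the candidate lists are hypothetical. The sketch omits the domain and range checks against the ontology, and only two of the templates are shown.

```python
from typing import List, NamedTuple


class Term(NamedTuple):
    text: str              # surface form from the query
    kind: str              # "concept", "property" or "instance"
    candidates: List[str]  # matched ontology names for this term


def concept_property_instance(concept, prop, instance):
    """E.g. 'television shows created by Walt Disney'."""
    var = "?" + concept.text.replace(" ", "_")
    return [(var, p, i) for p in prop.candidates for i in instance.candidates]


def instance_property_instance(first, prop, second):
    """E.g. 'Was Natalie Portman born in the United States?'"""
    return [(a, p, b) for a in first.candidates
            for p in prop.candidates for b in second.candidates]


TEMPLATES = {
    ("concept", "property", "instance"): concept_property_instance,
    ("instance", "property", "instance"): instance_property_instance,
}


def three_terms_rule(terms):
    """Generate candidate triples for every window of three terms."""
    triples = []
    for start in range(len(terms) - 2):
        window = terms[start:start + 3]
        builder = TEMPLATES.get(tuple(t.kind for t in window))
        if builder is not None:
            triples.extend(builder(*window))
    return triples


query_terms = [
    Term("television show", "concept", ["dbo:TelevisionShow"]),
    Term("created", "property",
         ["dbo:creator", "dbp:creator", "dbo:creativeDirector"]),
    Term("Walt Disney", "instance", ["res:Walt_Disney"]),
]
for triple in three_terms_rule(query_terms):
    print(*triple)
```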
Two-Terms Rule
Two-Terms Rule, used when:
1) There are fewer than three derived terms
2) No match between query terms and three-term template
3) Matched template did not generate candidate triples
E.g., “In which films directed by Garry Marshall was Julia
Roberts starring?”
<Garry Marshall, Julia Roberts, starring> : matched to a
three-terms template but does not generate triples.
Two-Terms Rule
Question: “What is the area code of Berlin?”
• Template (property-instance) generates the triples:
<res:Berlin> <dbp:areaCode> ?area_code

<res:Berlin> <dbo:areaCode> ?area_code
Comparatives
Comparative scenarios:
1) Comparative used with a numeric datatype property:
e.g., “companies with more than 500,000 employees”
?company <dbp:numEmployees> ?employee
?company <dbp:numberOfEmployees> ?employee
?company a <dbo:Company>
FILTER (?employee > 500000)
Comparatives
2) Comparative is used with a concept:
e.g., “places with more than 2 caves”

• Generate the same triples for places with caves:
?place a <http://dbpedia.org/ontology/Place>.
?cave a <http://dbpedia.org/ontology/Cave>.
?place ?rel1 ?cave.
?cave ?rel1 ?place.

• Add the aggregate restriction:
GROUP BY ?place
HAVING (COUNT(?cave)>2).
Comparatives
3) Comparative is used with an object property:
e.g., “countries with more than 2 official languages”

• Similarly, generate the same triples for country and
official language and add the restriction:
GROUP BY ?country
HAVING (COUNT(?official_language) > 2)

4) Generic Comparatives
e.g., “Which mountains are higher than the Nanga Parbat?”
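The restrictions used in the first three scenarios can be sketched as small string builders. The phrase-to-operator table is an assumption for illustration, since the slides only show "more than". The function names are hypothetical.

```python
# Assumed mapping from comparative phrases to SPARQL operators.
COMPARATORS = {
    "more than": ">",
    "less than": "<",
    "at least": ">=",
    "at most": "<=",
}


def numeric_filter(variable, phrase, value):
    """Restriction for a comparative on a numeric datatype property."""
    return f"FILTER ({variable} {COMPARATORS[phrase]} {value})"


def count_restriction(group_variable, counted_variable, phrase, value):
    """Aggregate restriction for a comparative on a concept or object property."""
    operator = COMPARATORS[phrase]
    return (f"GROUP BY {group_variable}\n"
            f"HAVING (COUNT({counted_variable}) {operator} {value})")


print(numeric_filter("?employee", "more than", 500000))
print(count_restriction("?place", "?cave", "more than", 2))
print(count_restriction("?country", "?official_language", "more than", 2))
```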
Generic Comparatives
• Difficulty: identify the property referred to by the
comparative term.

1) Select best relation according to query context.
– Identify all numeric datatype properties associated
with the concept “mountain”; these include:
“latS, longD, prominence, firstAscent, elevation, longM, …”
2) Disambiguate synsets of all properties and use WSD
approach to identify the most related synset to the query.
– property elevation is correctly selected
5. Integration of Triples and Generation of SPARQL Queries
• Generated triples integrated to produce SPARQL query.
• Query term positions used to order the generated triples.
• Triples originating from the same query term are
executed in order until an answer is found.
• Duplicates are removed while merging the triples.
• SELECT and WHERE clauses added in addition to any
aggregate restrictions or solution modifiers.
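A minimal sketch of the merging step: order the triples by the position of the query term they came from, drop duplicates, then wrap them in SELECT and WHERE with any trailing modifiers. `build_select` is a hypothetical helper. Trying alternative triples for the same term one by one until an answer is found is not covered here.

```python
def build_select(target, positioned_triples, modifiers=()):
    """Assemble a SPARQL SELECT query from (position, triple) pairs.

    Triples are ordered by query-term position and exact duplicates are
    dropped. modifiers are lines appended after the WHERE block
    (e.g. GROUP BY / HAVING).
    """
    seen = set()
    patterns = []
    for _, triple in sorted(positioned_triples, key=lambda item: item[0]):
        if triple in seen:
            continue
        seen.add(triple)
        patterns.append("  {} {} {} .".format(*triple))
    lines = [f"SELECT {target} WHERE {{", *patterns, "}"]
    lines.extend(modifiers)
    return "\n".join(lines)


# Triples for "places with more than 2 caves", deliberately out of order
# and with one duplicate.
triples = [
    (2, ("?cave", "a", "<http://dbpedia.org/ontology/Cave>")),
    (1, ("?place", "a", "<http://dbpedia.org/ontology/Place>")),
    (3, ("?place", "?rel1", "?cave")),
    (2, ("?cave", "a", "<http://dbpedia.org/ontology/Cave>")),
]

print(build_select("?place", triples,
                   modifiers=("GROUP BY ?place", "HAVING (COUNT(?cave) > 2)")))
```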
Evaluation
• Test data from 2nd Open Challenge at QALD-2.
• Results produced by QALD-2 evaluation tool.

• Very promising results: 76% of answered questions were correct (SenseAware: 41 of 54).
Approach    Answered  Correct  Precision  Recall  F1
BELA        31        17       0.62       0.73    0.67
QAKiS       35        11       0.39       0.37    0.38
Alexandria  25        5        0.43       0.46    0.45
SenseAware  54        41       0.51       0.53    0.52
SemSeK      80        32       0.44       0.48    0.46
MHE         97        30       0.36       0.40    0.38
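The 76% figure appears to be the share of answered questions that were correct in the SenseAware row. The check below assumes the two integer columns are "answered" and "correct", in that order, and computes the same ratio for every system.

```python
# (answered, correct) per system, copied from the results table.
RESULTS = {
    "BELA": (31, 17),
    "QAKiS": (35, 11),
    "Alexandria": (25, 5),
    "SenseAware": (54, 41),
    "SemSeK": (80, 32),
    "MHE": (97, 30),
}


def correct_share(answered, correct):
    """Fraction of answered questions that were answered correctly."""
    return correct / answered


for system, (answered, correct) in RESULTS.items():
    print(f"{system}: {correct_share(answered, correct):.0%} of answered questions correct")
```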
Discussion
• Design choices affected by priority for precision or recall:
1. Query Relaxation
e.g., “Give me all actors starring in Last Action Hero”
– Restricting results to actors harms recall
– Not all entities in LD are typed, let alone correctly typed
– Query relaxation favors recall but affects precision
e.g., “How many films did Leonardo DiCaprio star in?”
– A relaxed query would also return TV series, such as
res:Parenthood (1990 TV series), rather than only films.

• Decision: favor precision; keep restriction when specified.
Discussion
2. Best or All Matches
e.g., “software by organizations founded in California”
– Properties matched: foundation and foundationPlace
– Using only the best match (foundation) does not generate
all results → affects recall.
– Using all properties (may not be relevant to the query)
would harm precision.
• Decision: use all matches, with a high value for the
similarity threshold, and perform checks against the ontology
structure to ensure that only relevant matches are used.
Discussion
3. Query Expansion
• Can be useful for recall, when the query term is not
sufficient to return all answers.
• Example: use both “website” and “homepage” if either is
used in a query and both have matches in the ontology.
• Quality of expansion terms influenced by WSD approach;
wrong sense identification will lead to noisy list of terms.
• Decision: perform query expansion only when no
matches found in the ontology for a term; or no results
generated using the identified matches.
Questions

Questions?
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptxmary850239
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfErwinPantujan2
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfphamnguyenenglishnb
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)lakshayb543
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Seán Kennedy
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 
FILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipinoFILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipinojohnmickonozaleda
 
Science 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxScience 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxMaryGraceBautista27
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomnelietumpap1
 

Recently uploaded (20)

Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
Raw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptxRaw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptx
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
FILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipinoFILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipino
 
Science 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxScience 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptx
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choom
 

NLP & DBpedia

  • 1. Using BabelNet in Bridging the Gap Between Natural Language Queries and Linked Data Concepts Khadija Elbedweihy, Stuart N. Wrigley, Fabio Ciravegna and Ziqi Zhang OAK Research Group, Department of Computer Science, University of Sheffield, UK
  • 2. Outline • Motivation and Problem Statement • Natural Language Query Approach • Approach Steps • Evaluation • Results and Discussion
  • 3. Motivation – Semantic Search • Wikipedia states that Semantic Search: “seeks to improve search accuracy by understanding searcher intent and the contextual meaning of terms as they appear in the searchable dataspace, whether on the Web or within a closed system, to generate more relevant results” • Semantic search evaluations reported user preference for free natural language as a query approach (simple, fast & flexible) as opposed to controlled or view-based inputs.
  • 4. Problem Statement • Complete freedom increases difficulty of matching query terms with the underlying data and ontologies. • Word sense disambiguation (WSD) is core to the solution. Question: “How tall is ..... ?”: property height – tall is polysemous, should be first disambiguated: – great in vertical dimension; tall people; tall buildings, etc. – too improbable to admit of belief; a tall story, … • Another difficulty: Named Entity (NE) recognition and disambiguation.
  • 5. Approach • Free-NL semantic search approach, matching user query terms with the underlying ontology using: 1) An extended-Lesk WSD approach. 2) A NE recogniser. 3) A set of advanced string similarity algorithms and ontology-based heuristics to match disambiguated query terms to ontology concepts and properties.
  • 6. Extended-Lesk WSD approach • WordNet is predominant; however, its granularity is a problem for achieving high performance in WSD. • BabelNet is a very large multilingual ontology with wide coverage obtained from both WordNet and Wikipedia. • For disambiguation, bags are extended with senses’ glosses and different lexical and semantic relations. • These include synonyms, hyponyms, hypernyms, attribute, “see also” and “similar to” relations.
  • 7. Extended-Lesk WSD approach • Information added from a Wikipedia page (W), mapped to a WordNet synset, includes: 1. labels; page “Play (theatre)” → add play and theatre 2. set of pages redirecting to W; Playlet redirects to Play 3. set of pages linked from W; links in the page Play (theatre) include literature, comedy, etc. • Synonyms of synset S, associated with Wikipedia page W: WordNet synonyms of S in addition to lemmas of the Wikipedia information of W.
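The bag-of-words overlap behind a Lesk-style disambiguator can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sense inventory is a hand-written toy (not real BabelNet data), and `tokenize`, `build_bag` and `disambiguate` are hypothetical helper names. Each sense's bag is extended with its gloss, synonyms and related terms, and the sense whose bag shares the most words with the query context is selected.

```python
from typing import Dict, List, Set

STOPWORDS = {"a", "an", "the", "of", "in", "on", "is", "was", "to", "by", "at", "and"}

def tokenize(text: str) -> Set[str]:
    """Lowercase, split on whitespace, strip outer punctuation, drop stopwords."""
    words = (w.strip(".,;:?!()\"'").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}

def build_bag(sense: Dict[str, List[str]]) -> Set[str]:
    """Extended bag: words from the gloss, the synonyms and the related terms."""
    bag: Set[str] = set()
    for field in ("gloss", "synonyms", "related"):
        for entry in sense.get(field, []):
            bag |= tokenize(entry)
    return bag

def disambiguate(context: str, senses: Dict[str, Dict[str, List[str]]]) -> str:
    """Return the sense whose extended bag overlaps most with the context."""
    ctx = tokenize(context)
    return max(senses, key=lambda name: len(ctx & build_bag(senses[name])))

# Toy sense inventory (hypothetical; written by hand for this example).
SENSES = {
    "play (theatre)": {
        "gloss": ["a dramatic work intended for performance by actors on a stage"],
        "synonyms": ["drama", "playlet"],
        "related": ["theatre", "literature", "comedy"],
    },
    "play (recreation)": {
        "gloss": ["activity by children that is guided by imagination rather than rules"],
        "synonyms": ["child's play"],
        "related": ["game", "toy", "fun"],
    },
}

best = disambiguate("Which comedy play was performed at the theatre?", SENSES)
print(best)
```

In this toy setting the related terms (standing in for Wikipedia links such as literature and comedy) are what tip the overlap towards the theatre sense.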
  • 8. Extended-Lesk WSD approach
      Feature                                       P      R      F1
      Baseline                                      58.09  57.98  58.03
      Synonyms                                      59.14  59.03  59.09
      Syn + hypo                                    62.16  62.07  62.12
      Syn + gloss examples (WN)                     61.97  61.86  61.92
      Syn + gloss examples (Wiki)                   61.14  61.02  61.08
      Syn + gloss examples (WN + Wiki)              60.21  60.10  60.16
      Syn + hyper                                   60.36  60.26  60.31
      Syn + semRel                                  59.65  59.54  59.59
      Syn + hypo + gloss(WN)                        64.92  64.81  64.86
      Syn + hypo + gloss(WN) + hyper                65.28  65.18  65.23
      Syn + hypo + gloss(WN) + hyper + semRel       65.45  65.33  65.39
      Syn+hypo+gloss(WN)+hyper+semRel+relGlosses    69.76  69.66  69.71
    • Sentences with fewer than seven words: F-measure of 81.34%
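As a quick sanity check on the table, F1 is the harmonic mean of precision and recall. The sketch below recomputes it for three of the reported rows; because the published P and R are already rounded, the recomputed value can differ from the reported one in the last digit.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# (feature set, P, R, reported F1) copied from the table above.
ROWS = [
    ("Baseline", 58.09, 57.98, 58.03),
    ("Syn + hypo + gloss(WN)", 64.92, 64.81, 64.86),
    ("Syn+hypo+gloss(WN)+hyper+semRel+relGlosses", 69.76, 69.66, 69.71),
]

for name, p, r, reported in ROWS:
    print(f"{name}: computed {f1(p, r):.2f}, reported {reported:.2f}")
```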
  • 9. Approach – Steps 1. Recognition and disambiguation of Named Entities. 2. Parsing and Disambiguation of the NL query. 3. Matching query terms with ontology concepts and properties. 4. Generation of candidate triples. 5. Integration of triples and generation of SPARQL queries.
  • 10. 1. Recognition and disambiguation of Named Entities • Named entities are recognised using AlchemyAPI. • AlchemyAPI had the best recognition performance in the NERD evaluation of state-of-the-art NE recognizers. • AlchemyAPI exhibits poor disambiguation performance. • Each NE is therefore disambiguated using our BabelNet-based WSD approach.
  • 11. 1. Recognition and disambiguation of Named Entities • Example: “In which country does the Nile start?” • Matches of Nile in BabelNet include:
      – http://dbpedia.org/resource/Nile (singer)
      – http://dbpedia.org/resource/Nile (TV series)
      – http://dbpedia.org/resource/Nile (band)
      – http://dbpedia.org/resource/Nile
    • Match selected (Nile: river): this sense shares more overlapping terms with the query (geography, area, culture, continent) than the other senses.
  • 12. 2. Parsing and Disambiguation of the NL query • Stanford Parser used to gather lemmas and POS tags. • Proper nouns identified by the parser but not recognized by AlchemyAPI are disambiguated and added to the recognized entities. • Example: “In which country does the Nile start?” – The algorithm does not miss the entity Nile, although it was not recognized by AlchemyAPI.
  • 13. 2. Parsing and Disambiguation of the NL query • Example: “Which software has been developed by organizations founded in California?” Output:
      Word           Lemma       POS  Position
      software       software    NP   1
      developed      develop     VBN  2
      organizations  organize    NNS  3
      founded        find        VBN  4
      California     California  NP   5
    • Equivalent output generated using keywords or phrases.
  • 14. 3. Matching Query Terms with Ontology Concepts & Properties • Noun phrases, nouns and adjectives are matched with concepts and properties. • Verbs are matched only with properties. • Candidate ontology matches are ordered using the Jaro-Winkler and Double Metaphone string similarity algorithms. • The Jaro-Winkler threshold to accept a match is set to 0.791, reported in the literature to be the best threshold value.
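The thresholded string matching can be illustrated with a plain-Python Jaro-Winkler implementation. This is a sketch under assumptions: the comparison is lowercased, Double Metaphone is left out, and `rank_matches` and the candidate list are hypothetical. Only the 0.791 threshold comes from the slide.

```python
from typing import List

def jaro(s1: str, s2: str) -> float:
    """Jaro similarity: matching characters within a window, minus transpositions."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3

def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted by the length of the common prefix (at most 4)."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == 4:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)

THRESHOLD = 0.791  # acceptance threshold quoted on the slide

def rank_matches(term: str, candidates: List[str]) -> List[str]:
    """Candidates scoring at or above the threshold, best first."""
    scored = [(jaro_winkler(term.lower(), c.lower()), c) for c in candidates]
    return [c for score, c in sorted(scored, reverse=True) if score >= THRESHOLD]

accepted = rank_matches("created", ["creator", "creativeDirector", "areaCode"])
print(accepted)
```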
  • 15. 3.Matching Query Terms with Ontology Concepts & Properties • Matching process uses the following in order: 1. query term (e.g., created) 2. lemma (e.g., create) 3. derivationally related forms (creator) • If no matches, disambiguate query term and use expansion terms in order: 1. synonyms 2. hyponyms 3. hypernyms 4. semantic relations (e.g., height as an attribute for tall)
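The ordered fallback described above can be sketched as a simple cascade. The function `match_term`, the toy `ONTOLOGY` dictionary and the choice to stop at the first form that yields any match are assumptions made for illustration; the slide only specifies the order in which forms and expansion terms are tried.

```python
from typing import Callable, Dict, List

def match_term(
    term: str,
    lemma: str,
    derived: List[str],
    expansions: Dict[str, List[str]],
    lookup: Callable[[str], List[str]],
) -> List[str]:
    """Try the query term, its lemma and derived forms, then expansion terms."""
    # Stage 1: surface form, lemma, derivationally related forms.
    for form in [term, lemma, *derived]:
        hits = lookup(form)
        if hits:
            return hits
    # Stage 2: expansion terms from the disambiguated sense, in the stated order.
    for relation in ("synonyms", "hyponyms", "hypernyms", "semantic_relations"):
        for candidate in expansions.get(relation, []):
            hits = lookup(candidate)
            if hits:
                return hits
    return []

# Toy ontology index (hypothetical labels and identifiers).
ONTOLOGY = {"creator": ["dbo:creator"], "height": ["dbo:height"]}

def lookup(label: str) -> List[str]:
    return ONTOLOGY.get(label.lower(), [])

print(match_term("created", "create", ["creator"], {}, lookup))
print(match_term("tall", "tall", [], {"semantic_relations": ["height"]}, lookup))
```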
  • 16. 4. Generation of Candidate Query Triples • The structure of the ontology (taxonomy of classes, and domain and range of properties) is used to link matched concepts, properties and recognized entities to generate query triples. Three-Terms Rule • Every three consecutive terms are matched against a set of templates. E.g., “Which television shows were created by Walt Disney?” • Template (concept-property-instance) generates the triples:
      ?television_show <dbo:creator> <res:Walt_Disney>
      ?television_show <dbp:creator> <res:Walt_Disney>
      ?television_show <dbo:creativeDirector> <res:Walt_Disney>
  • 17. Three-Terms Rule Examples of templates used in three-terms rule: • concept-property-instance – airports located in California – actors born in Germany • instance-property-instance – Was Natalie Portman born in the United States? • property-concept-instance – birthdays of actors of television show Charmed
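A sliding window over the matched terms is one way to picture the three-terms rule. The sketch below handles only the concept-property-instance template, whose triple shape is shown on the slides; the `Term` type and `three_terms` function are hypothetical, and the checks against property domains and ranges are omitted.

```python
from typing import List, NamedTuple

class Term(NamedTuple):
    text: str        # query term as it appears in the question
    kind: str        # "concept", "property" or "instance"
    uris: List[str]  # candidate ontology matches for the term

def three_terms(terms: List[Term]) -> List[str]:
    """Generate candidate triples for each concept-property-instance window."""
    triples = []
    for a, b, c in zip(terms, terms[1:], terms[2:]):
        if (a.kind, b.kind, c.kind) == ("concept", "property", "instance"):
            var = "?" + a.text.replace(" ", "_")
            for prop in b.uris:
                for inst in c.uris:
                    triples.append(f"{var} <{prop}> <{inst}>")
    return triples

terms = [
    Term("television show", "concept", ["dbo:TelevisionShow"]),
    Term("created", "property", ["dbo:creator", "dbp:creator", "dbo:creativeDirector"]),
    Term("Walt Disney", "instance", ["res:Walt_Disney"]),
]
triples = three_terms(terms)
for triple in triples:
    print(triple)
```

The other templates (instance-property-instance, property-concept-instance) would need their own triple shapes, which are not spelled out in the slides.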
  • 18. Two-Terms Rule Two-Terms Rule, used when: 1) There are fewer than three derived terms. 2) There is no match between the query terms and a three-terms template. 3) The matched template did not generate candidate triples. E.g., “In which films directed by Garry Marshall was Julia Roberts starring?” <Garry Marshall, Julia Roberts, starring>: matched to a three-terms template but does not generate triples.
  • 19. Two-Terms Rule Question: “What is the area code of Berlin?” • Template (property-instance) generates the triples:
      <res:Berlin> <dbp:areaCode> ?area_code
      <res:Berlin> <dbo:areaCode> ?area_code
  • 20. Comparatives Scenarios: 1) Comparative used with a numeric datatype property: e.g., “companies with more than 500,000 employees”
      ?company <dbp:numEmployees> ?employee
      ?company <dbp:numberOfEmployees> ?employee
      ?company a <dbo:Company>
      FILTER (?employee > 500000)
  • 21. Comparatives 2) Comparative is used with a concept: e.g., “places with more than 2 caves” • Generate the same triples as for places with caves:
      ?place a <http://dbpedia.org/ontology/Place>.
      ?cave a <http://dbpedia.org/ontology/Cave>.
      ?place ?rel1 ?cave.
      ?cave ?rel1 ?place.
    • Add the aggregate restriction: GROUP BY ?place HAVING (COUNT(?cave)>2).
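The first two comparative scenarios can be sketched as small string builders. This is an illustration only: `parse_comparative`, `numeric_filter` and `count_restriction` are hypothetical helpers, and the regular expression covers just the "more/less/fewer than N" phrasing used in the examples.

```python
import re
from typing import Optional, Tuple

OPERATORS = {"more": ">", "less": "<", "fewer": "<"}

def parse_comparative(phrase: str) -> Optional[Tuple[str, int]]:
    """Extract (operator, number) from phrases like 'more than 500,000'."""
    m = re.search(r"\b(more|less|fewer) than ([\d,]+)", phrase)
    if m is None:
        return None
    return OPERATORS[m.group(1)], int(m.group(2).replace(",", ""))

def numeric_filter(var: str, op: str, value: int) -> str:
    """Scenario 1: restriction on a numeric datatype property."""
    return f"FILTER ({var} {op} {value})"

def count_restriction(group_var: str, counted_var: str, op: str, value: int) -> str:
    """Scenario 2: aggregate restriction on the number of related resources."""
    return f"GROUP BY {group_var} HAVING (COUNT({counted_var}) {op} {value})"

op, value = parse_comparative("companies with more than 500,000 employees")
print(numeric_filter("?employee", op, value))

op, value = parse_comparative("places with more than 2 caves")
print(count_restriction("?place", "?cave", op, value))
```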
  • 22. Comparatives 3) Comparative is used with an object property e.g., “countries with more than 2 official languages” • Similarly, generate the same triples for country and official language and add the restriction: GROUP BY ?country HAVING (COUNT(?official_language) > 2) 4) Generic Comparatives e.g., “Which mountains are higher than the Nanga Parbat?”
  • 23. Generic Comparatives • Difficulty: identifying the property referred to by the comparative term. 1) Select the best relation according to the query context. – Identify all numeric datatype properties associated with the concept “mountain”, which include: “latS, longD, prominence, firstAscent, elevation, longM, …” 2) Disambiguate the synsets of all properties and use the WSD approach to identify the synset most related to the query. – The property elevation is correctly selected.
  • 24. 5. Integration of Triples and Generation of SPARQL Queries • Generated triples integrated to produce SPARQL query. • Query term positions used to order the generated triples. • Triples originating from the same query term are executed in order until an answer is found. • Duplicates are removed while merging the triples. • SELECT and WHERE clauses added in addition to any aggregate restrictions or solution modifiers.
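The integration step can be sketched as ordering, de-duplicating and wrapping the triples. Several assumptions are made here: `build_query` is a hypothetical helper, triples are written with prefixed names (not the angle-bracket shorthand used on the slides), the prefix IRIs are the usual DBpedia namespaces, and exactly one candidate triple per query term has already been chosen, so the "execute in order until an answer is found" loop is not shown.

```python
from typing import List, Tuple

PREFIXES = (
    "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
    "PREFIX dbp: <http://dbpedia.org/property/>\n"
    "PREFIX res: <http://dbpedia.org/resource/>\n"
)

def build_query(select_var: str, triples: List[Tuple[int, str]], modifiers: str = "") -> str:
    """Order triples by query-term position, drop duplicates, wrap in SELECT/WHERE."""
    ordered = sorted(triples, key=lambda item: item[0])  # stable sort by position
    seen = set()
    patterns = []
    for _, triple in ordered:
        if triple not in seen:
            seen.add(triple)
            patterns.append(triple)
    body = " .\n  ".join(patterns)
    query = f"{PREFIXES}SELECT {select_var} WHERE {{\n  {body} .\n}}"
    if modifiers:
        # Aggregate restrictions or solution modifiers follow the WHERE block.
        query += "\n" + modifiers
    return query

query = build_query(
    "?television_show",
    [
        (2, "?television_show dbo:creator res:Walt_Disney"),
        (1, "?television_show a dbo:TelevisionShow"),
        (2, "?television_show dbo:creator res:Walt_Disney"),  # duplicate, removed
    ],
)
print(query)
```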
  • 25. Evaluation • Test data from the 2nd Open Challenge at QALD-2. • Results produced by the QALD-2 evaluation tool. • Very promising results: 76% of the answered questions were answered correctly.
      Approach    Answered  Correct  Precision  Recall  F1
      BELA        31        17       0.62       0.73    0.67
      QAKiS       35        11       0.39       0.37    0.38
      Alexandria  25        5        0.43       0.46    0.45
      SenseAware  54        41       0.51       0.53    0.52
      SemSeK      80        32       0.44       0.48    0.46
      MHE         97        30       0.36       0.40    0.38
  • 26. Discussion • Design choices affected by priority for precision or recall: 1. Query Relaxation e.g., “Give me all actors starring in Last Action Hero” – Restricting results to actors harms recall – Not all entities in LD are typed, let alone correctly typed – Query relaxation favors recall but affects precision e.g. “How many films did Leonardo DiCaprio star in?” – Return TV series rather than only films such as res:Parenthood (1990 TV series). • Decision: favor precision; keep restriction when specified.
  • 27. Discussion 2. Best or All Matches e.g., “software by organizations founded in California” – Properties matched: foundation and foundationPlace – Using only the best match (foundation) does not generate all results → affects recall. – Using all properties (some of which may not be relevant to the query) would harm precision. • Decision: use all matches, with a high value for the similarity threshold; perform checks against the ontology structure to ensure that only relevant matches are used.
  • 28. Discussion 3. Query Expansion • Can be useful for recall, when the query term is not sufficient to return all answers. • Example: use both “website” and “homepage” if either of them is used in a query and both have matches in the ontology. • Quality of expansion terms is influenced by the WSD approach; wrong sense identification will lead to a noisy list of terms. • Decision: perform query expansion only when no matches are found in the ontology for a term, or when no results are generated using the identified matches.