
Question Answering on
Interlinked Data
Saeedeh Shekarpour, Axel-Cyrille Ngonga Ngomo, Soeren Auer
AKSW Research Group, Leipzig University
December 5 2013, IBM Research Center
+ Motivation
Retrieving information from LOD

AKSW group - Question Answering on Interlinked Data (published in www2013)

+ Motivation
Text queries (either keyword or natural language) are:
•  Simple retrieval approach
•  Popular
•  Implicit and ambiguous semantics.

SPARQL queries require:
•  Knowledge about the ontology
•  Proficiency in formulating formal queries
•  Explicit and unambiguous semantics.
+ Comparison of Search Approaches
[Figure: search approaches arranged by data-semantic awareness (aware vs. unaware) and by query type (keyword-based vs. natural language query). Labels shown: Information Retrieval, Question Answering Systems, and "Our approach: SINA".]
+ Example
Which television shows were created by Walt Disney?
select * where
{ ?v0 a dbo:TelevisionShow.
  ?v0 dbo:creator dbr:Walt_Disney. }
+ Aim and Challenges

Aim: Question answering over a set of interlinked data sources.
•  Query segmentation.
•  Resource disambiguation.
•  Construction of a formal query (expressed in SPARQL).
+ Further Challenges over Interlinked Data
1.  Information for answering a certain question can be spread among different datasets employing heterogeneous schemas.
2.  Constructing a federated formal query across different datasets requires exploiting links between the different datasets on both the schema and instance levels.
+ SINA Architecture

+ Test bed datasets
*  One single dataset: DBpedia.
*  Three interlinked datasets from life-science:
   •  Drugbank: a comprehensive knowledge base containing information about drugs, drug target (i.e., protein) information, interactions and enzymes.
   •  Diseasome: contains information about diseases and genes associated with these diseases.
   •  Sider: contains information about drugs and their side effects.
+ Main characteristics of federated queries
1.  Queries requiring fused information, e.g. side effects of drugs used for Tuberculosis.
2.  Queries targeting combined information, e.g. side effects and enzymes of drugs used for Asthma.
3.  Queries requiring keyword expansion, e.g. side effects of Valdecoxib.

[Figure: query graph for the Asthma example across DrugBank, Sider and Diseasome. Elements shown: variables ?v0 to ?v3, type edges (a) to Drug, Disease, Side Effect and Enzymes, the resource Asthma, and the edges enzyme, side effect and sameAs.]
+ Challenge 1: Query Segmentation and Resource Disambiguation

•  Sample question: What are the side effects of drugs used for Tuberculosis?
•  Transformed to the 4-tuple (side # effect # drug # Tuberculosis)
•  Different segmentations are possible:
   1.  (side effect # drug # Tuberculosis)
   2.  (side effect drug # Tuberculosis)

Mapping of the segments to the resources in the underlying knowledge bases.
Each valid segment


Segment validation
•  Original tuple: (side # effect # drug # Tuberculosis).
•  Using a naive approach for finding all valid segments.

Valid segments    Samples of candidate resources
Side effect       1. sider:class:sideeffect  2. sider:property:side_effects
drug              1. drugbank:drugs  2. class:offer  3. sider:drugs  4. diseases:possibledrug
tuberculosis      1. diseases:1154  2. side_effects:C0041296
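
The naive enumeration of valid segments can be sketched as follows. This is a minimal illustration, not the SINA implementation: it assumes that a segment counts as valid when it matches the label of at least one resource, and the label set `TOY_LABELS` and the function names are hypothetical.

```python
# Hypothetical label index; in practice this would be a lookup against the knowledge bases.
TOY_LABELS = {"side effect", "drug", "tuberculosis"}


def is_valid(segment):
    """Assumed validity test: the segment equals some resource label (case-insensitive)."""
    return segment.lower() in TOY_LABELS


def valid_segments(keywords, validator=is_valid):
    """Enumerate every contiguous keyword run and keep those the validator accepts."""
    found = []
    for start in range(len(keywords)):
        for end in range(start + 1, len(keywords) + 1):
            segment = " ".join(keywords[start:end])
            if validator(segment):
                found.append(segment)
    return found


segments = valid_segments(["side", "effect", "drug", "Tuberculosis"])
```

For n keywords this checks n(n+1)/2 candidate segments, which is acceptable for short queries.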

+ Concurrent Segmentation and Disambiguation

Hidden Markov Model

•  A statistical model containing a set of states.
•  Moving from one state to another state generates a sequence of observations.
•  The probability of entering a state depends only on the previous state.
•  Output is the most likely sequence of states generating the sequence of observations.

State Space

•  A state represents a knowledge base resource.
•  The state space contains all resources in the knowledge base.
•  In practice, we prune the state space by excluding irrelevant states.
•  We add an unknown entity state comprising all resources that are no longer available in the pruned state space.
•  Extension of the state space with reasoning: the state space is extended by including resources inferred from lightweight owl:sameAs reasoning.
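
One possible reading of the owl:sameAs extension is sketched below. It assumes that owl:sameAs is treated as symmetric and transitive and that the links are available as plain pairs; the slides do not say how far the lightweight reasoning goes, so the closure, the function name and the sample identifiers are all assumptions.

```python
from collections import defaultdict, deque


def extend_with_same_as(states, same_as_links):
    """Add every resource reachable from the pruned states via owl:sameAs links."""
    neighbours = defaultdict(set)
    for left, right in same_as_links:
        # Assumption: owl:sameAs is followed in both directions.
        neighbours[left].add(right)
        neighbours[right].add(left)

    extended = set(states)
    queue = deque(states)
    while queue:
        current = queue.popleft()
        for other in neighbours[current]:
            if other not in extended:
                extended.add(other)
                queue.append(other)
    return extended


# Hypothetical identifiers, for illustration only.
links = [("sider:d1", "drugbank:d1"), ("drugbank:d1", "dbpedia:d1"), ("x:a", "x:b")]
extended_states = extend_with_same_as({"sider:d1"}, links)
```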


Bootstrapping the Model Parameters
Emission Probability
•  The set-similarity level measures the difference between the label and the segment in terms of the number of words using the Jaccard similarity.
•  The string-similarity level measures the string similarity of each word in the segment with the most similar word in the label using the Levenshtein distance.
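
The two similarity levels can be illustrated with a short sketch. It is only an approximation of the idea on the slide: the normalisation `1 - distance / max(len)` and the averaging over segment words are assumptions, and the slides do not state how the two levels are combined into one emission probability, so the sketch reports them separately.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def set_similarity(segment, label):
    """Jaccard similarity between the word sets of the segment and the label."""
    seg_words, label_words = set(segment.lower().split()), set(label.lower().split())
    union = seg_words | label_words
    return len(seg_words & label_words) / len(union) if union else 0.0


def string_similarity(segment, label):
    """Average, over segment words, of the best normalised word match in the label."""
    seg_words, label_words = segment.lower().split(), label.lower().split()
    if not seg_words or not label_words:
        return 0.0

    def word_sim(a, b):
        # Assumed normalisation of the Levenshtein distance into [0, 1].
        return 1.0 - levenshtein(a, b) / max(len(a), len(b))

    best = [max(word_sim(w, lw) for lw in label_words) for w in seg_words]
    return sum(best) / len(best)
```

For the segment "side effect" and the label "side effects", the word sets share one of three distinct words, while the string level stays high because "effect" and "effects" differ by a single character.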


Bootstrapping the Model Parameters
Transition Probability & Initial Probability
•  Computing the transition probability and initial probability based on semantic relatedness of two resources.
•  Semantic relatedness is based on two values: distance and connectivity degree.
•  We transform these two values to hub and authority values using the HITS algorithm.
•  Initial probability and transition probability are defined as a uniform distribution over the hub and authority values.
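
For readers unfamiliar with HITS, the following sketch shows the standard hub and authority iteration on a small directed graph. How SINA builds that graph from distance and connectivity degree is not given on the slide, so the edge list here is purely hypothetical and the code illustrates the generic algorithm only.

```python
import math


def hits(edges, iterations=50):
    """Iteratively compute hub and authority scores for a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority of a node: sum of hub scores of the nodes pointing to it.
        auth = {n: sum(hub[src] for src, dst in edges if dst == n) for n in nodes}
        norm = math.sqrt(sum(value * value for value in auth.values())) or 1.0
        auth = {n: value / norm for n, value in auth.items()}
        # Hub of a node: sum of authority scores of the nodes it points to.
        hub = {n: sum(auth[dst] for src, dst in edges if src == n) for n in nodes}
        norm = math.sqrt(sum(value * value for value in hub.values())) or 1.0
        hub = {n: value / norm for n, value in hub.items()}
    return hub, auth


# Hypothetical toy graph: two resources both point to a third one.
hub_scores, auth_scores = hits([("a", "c"), ("b", "c")])
```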

Evaluation of Bootstrapping

•  We compared the accuracy of different distribution functions, i.e., Normal, Zipfian and uniform distributions, for the transition probability.
•  We ran the distribution functions with two different inputs, i.e. distance and connectivity degree values as well as hub and authority values.
+ Viterbi Algorithm
Aim: Find the most likely path of states generating the sequence of input keywords.
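
A compact sketch of the Viterbi algorithm is given below for orientation. It is the textbook dynamic program, not SINA's code, and all probabilities in the toy model are invented for illustration (the emission tables are not normalised).

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely state sequence."""
    # best[t][s] = (probability of the best path ending in s at step t, predecessor)
    best = [{s: (start_p[s] * emit_p[s].get(observations[0], 0.0), None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                ((best[-1][p][0] * trans_p[p].get(s, 0.0) * emit_p[s].get(obs, 0.0), p)
                 for p in states),
                key=lambda pair: pair[0],
            )
        best.append(column)

    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for t in range(len(best) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    path.reverse()
    return best[-1][last][0], path


# Toy model with made-up probabilities.
states = ["dbo:TelevisionShow", "dbo:creator", "dbr:Walt_Disney"]
start_p = {"dbo:TelevisionShow": 0.6, "dbo:creator": 0.2, "dbr:Walt_Disney": 0.2}
trans_p = {
    "dbo:TelevisionShow": {"dbo:creator": 0.7, "dbr:Walt_Disney": 0.2, "dbo:TelevisionShow": 0.1},
    "dbo:creator": {"dbr:Walt_Disney": 0.8, "dbo:TelevisionShow": 0.1, "dbo:creator": 0.1},
    "dbr:Walt_Disney": {"dbo:TelevisionShow": 0.4, "dbo:creator": 0.4, "dbr:Walt_Disney": 0.2},
}
emit_p = {
    "dbo:TelevisionShow": {"television shows": 0.9},
    "dbo:creator": {"created": 0.8},
    "dbr:Walt_Disney": {"Walt Disney": 0.9},
}
probability, path = viterbi(["television shows", "created", "Walt Disney"],
                            states, start_p, trans_p, emit_p)
```

With these toy numbers the best path visits the three states in query order.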

+ Output of the HMM for the following query:
Which television shows were created by Walt Disney?

Probability    Path of states
0.0023         dbo:TelevisionShow, dbo:creator, dbr:Walt_Disney
0.0014         dbo:TelevisionShow, dbo:creator, dbr:Category:Walt_Disney
5.89E-4        dbr:TelevisionShow, dbo:creator, dbr:Walt_Disney
3.53E-4        dbr:TelevisionShow, dbo:creator, dbr:Category:Walt_Disney
3.76E-5        dbp:television, dbp:show, dbo:creator, dbr:Category:Walt_Disney
+ Query Construction
Query Construction Method

Input: a set of resources $R = \{r_1, r_2, \ldots, r_n\}$
Output: a query graph $QG = (V, E)$, which is a directed, connected multi-graph.

Forward Chaining:
1.  CT: Comprehensive type.
2.  CD: Comprehensive domain.
3.  CR: Comprehensive range.

Query Construction Method

Generating the Incomplete Query Graph (IQG)
Initializing vertices and primary edges.
•  A vertex is added to the IQG if r is (1) an instance or (2) a class.
•  Properties are added along with zero, one or two vertices.

Query Construction Method

Example: What are the side effects of drugs used for Tuberculosis?
•  diseasome:1154 (type instance)
•  diseasome:possibleDrug (type property)
•  sider:sideEffect (type property)

[Figure: incomplete query graph with two disjoint sub-graphs. Graph 1: 1154, possibleDrug, ?v0. Graph 2: ?v1, sideEffect, ?v2.]

Query Construction Method

Connecting Sub-graphs of an IQG:
1.  Minimum spanning tree: a minimum set of edges (i.e., properties) to span a set of disjoint graphs.
2.  Prim's algorithm: incrementally includes edges to connect disjoint sub-graphs.
•  Direct properties: ?v0 ?p ?v1.
•  Properties via owl:sameAs link.
(1) ?v0 owl:sameAs ?x. ?x ?p ?v1.
(2) ?v0 ?p ?x. ?x owl:sameAs ?v1.
(3) ?v0 owl:sameAs ?x. ?x ?p ?y. ?y owl:sameAs ?v1.
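
The four connection patterns above can be generated programmatically, for example when probing which properties link two vertices of disjoint sub-graphs. The sketch below only builds query strings; the function names and the shape of the probe query (`SELECT DISTINCT ?p`) are assumptions, not something the slides specify.

```python
OWL_PREFIX = "PREFIX owl: <http://www.w3.org/2002/07/owl#>"


def connection_patterns(v_from, v_to, prop="?p"):
    """Return the direct pattern followed by the three owl:sameAs variants."""
    return [
        f"{v_from} {prop} {v_to} .",
        f"{v_from} owl:sameAs ?x . ?x {prop} {v_to} .",
        f"{v_from} {prop} ?x . ?x owl:sameAs {v_to} .",
        f"{v_from} owl:sameAs ?x . ?x {prop} ?y . ?y owl:sameAs {v_to} .",
    ]


def probe_query(pattern, prop="?p"):
    """Wrap one pattern in a hypothetical query that lists candidate connecting properties."""
    return f"{OWL_PREFIX}\nSELECT DISTINCT {prop} WHERE {{ {pattern} }}"


patterns = connection_patterns("?v0", "?v1")
first_query = probe_query(patterns[0])
```

Each returned property would then be a candidate edge for the spanning-tree step.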

[Figure: Template 1 and Template 2, two alternative query graphs that connect 1154, possibleDrug and sideEffect through the variables ?v0, ?v1 and ?v2.]
Evaluation

Goal of the experiment: how well do
1.  resource disambiguation and
2.  query construction perform?
Measurement of the performance:
1.  Disambiguation: Mean Reciprocal Rank (MRR).
2.  Query construction: precision and recall.
Benchmark:
1.  Each entry pairs a natural-language query with the equivalent conjunctive SPARQL query.
2.  25 queries on the 3 interlinked datasets Drugbank, Sider and Diseasome.
3.  QALD1 and QALD3 benchmarks for DBpedia.
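
The measures named above follow their standard definitions, which can be sketched as below. The function names and the sample data are illustrative only and do not come from the evaluation itself.

```python
def mean_reciprocal_rank(ranked_lists, gold):
    """Average of 1/rank of the correct resource; 0 when it is absent from a ranking."""
    total = 0.0
    for ranked, correct in zip(ranked_lists, gold):
        for rank, candidate in enumerate(ranked, start=1):
            if candidate == correct:
                total += 1.0 / rank
                break
    return total / len(gold)


def precision_recall(returned, expected):
    """Precision and recall of a returned answer set against the expected answer set."""
    true_positives = len(returned & expected)
    precision = true_positives / len(returned) if returned else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Made-up example: the correct resource is ranked first once and second once.
mrr = mean_reciprocal_rank([["a", "b"], ["b", "a"]], ["a", "a"])
precision, recall = precision_recall({"x", "y"}, {"y", "z", "w"})
```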

Evaluation using life-science datasets

Without reasoning: precision = 0.91, recall = 0.88
With reasoning: precision = 0.95, recall = 0.90
+ Evaluation using DBpedia
•  QALD3 Benchmark:
   -  contains 100 questions.
   -  32 original questions can be answered correctly.
•  QALD1 Benchmark:
   -  contains 50 questions.
   -  7 complex questions.
   -  13 questions requiring information beyond DBpedia, i.e., from YAGO and FOAF.
   -  14 questions were slightly modified to remove expansion and cleaning problems.
   -  MRR of disambiguation = 96%
   -  Query construction accuracy = 83%
Runtime

Parallelization over three components:
1.  Segment validation
2.  Resource retrieval
3.  Query construction
+ Related work


Thank you

Saeedeh Shekarpour
shekarpour@informatik-leipzig.de
sa.shekarpour@gmail.com
How to Split Bills in the Odoo 17 POS Module
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
 
PART A. Introduction to Costumer Service
PART A. Introduction to Costumer ServicePART A. Introduction to Costumer Service
PART A. Introduction to Costumer Service
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
 
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
 
Template Jadual Bertugas Kelas (Boleh Edit)
Template Jadual Bertugas Kelas (Boleh Edit)Template Jadual Bertugas Kelas (Boleh Edit)
Template Jadual Bertugas Kelas (Boleh Edit)
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
 
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptxStudents, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
 
Sectors of the Indian Economy - Class 10 Study Notes pdf
Sectors of the Indian Economy - Class 10 Study Notes pdfSectors of the Indian Economy - Class 10 Study Notes pdf
Sectors of the Indian Economy - Class 10 Study Notes pdf
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
 
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptxMARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
 
How to Break the cycle of negative Thoughts
How to Break the cycle of negative ThoughtsHow to Break the cycle of negative Thoughts
How to Break the cycle of negative Thoughts
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
 

Sina presentation in IBM

  • 1. Question Answering on Interlinked Data. Saeedeh Shekarpour, Axel-Cyrille Ngonga Ngomo, Soeren Auer. AKSW Research Group, Leipzig University. December 5, 2013, IBM Research Center.
  • 2. Motivation: Retrieving information from Linked Open Data (LOD). AKSW group - Question Answering on Interlinked Data (published in www2013)
  • 3. Motivation. Text queries (either keyword or natural language) are: a simple retrieval approach; popular; implicit and ambiguous in their semantics. SPARQL queries require: knowledge about the ontology; proficiency in formulating formal queries; explicit and unambiguous semantics.
  • 4. Comparison of Search Approaches. [Figure: search approaches arranged by data-semantic awareness (aware vs. unaware) and query type (keyword-based vs. natural language query); labelled regions: Information Retrieval, Question Answering Systems, and our approach, SINA.]
  • 5. Example. Which television shows were created by Walt Disney? select * where { ?v0 a dbo:TelevisionShow. ?v0 dbo:creator dbr:Walt_Disney. }
  • 6. Aim and Challenges. Aim: question answering over a set of interlinked data sources. Challenges: query segmentation; resource disambiguation; constructing a formal query (expressed in SPARQL).
  • 7. Further Challenges over Interlinked Data. 1. Information for answering a certain question can be spread among different datasets employing heterogeneous schemas. 2. Constructing a federated formal query across different datasets requires exploiting links between the different datasets on both the schema and instance levels.
  • 8. SINA Architecture
  • 9. Test bed datasets. One single dataset: DBpedia. Three interlinked datasets from the life sciences: Drugbank, a comprehensive knowledge base containing information about drugs, drug targets (i.e. proteins), interactions and enzymes; Diseasome, which contains information about diseases and the genes associated with these diseases; Sider, which contains information about drugs and their side effects.
  • 10. Main characteristics of federated queries. 1. Queries requiring fused information, e.g. side effects of drugs used for Tuberculosis. 2. Queries targeting combined information, e.g. side effects and enzymes of drugs used for Asthma. 3. Queries requiring keyword expansion, e.g. side effects of Valdecoxib. [Figure: query graph for the Asthma example spanning DrugBank, Sider and Diseasome, with variables ?v0 to ?v3, the classes Drug, Disease, Side Effect and Enzymes, the instance Asthma, and edges labelled enzyme, side effect and sameAs.]
  • 11. Challenge 1: Query Segmentation and Resource Disambiguation. Sample question: What are the side effects of drugs used for Tuberculosis? The question is transformed to the 4-tuple (side # effect # drug # Tuberculosis). Different segmentations are possible: 1. (side effect # drug # Tuberculosis) 2. (side effect drug # Tuberculosis). Resource disambiguation maps each valid segment to resources in the underlying knowledge bases.
  • 12. Segment validation. Original tuple: (side # effect # drug # Tuberculosis). A naive approach is used to find all valid segments. Valid segments with samples of candidate resources: side effect (sider:class:sideeffect, sider:property:side_effects); drug (drugbank:drugs, class:offer, sider:drugs, diseases:possibledrug); tuberculosis (diseases:1154, side_effects:C0041296).
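The naive segment validation on this slide could be sketched roughly as follows. This is only an illustration, not the SINA implementation: the `LABEL_INDEX` dictionary and the `valid_segments` function are hypothetical names, and the exact-match lookup stands in for whatever label matching the system actually performs. The candidate resources are the samples listed on the slide.

```python
from typing import Dict, List

# Hypothetical label index (assumption): segment text -> candidate resources.
# The entries are the sample candidates shown on the slide.
LABEL_INDEX: Dict[str, List[str]] = {
    "side effect": ["sider:class:sideeffect", "sider:property:side_effects"],
    "drug": ["drugbank:drugs", "class:offer", "sider:drugs", "diseases:possibledrug"],
    "tuberculosis": ["diseases:1154", "side_effects:C0041296"],
}


def valid_segments(keywords: List[str], index: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Naively test every contiguous run of keywords against the label index."""
    found: Dict[str, List[str]] = {}
    n = len(keywords)
    for start in range(n):
        for end in range(start + 1, n + 1):
            segment = " ".join(keywords[start:end]).lower()
            if segment in index:  # a segment is valid if it matches some resource label
                found[segment] = index[segment]
    return found


segments = valid_segments(["side", "effect", "drug", "Tuberculosis"], LABEL_INDEX)
print(sorted(segments))
```

Enumerating all contiguous runs is quadratic in the number of keywords, which is acceptable for short queries such as the sample question.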
  • 13. Concurrent Segmentation and Disambiguation
  • 14. Hidden Markov Model. A statistical model containing a set of states. Moving from one state to another generates a sequence of observations. The probability of entering a state depends only on the previous state. The output is the most likely sequence of states generating the observed sequence.
  • 15. State Space. A state represents a knowledge base resource. The state space contains all resources in the knowledge base. In practice, we prune the state space by excluding irrelevant states. We add an unknown-entity state comprising all resources that are no longer available in the pruned state space. Extension of the state space with reasoning: the state space is extended with resources inferred from lightweight owl:sameAs reasoning.
  • 16. Bootstrapping the Model Parameters: Emission Probability. The set-similarity level measures the difference between the label and the segment in terms of the number of words, using the Jaccard similarity. The string-similarity level measures the string similarity of each word in the segment with the most similar word in the label, using the Levenshtein distance.
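The two similarity levels named on this slide can be illustrated with a minimal sketch. The slide does not say how the two scores are normalised or combined into an emission probability, so the functions below only compute the raw measures; the normalisation by the longer word length in `string_similarity` is an assumption made for this example.

```python
from typing import List


def jaccard(segment: str, label: str) -> float:
    """Set-similarity level: shared words divided by all distinct words."""
    a, b = set(segment.lower().split()), set(label.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def levenshtein(s: str, t: str) -> int:
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def string_similarity(segment: str, label: str) -> float:
    """String-similarity level: each segment word is compared with its most
    similar label word; the normalisation used here is an assumption."""
    label_words: List[str] = label.lower().split()
    segment_words: List[str] = segment.lower().split()
    if not label_words or not segment_words:
        return 0.0
    total = 0.0
    for word in segment_words:
        total += max(
            1.0 - levenshtein(word, other) / max(len(word), len(other))
            for other in label_words
        )
    return total / len(segment_words)


print(jaccard("side effect", "side effects"))
print(string_similarity("side effect", "side effects"))
```

The example shows why both levels are useful: "side effect" and "side effects" share only one exact word, so the Jaccard score is low, while the string-level score stays high because "effect" and "effects" differ by a single character.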
  • 17. Bootstrapping the Model Parameters: Transition Probability and Initial Probability. The transition probability and initial probability are computed from the semantic relatedness of two resources. Semantic relatedness is based on two values: distance and connectivity degree. We transform these two values into hub and authority values using the HITS algorithm. The initial probability and transition probability are defined as a uniform distribution over the hub and authority values.
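For readers unfamiliar with HITS, the following is a generic power-iteration sketch of the algorithm on a small directed graph. It does not reproduce how SINA derives the graph from distance and connectivity degree; the `hits` function, the node names `r1` to `r4` and the edge list are all hypothetical.

```python
from typing import Dict, List, Tuple


def hits(edges: List[Tuple[str, str]], iterations: int = 50) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Generic HITS: iterate hub and authority updates with L2 normalisation."""
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of a node: sum of hub scores of the nodes linking to it.
        auth = {n: sum(hub[u] for u, v in edges if v == n) for n in nodes}
        norm = sum(x * x for x in auth.values()) ** 0.5 or 1.0
        auth = {n: x / norm for n, x in auth.items()}
        # Hub of a node: sum of authority scores of the nodes it links to.
        hub = {n: sum(auth[v] for u, v in edges if u == n) for n in nodes}
        norm = sum(x * x for x in hub.values()) ** 0.5 or 1.0
        hub = {n: x / norm for n, x in hub.items()}
    return hub, auth


# Hypothetical link structure between four resources.
toy_edges = [("r1", "r3"), ("r2", "r3"), ("r1", "r4")]
hub_scores, auth_scores = hits(toy_edges)
print(hub_scores)
print(auth_scores)
```

In this toy graph `r3` receives links from two resources and therefore ends up with the highest authority value, while `r1`, which links to both `r3` and `r4`, is the strongest hub.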
  • 18. Evaluation of Bootstrapping. We compared the accuracy of different distribution functions (normal, Zipfian and uniform) for the transition probability. We ran the distribution functions with two different inputs: distance and connectivity degree values, and hub and authority values.
  • 19. Viterbi Algorithm. Aim: find the most likely path generating the sequence of input keywords.
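A minimal sketch of the standard Viterbi algorithm is given below to make the slide concrete. The state names are taken from the Walt Disney example, but every probability in the toy model is invented for illustration and is not a value from the talk; the sketch also assumes the query has already been split into three segments, whereas SINA performs segmentation and disambiguation concurrently.

```python
from typing import Dict, List, Tuple


def viterbi(
    observations: List[str],
    states: List[str],
    start_p: Dict[str, float],
    trans_p: Dict[str, Dict[str, float]],
    emit_p: Dict[str, Dict[str, float]],
) -> Tuple[List[str], float]:
    """Return the most likely state path for the observations and its probability."""
    # best[t][s] = (probability of the best path ending in s at step t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    probability = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, probability


# Toy model: all numbers below are made up for illustration.
toy_states = ["dbo:TelevisionShow", "dbo:creator", "dbr:Walt_Disney"]
toy_obs = ["television shows", "created", "walt disney"]
toy_start = {s: 1.0 / 3.0 for s in toy_states}
toy_trans = {p: {s: 1.0 / 3.0 for s in toy_states} for p in toy_states}
toy_emit = {
    s: {o: (0.8 if i == j else 0.1) for j, o in enumerate(toy_obs)}
    for i, s in enumerate(toy_states)
}

toy_path, toy_prob = viterbi(toy_obs, toy_states, toy_start, toy_trans, toy_emit)
print(toy_path, toy_prob)
```

Only the single best path is returned here; a ranked list of paths would require keeping several candidates per state.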
  • 20. Output of the HMM for the following query: Which television shows were created by Walt Disney? Probability and path of states: 0.0023: dbo:TelevisionShow, dbo:creator, dbr:Walt_Disney; 0.0014: dbo:TelevisionShow, dbo:creator, dbr:Category:Walt_Disney; 5.89E-4: dbr:TelevisionShow, dbo:creator, dbr:Walt_Disney; 3.53E-4: dbr:TelevisionShow, dbo:creator, dbr:Category:Walt_Disney; 3.76E-5: dbp:television, dbp:show, dbo:creator, dbr:Category:Walt_Disney.
  • 21. Query Construction
  • 22. Query Construction Method. Input: a set of resources R = {r_1, r_2, ..., r_n}. Output: a query graph QG = (V, E), which is a directed, connected multi-graph. Forward chaining: 1. CT: comprehensive type. 2. CD: comprehensive domain. 3. CR: comprehensive range.
  • 23. Query Construction Method. Input: a set of resources R = {r_1, r_2, ..., r_n}. Output: a query graph QG = (V, E), which is a directed, connected multi-graph. Generating the Incomplete Query Graph (IQG): initializing vertices and primary edges. A vertex is added to the IQG (1) if r is an instance or (2) if r is a class. Properties are added along with zero, one or two vertices.
  • 24. Query Construction Method. Example: What are the side effects of drugs used for Tuberculosis? Resources: diseasome:1154 (type instance), diseasome:possibleDrug (type property), sider:sideEffect (type property). Graph 1: 1154 possibleDrug ?v0. Graph 2: ?v1 sideEffect ?v2.
  • 25. Query Construction Method. Connecting sub-graphs of an IQG: 1. Minimum spanning tree: a minimum set of edges (i.e., properties) to span a set of disjoint graphs. 2. Prim's algorithm: incrementally includes edges to connect disjoint sub-graphs. Direct properties: ?v0 ?p ?v1. Properties via owl:sameAs link: (1) ?v0 owl:sameAs ?x. ?x ?p ?v1. (2) ?v0 ?p ?x. ?x owl:sameAs ?v1. (3) ?v0 owl:sameAs ?x. ?x ?p ?y. ?y owl:sameAs ?v1. [Figure: Template 1 and Template 2, each connecting the vertices 1154, ?v0, ?v1 and ?v2 via possibleDrug and sideEffect.]
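The idea of connecting disjoint sub-graphs with a minimum set of edges can be sketched with a textbook version of Prim's algorithm. In this illustration each vertex stands for one sub-graph of the IQG and each candidate edge for a connecting property; the sub-graph names `G1` to `G3`, the edge labels and the weights (1 for a direct property, 2 or 3 when owl:sameAs links are needed) are assumptions made for the example, not values from the talk.

```python
import heapq
from typing import Dict, List, Tuple

Edge = Tuple[int, str, str, str]  # (weight, from, to, label)


def prim(vertices: List[str], edges: List[Edge]) -> List[Tuple[str, str, str, int]]:
    """Grow a minimum spanning tree from the first vertex, one cheapest edge at a time."""
    adjacency: Dict[str, List[Edge]] = {v: [] for v in vertices}
    for weight, u, v, label in edges:
        adjacency[u].append((weight, u, v, label))
        adjacency[v].append((weight, v, u, label))  # treat connections as undirected
    start = vertices[0]
    visited = {start}
    heap = list(adjacency[start])
    heapq.heapify(heap)
    tree: List[Tuple[str, str, str, int]] = []
    while heap and len(visited) < len(vertices):
        weight, u, v, label = heapq.heappop(heap)
        if v in visited:
            continue  # this edge would not connect a new sub-graph
        visited.add(v)
        tree.append((u, v, label, weight))
        for edge in adjacency[v]:
            if edge[2] not in visited:
                heapq.heappush(heap, edge)
    return tree


# Hypothetical sub-graphs and candidate connections (weights are assumptions).
subgraphs = ["G1", "G2", "G3"]
candidates = [
    (1, "G1", "G2", "direct property"),
    (3, "G1", "G3", "property with owl:sameAs on both ends"),
    (2, "G2", "G3", "property with owl:sameAs on one end"),
]
spanning = prim(subgraphs, candidates)
print(spanning)
```

With these weights the tree prefers the direct property and the single-sameAs connection, and skips the more expensive connection that needs owl:sameAs links on both ends.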
  • 26. Evaluation. Goal of the experiment: how well the (1) resource disambiguation and (2) query construction approaches perform. Performance measures: (1) disambiguation is measured using the Mean Reciprocal Rank (MRR); (2) query construction is measured in terms of precision and recall. Benchmark: (1) each entry is a natural-language query with the equivalent conjunctive SPARQL query; (2) 25 queries on the 3 interlinked datasets Drugbank, Sider and Diseasome; (3) the QALD1 and QALD3 benchmarks for DBpedia.
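The Mean Reciprocal Rank used for the disambiguation evaluation can be computed as in the short sketch below. The function name and the toy rankings are illustrative only and are not data from the evaluation; queries whose correct resource is missing from the ranking are assumed to contribute a reciprocal rank of zero.

```python
from typing import List


def mean_reciprocal_rank(ranked_lists: List[List[str]], gold: List[str]) -> float:
    """Average of 1/rank of the correct answer over all queries (0 if it is missing)."""
    if not ranked_lists:
        return 0.0
    total = 0.0
    for candidates, correct in zip(ranked_lists, gold):
        if correct in candidates:
            total += 1.0 / (candidates.index(correct) + 1)
    return total / len(ranked_lists)


# Toy rankings: correct answer at rank 1, at rank 2, and missing.
toy_rankings = [
    ["dbr:Walt_Disney", "dbr:Category:Walt_Disney"],
    ["dbr:TelevisionShow", "dbo:TelevisionShow"],
    ["dbp:show"],
]
toy_gold = ["dbr:Walt_Disney", "dbo:TelevisionShow", "dbo:creator"]
print(mean_reciprocal_rank(toy_rankings, toy_gold))
```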
  • 27. Evaluation using life-science datasets. Without reasoning: precision = 0.91, recall = 0.88. With reasoning: precision = 0.95, recall = 0.90.
  • 28. Evaluation using DBpedia. QALD3 benchmark: contains 100 questions; 32 original questions can be answered correctly. QALD1 benchmark: contains 50 questions; 7 complex questions; 13 questions requiring information beyond DBpedia, i.e., from YAGO and FOAF; 14 were slightly modified to remove expansion and cleaning problems; MRR of disambiguation = 96%; query construction accuracy = 83%.
  • 29. Runtime. Parallelization over three components: 1. Segment validation 2. Resource retrieval 3. Query construction
  • 30. Related work
  • 31. Thank you. Saeedeh Shekarpour, shekarpour@informatik-leipzig.de, sa.shekarpour@gmail.com