How hard is this query?
Measuring the Semantic Complexity of
Schema-agnostic Queries
André Freitas, Juliano Efson Sales,
Siegfried Handschuh, Edward Curry
IWCS, London 2015
Outline
• Motivation
• Query Semantic Complexity & Entropy
• Entropy Measures
• Validation & Analysis
• Conclusions
Motivation
Shift in the Database Landscape
• Very large and dynamic “schemas”.
• Before 2000: 10s-100s of attributes
• Circa 2015: 1,000s-1,000,000s of attributes
Brodie & Liu, 2010
Databases for a Complex World
How do you query data in this scenario?
Schema-agnosticism
Abstraction layer
Query: Who is the daughter of Bill Clinton?
Data: Bill Clinton -- child --> Chelsea Clinton
Schema-agnostic queries
Query approaches over structured databases that
allow users to satisfy complex information needs
without understanding the representation
(schema) of the database.
Semantic Parsing
Vocabulary Problem for Databases
Query: Who is the daughter of Bill Clinton married to?
Quantify the Semantic Gap
Possible representations
Core Questions
• Can we measure the semantic complexity of a
query-DB mapping?
• What defines an “easy” or a “hard” query?
• Which are the best estimators?
Semantic Complexity & Entropy
Configuration space of semantic matchings
Quantify the Query-DB semantic gap
Not all queries are born equal!
Semantic Complexity & Entropy
• Structural/conceptual complexity
• Level of ambiguity/indeterminacy/vagueness
• Terminological gap
• Novelty
Semantic Configuration Space
mΣ(Q,DB)
Semantic Entropy Measures
Semantic Entropy Measures
Hsyntax, Hstruct, Hterm, Hmatching
In the scope of this work
• Entropy -> Entropy estimator, approximation.
Syntactic Entropy (Hsyntax)
• The syntactic entropy of a query is defined by the
possible syntactic configurations in which a query
can be interpreted under the database syntax.
• Estimate the uncertainty of the translation of the
query into the DB categories (IDB(Q)).
• Is a function of the probability of the syntactic
interpretation of a query.
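A minimal sketch of the idea above, assuming the syntactic interpretations of a query are available with a probability each. The function name and the example probabilities are illustrative assumptions; the slides do not specify how the probabilities are obtained.

```python
import math


def shannon_entropy(probabilities):
    """Shannon entropy (in bits) of a discrete distribution.

    Values are normalised first, so raw weights are accepted as well.
    """
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    entropy = 0.0
    for p in probabilities:
        if p > 0:
            q = p / total  # normalise so the values sum to 1
            entropy -= q * math.log2(q)
    return entropy


# Hypothetical distributions over syntactic interpretations of one query.
unambiguous = [1.0]            # a single possible interpretation
ambiguous = [0.5, 0.25, 0.25]  # three competing interpretations

print(shannon_entropy(unambiguous))
print(shannon_entropy(ambiguous))
```

Under this reading, a query with one certain interpretation has zero entropy, and the value grows as probability mass spreads over more interpretations.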
Structural Entropy (Hstruct)
• The structural entropy defines the complexity of a
database based on the possible facts that can be
encoded under its schema.
• Pollard & Biermann, A measure of semantic
complexity for natural language systems (2000).
Terminological Entropy (Hterm)
• The terminological entropy estimates the amount of
ambiguity, synonymy and vagueness of the query or
database terms.
• Translational Entropy (Htrans) as an estimator.
• Melamed, Measuring semantic entropy (1997).
• Translation probability based on parallel corpora.
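A rough sketch of a translational entropy estimate, assuming word alignments from a parallel corpus are already available as a list of observed translations per source word. The function name and the toy alignment labels are invented for illustration and are not data from the slides.

```python
import math
from collections import Counter


def translational_entropy(translations):
    """Entropy (bits) of the observed translation distribution of one source word.

    `translations` is a list with one entry per aligned occurrence.
    """
    counts = Counter(translations)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # sum over t of P(t|s) * log2(1 / P(t|s)), with P(t|s) = count / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical aligned translations of two source words (labels are placeholders).
consistent_word = ["t1", "t1", "t1", "t1"]  # always translated the same way
variable_word = ["t1", "t2", "t3", "t4"]    # translated differently every time

print(translational_entropy(consistent_word))
print(translational_entropy(variable_word))
```

A word that is always translated the same way gets zero entropy, while a word with many competing translations scores higher, which is the intuition behind using it as a proxy for ambiguity and vagueness.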
Matching Entropy (Hmatching)
• Consists of measures which describe the
uncertainty involved in the query-data
matching/alignment between query terms and
dataset entities.
• Provides an estimate based on the set of
potential alignments.
• Distributional entropy (Hdist): Estimator based on
distributional semantic models.
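One possible reading of a distributional matching entropy, sketched under strong assumptions: similarity scores between a query term and its candidate database entities are normalised into a distribution, and the entropy of that distribution is taken. This is not the definition used by the authors; the function name and the scores are hypothetical.

```python
import math


def matching_entropy(similarities):
    """Entropy (bits) over candidate alignments.

    `similarities` maps candidate names to non-negative similarity scores,
    which are normalised and treated as probabilities (an assumption).
    """
    positive = [s for s in similarities.values() if s > 0]
    total = sum(positive)
    if total == 0:
        return 0.0
    return sum((s / total) * math.log2(total / s) for s in positive)


# Hypothetical similarity scores for the query term "daughter".
clear_match = {"child": 0.9, "birthPlace": 0.05, "almaMater": 0.05}
flat_match = {"child": 0.3, "relative": 0.3, "parent": 0.3}

print(matching_entropy(clear_match) < matching_entropy(flat_match))
```

When one candidate clearly dominates, the entropy is low; when several candidates score alike, the alignment is more uncertain and the entropy is higher.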
Query Features as Complexity Estimators
• Query features (reference to data model/query
operator categories).
– Contains instance reference (named entities)
– Contains class reference
– Contains complex class reference
– Contains property
– Contains value
– Yes/No question
– Contains operator
Validation & Analysis
Experimental Set-up
• Question Answering over Linked Data Test
Collection (Unger et al. 2011).
• QALD 2011 & 2012.
• 150 natural language queries over DBpedia
(RDF).
Dataset (DBpedia + YAGO classes):
45,768 properties
288,316 classes
9,434,677 instances
128,071,259 triples
Query Analysis Example
Experimental Set-up
• Linear regression between each entropy
measure and the f-measure of the
participating QA systems.
• 4 QA systems:
– QALD 2011: PowerAqua, FREyA (κ = 0.501, 95% confidence
interval, ‘moderate’ agreement).
– QALD 2012: QAKiS, MHE (κ = 0.236, 95% confidence
interval, ‘fair’ agreement).
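The regression step can be sketched as an ordinary least squares fit of f-measure against one entropy measure, with the Pearson correlation as a summary. The helper below and its toy numbers are assumptions for illustration; they are not the experimental values behind the slides.

```python
import math


def linear_regression(xs, ys):
    """Least squares fit y = slope * x + intercept, plus Pearson r."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("both samples must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Invented toy values: per-query entropy and the f-measure a QA system achieved.
entropy = [0.0, 1.0, 2.0, 3.0]
f_measure = [0.9, 0.7, 0.5, 0.3]

slope, intercept, r = linear_regression(entropy, f_measure)
print(slope, intercept, r)
```

A negative slope in such a fit would indicate that queries with higher entropy tend to be answered less well.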
1st Analysis
• Linear regression model.
• Hsyntax, Hterm (Htrans), Hmatching (Hdist) and Hstruct
1st Analysis
• Higher correlation:
– Hsyntax (-)
– Hterm (Htrans) (-)
– Hmatching (Hdist) (-)
• Lower correlation:
– Hstruct
2nd Analysis
• Query features (reference to data model/query
operator categories).
– Contains instance reference (named entities)
– Contains class reference
– Contains complex class reference
– Contains property
– Contains value
– Yes/No question
– Contains operator
2nd Analysis
• Linear regression model.
2nd Analysis
• Higher correlation:
– References to instances (+)
– Presence of operators (-)
– Presence of complex classes (complex nominals) (-)
3rd Analysis
• Classification of the query-DB
terminological gap for each data
model category.
3rd Analysis
Lower terminological gap
Higher terminological gap
Query Classification
• % of unanswered questions:
– Syntactic complexity (Hsyntax): 51.7%
– Vocabulary gap (Hmatching, Hterm): 68.9%
– No reference to instance (named entity)
(Hstruct,Hterm): 20.6%
Limitations
• Validation of the regression model in a
different test collection.
• Distributional entropy needs a more
principled definition.
Minimizing Semantic Entropy
Reflections on the Design of Schema-agnostic Query Mechanisms
Or ....
Minimizing the Semantic Entropy for
the Semantic Matching
Definition of a semantic pivot: first query term to
be resolved in the database.
• Maximizes the reduction of the semantic
configuration space (Hstruct, Hmatching).
Semantic Pivots (Hstruct, Hmatching)
• Who is the daughter of Bill Clinton married to?
437100,184 62,781
> 4,580,000
dbpedia:spouse dbpedia:children :Bill_Clinton
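A small sketch of why resolving the pivot first shrinks the search space: once :Bill_Clinton is fixed, only the triples attached to it need to be considered for the remaining predicates. The in-memory triple list is toy data written for this illustration (not a DBpedia extract), and the helper names are hypothetical.

```python
# Toy triples for illustration only; not an extract of DBpedia.
TRIPLES = [
    (":Bill_Clinton", "dbpedia:children", ":Chelsea_Clinton"),
    (":Chelsea_Clinton", "dbpedia:spouse", ":Marc_Mezvinsky"),
    (":Bill_Clinton", "dbpedia:spouse", ":Hillary_Clinton"),
]


def follow(subjects, predicate, triples=TRIPLES):
    """Return the objects reachable from any of the subjects via the predicate."""
    return {o for s, p, o in triples if s in subjects and p == predicate}


def answer_from_pivot(pivot, predicate_path, triples=TRIPLES):
    """Resolve the pivot first, then walk the remaining predicates in order."""
    frontier = {pivot}
    for predicate in predicate_path:
        frontier = follow(frontier, predicate, triples)
    return frontier


# "Who is the daughter of Bill Clinton married to?"
result = answer_from_pivot(":Bill_Clinton", ["dbpedia:children", "dbpedia:spouse"])
print(result)
```

Starting from the instance keeps each step limited to the neighbourhood of already resolved entities, instead of matching "married" or "daughter" against every property in the dataset.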
Minimizing the Semantic Entropy for
the Semantic Matching
Definition of a semantic pivot: first query term
to be resolved in the database.
• Maximizes the reduction of the semantic
configuration space (Hstruct, Hmatching).
• Less prone to more complex synonymic
expressions and abstraction-level differences
(Hterm, Hmatching).
Semantic Pivots
• Proper nouns tend to have a high percentage of string
overlap across synonymic expressions.
William Jefferson Clinton
Bill Clinton
William J. Clinton
T. E. Lawrence
Thomas Edward Lawrence
Lawrence of Arabia
Who is the daughter of Bill Clinton married to?
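The string-overlap observation can be illustrated with a simple token-level Jaccard measure. This is one assumed way to quantify overlap, chosen for brevity; the slides do not name a specific string similarity.

```python
def token_overlap(a, b):
    """Jaccard overlap between the lower-cased token sets of two names."""
    tokens_a = set(a.lower().replace(".", "").split())
    tokens_b = set(b.lower().replace(".", "").split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


# Name variants from the slide share at least one token.
print(token_overlap("William Jefferson Clinton", "Bill Clinton"))
print(token_overlap("William Jefferson Clinton", "William J. Clinton"))
print(token_overlap("T. E. Lawrence", "Lawrence of Arabia"))

# A common-noun pair used elsewhere in the slides shares none.
print(token_overlap("daughter", "child"))
```

Variants of a proper noun usually keep some token in common, so even a plain string match can find candidates, whereas common nouns and verbs often need a semantic model to bridge the gap.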
Minimizing the Semantic Entropy for
the Semantic Matching
Definition of a semantic pivot: first query term to be
resolved in the database.
• Maximizes the reduction of the semantic
configuration space (Hstruct, Hmatching).
• Less prone to more complex synonymic expressions
and abstraction-level differences (Hterm, Hmatching).
• Proper nouns >> nouns >> complex nominals >>
adjectives, verbs.
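The category ordering above can be turned into a simple pivot-selection heuristic, sketched here. The category labels are assumed to be assigned beforehand (by hand or by a tagger, which is outside this sketch), and the names are hypothetical.

```python
# Priority follows the ordering on the slide; a lower value is resolved earlier.
PIVOT_PRIORITY = {
    "proper_noun": 0,
    "noun": 1,
    "complex_nominal": 2,
    "adjective": 3,
    "verb": 3,
}


def choose_pivot(tagged_terms):
    """Pick the term whose category has the highest pivot priority.

    `tagged_terms` is a list of (term, category) pairs; ties keep the
    earliest term in the query.
    """
    if not tagged_terms:
        raise ValueError("no candidate terms")
    return min(tagged_terms, key=lambda tc: PIVOT_PRIORITY[tc[1]])[0]


# "Who is the daughter of Bill Clinton married to?" with hand-assigned categories.
query_terms = [
    ("daughter", "noun"),
    ("Bill Clinton", "proper_noun"),
    ("married", "verb"),
]

print(choose_pivot(query_terms))
```

For this query the heuristic selects the proper noun as the pivot, leaving "daughter" and "married" to be matched afterwards within the reduced configuration space.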
Semantic Matching
• Hsyntax is a strong estimator of query
complexity.
• Hmatching can be used as an estimator for the
quality of the predicate alignment.
• Hterm can be used as a heuristic for matching
complexity.
Conclusions
• Both entropy (Hsyntax, Hterm, Hmatching) and query features
(instances, complex classes, operators) can be used as
estimators for query semantic complexity.
• This can be incorporated as heuristics into
schema-agnostic query planning approaches (or approximate
semantic parsing) to maximize semantic matching
probabilities.
• Need for the construction of better semantic entropy
estimators.
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 

Recently uploaded (20)

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 

How hard is this Query? Measuring the Semantic Complexity of Schema-agnostic Queries

  • 1. How hard is this query? Measuring the Semantic Complexity of Schema-agnostic Queries André Freitas, Juliano Efson Sales, Siegfried Handschuh, Edward Curry IWCS, London 2015
  • 2. Outline • Motivation • Query Semantic Complexity & Entropy • Entropy Measures • Validation & Analysis • Conclusions
  • 4. Shift in the Database Landscape • Very large and dynamic “schemas”. • Before 2000: 10s-100s of attributes. • Circa 2015: 1,000s-1,000,000s of attributes. (Brodie & Liu, 2010)
  • 5. Databases for a Complex World • How do you query data in this scenario?
  • 6. Schema-agnosticism • Abstraction layer. • Query: Who is the daughter of Bill Clinton? • Data: Bill Clinton → child → Chelsea Clinton
  • 7. Schema-agnostic queries • Query approaches over structured databases which allow users to satisfy complex information needs without understanding the representation (schema) of the database. • Semantic parsing.
  • 8. Vocabulary Problem for Databases • Query: Who is the daughter of Bill Clinton married to? • Possible representations. • Quantify the semantic gap.
  • 9. Core Questions • Can we measure the semantic complexity of a query-DB mapping? • What defines an “easy” or a “hard” query? • Which are the best estimators?
  • 11. Semantic Complexity & Entropy • Configuration space of semantic matchings. • Quantify the query-DB semantic gap. • Not all queries are born equal!
  • 12. Semantic Complexity & Entropy • Structural/conceptual complexity • Level of ambiguity/indeterminacy/vagueness • Terminological gap • Novelty
  • 16. In the scope of this work • “Entropy” refers to an entropy estimator, i.e. an approximation.
  • 17. Syntactic Entropy (Hsyntax) • The syntactic entropy of a query is defined by the possible syntactic configurations in which a query can be interpreted under the database syntax. • Estimates the uncertainty of the translation of the query into the DB categories (IDB(Q)). • Is a function of the probability of the syntactic interpretation of a query.
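The slide does not spell out the functional form. One plausible reading, offered here only as an assumption, is the Shannon entropy over the candidate syntactic interpretations I_1, ..., I_n of a query Q, where P(I_i | Q) is a hypothetical probability assigned to each interpretation:

```latex
H_{syntax}(Q) = -\sum_{i=1}^{n} P(I_i \mid Q)\,\log_2 P(I_i \mid Q)
```

Under this reading a query with a single admissible interpretation has zero entropy, and n equally likely interpretations give \log_2 n bits.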
  • 18. Structural Entropy (Hstruct) • The structural entropy defines the complexity of a database based on the possible facts that can be encoded under its schema. • Pollard & Biermann, A measure of semantic complexity for natural language systems (2000).
  • 19. Terminological Entropy (Hterm) • The terminological entropy estimates the amount of ambiguity, synonymy and vagueness of the query or database terms. • Translational Entropy (Htrans) as an estimator. • Melamed, Measuring semantic entropy (1997). • Translation probability based on parallel corpora.
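A minimal sketch of a translational-entropy estimator, assuming it is computed as the Shannon entropy of the empirical distribution of a term's aligned translations in a parallel corpus. The function name, the input format and the toy tokens are hypothetical and are not taken from the slides or from Melamed's paper:

```python
import math
from collections import Counter


def translational_entropy(aligned_targets):
    """Entropy (bits) of the empirical translation distribution of one source term.

    aligned_targets: target-side words that a word aligner linked to the
    source term across a parallel corpus (hypothetical input format).
    """
    counts = Counter(aligned_targets)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no evidence: treat as zero entropy by convention
    # sum of p * log2(1/p) over the observed translations
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Toy tokens only, not corpus data.
consistent = translational_entropy(["t1", "t1", "t1", "t1"])  # always the same translation
spread = translational_entropy(["t1", "t2", "t3", "t4"])      # four different translations
```

A term that is always translated the same way scores zero, while a term spread evenly over many translations scores high, which matches the intuition that ambiguous or vague terms are harder to match.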
  • 20. Matching Entropy (Hmatching) • Consists of measures which describe the uncertainty involved in the query-data matching/alignment between query terms and dataset entities. • Provides an estimate based on the set of potential alignments. • Distributional entropy (Hdist): estimator based on distributional semantic models.
  • 21. Query Features as Complexity Estimators • Query features (reference to data model/query operator categories). – Contains instance reference (named entities) – Contains class reference – Contains complex class reference – Contains property – Contains value – Yes/No question – Contains operator
  • 23. Experimental Set-up • Question Answering over Linked Data Test Collection (Unger et al. 2011). • QALD 2011 & 2012. • 150 natural language queries over DBpedia (RDF). • Dataset (DBpedia + YAGO classes): 45,768 properties; 288,316 classes; 9,434,677 instances; 128,071,259 triples.
  • 26. Experimental Set-up • Linear regression between each entropy measure and the f-measure of the participating QA systems. • 4 QA systems: – QALD 2011: PowerAqua, Freya (κ = 0.501, 95% confidence interval, ‘moderate’ agreement). – QALD 2012: QAKis, MHE (κ = 0.236, 95% confidence interval, ‘fair’ agreement).
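The regression step can be sketched as an ordinary least-squares fit of f-measure against one entropy measure. This is only an illustration of the technique: the function name is hypothetical, and the numbers below are made up for the example and are not results from the study:

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept, plus Pearson r."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("both samples must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Illustrative values only (NOT from the paper): one entropy score and one
# f-measure per query.
entropy_per_query = [0.0, 1.0, 2.0, 3.0]
f_measure_per_query = [0.9, 0.7, 0.5, 0.3]
slope, intercept, r = linear_fit(entropy_per_query, f_measure_per_query)
```

A negative slope with |r| close to 1 would indicate that higher entropy goes with lower answer quality, which is the kind of relationship the analyses on the following slides mark with (-).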
  • 27. 1st Analysis • Linear regression model. • Hsyntax, Hterm (Htrans), Hmatching (Hdist) and Hstruct.
  • 28. 1st Analysis • Higher correlation: – Hsyntax (-) – Hterm (Htrans) (-) – Hmatching (Hdist) (-) • Lower correlation: – Hstruct
  • 29. 2nd Analysis • Query features (reference to data model/query operator categories). – Contains instance reference (named entities) – Contains class reference – Contains complex class reference – Contains property – Contains value – Yes/No question – Contains operator
  • 30. 2nd Analysis • Linear regression model.
  • 31. 2nd Analysis • Higher correlation: – References to instances (+) – Presence of operators (-) – Presence of complex classes (complex nominals) (-)
  • 32. 3rd Analysis • Classification of the query-DB terminological gap for each data model category.
  • 33. 3rd Analysis • [Figure; axis labels: lower terminological gap, higher terminological gap]
  • 35. Query Classification • % of unanswered questions: – Syntactic complexity (Hsyntax): 51.7% – Vocabulary gap (Hmatching, Hterm): 68.9% – No reference to instance (named entity) (Hstruct, Hterm): 20.6%
  • 36. Limitations • The regression model still needs to be validated on a different test collection. • Distributional entropy needs a more principled definition.
  • 37. Minimizing Semantic Entropy: Reflections on the Design of Schema-agnostic Query Mechanisms Or ....
  • 38. Minimizing the Semantic Entropy for the Semantic Matching • Definition of a semantic pivot: first query term to be resolved in the database. • Maximizes the reduction of the semantic configuration space (Hstruct, Hmatch).
  • 39. Semantic Pivots (Hstruct, Hmatch) • Query: Who is the daughter of Bill Clinton married to? • [Figure: counts shown next to dbpedia:spouse, dbpedia:children and :Bill_Clinton; values as extracted: 437100,184 62,781 > 4,580,000]
  • 40. Minimizing the Semantic Entropy for the Semantic Matching • Definition of a semantic pivot: first query term to be resolved in the database. • Maximizes the reduction of the semantic configuration space (Hstruct, Hmatch). • Less prone to more complex synonymic expressions and abstraction-level differences (Hterm, Hmatch).
  • 41. Semantic Pivots • Proper nouns tend to have a high percentage of string overlap across synonymic expressions. – William Jefferson Clinton / Bill Clinton / William J. Clinton – T. E. Lawrence / Thomas Edward Lawrence / Lawrence of Arabia • Query: Who is the daughter of Bill Clinton married to?
  • 42. Minimizing the Semantic Entropy for the Semantic Matching • Definition of a semantic pivot: first query term to be resolved in the database. • Maximizes the reduction of the semantic configuration space (Hstruct, Hmatch). • Less prone to more complex synonymic expressions and abstraction-level differences (Hterm, Hmatch). • Proper nouns >> nouns >> complex nominals >> adjectives, verbs.
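The pivot-selection heuristic can be sketched as a priority lookup over term categories. The ordering comes from the slide. The category labels, the function name, the tie-breaking rule (earliest term wins) and the assumption that an upstream tagger has already categorised the query terms are all hypothetical:

```python
# Priority from the slide: proper nouns >> nouns >> complex nominals >> adjectives, verbs.
# Lower number = resolved earlier. The label strings are assumptions.
PIVOT_PRIORITY = {
    "proper_noun": 0,
    "noun": 1,
    "complex_nominal": 2,
    "adjective": 3,
    "verb": 3,
}


def choose_pivot(terms):
    """Return the text of the term to resolve first, or None if none qualifies.

    terms: list of (text, category) pairs produced by some upstream tagger.
    Ties are broken by position in the query (an assumption, not from the slides).
    """
    candidates = [
        (PIVOT_PRIORITY[category], position, text)
        for position, (text, category) in enumerate(terms)
        if category in PIVOT_PRIORITY
    ]
    if not candidates:
        return None
    return min(candidates)[2]


# "Who is the daughter of Bill Clinton married to?"
query_terms = [
    ("daughter", "noun"),
    ("Bill Clinton", "proper_noun"),
    ("married to", "verb"),
]
pivot = choose_pivot(query_terms)
```

For the running example this picks the named entity as the first term to resolve, in line with the slide's point that proper nouns shrink the configuration space the most and vary least across synonymic expressions.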
  • 43. Semantic Matching • Hsyntax is a strong estimator of query complexity. • Hmatching can be used as an estimator for the quality of the predicate alignment. • Hterm can be used as a heuristic for matching complexity.
  • 44. Conclusions • Both entropy (Hsyntax, Hterm, Hmatching) and query features (instances, complex classes, operators) can be used as estimators for query semantic complexity. • This can be incorporated as heuristics into schema-agnostic query planning approaches (or approximate semantic parsing) to maximize semantic matching probabilities. • Need for the construction of better semantic entropy estimators.