Optique (http://optique-project.eu) was an EU FP7 project that ran from November 2012 to October 2016. The main objective was to test the idea of “Ontology Based Data Access” (OBDA) on real industrial applications. Concretely: to support the work of geologists and geophysicists in the oil & gas company Statoil, and the work of turbine engineers at Siemens AG. This line of work now continues in the nationally funded ‘Centre for Research-based Innovation’ SIRIUS (http://sirius-labs.no) at the University of Oslo, with participation from the Universities of Oxford and Trondheim, as well a large number of participating companies.
The software produced by the project features elaborate user interfaces, and no ∀ or ∃ can be seen on the surface. Still, most of the functionality is controlled by an ontology, which is nothing more than a set of axioms in a particular description logic. As a consequence, a variety of reasoning tasks takes place under the hood, all the way from query optimisation, via entity alignment and up to the user interface control code. This talk presents a selection of these problems, both solved and as-yet unsolved.
Though logic and reasoning are close to the hearts of many of the researchers involved, the success of the project was also dependent on other factors: inter- disciplinary communication, usability considerations, and many pragmatic com- promises, to name some. And sometimes, these would again lead to ‘nice’ re- search. The talk also covers some of these extra-logical aspects of the project.
2. Scalable End-user Access to Big Data
http://optique-project.eu/
HELLENIC REPUBLIC
National and Kapodistrian
University of Athens
2 / 28
3. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
The Problem of Data Access
Engineer
uniform sources
Application
predefined queries
answers
3 / 28
4. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
When does this Go Wrong?
I need to find all rock samples where my Company had
at least a 30% share of the licence at the time the sample
was taken. I’m sure the information is there but there
are so many concepts involved that I can’t find it in the
application.
I need all wellbores with a pore pressure of over 14ppg,
but lower than 12ppg further down the hole. I can’t say
this to the application.
I need to find all rock samples for this oil field, including
the ones in this Excel sheet from Dinoco. The application
doesn’t know about this data.
4 / 28
5. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
What then happens?
• Where is this information stored, and what is it called?
• Can you hand-craft a query for my information need?
• Can you include data from this spreadsheet in the db?
• May take weeks to respond
• Takes several years to master data stores and user
needs
5 / 28
6. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
What then happens?
• Where is this information stored, and what is it called?
• Can you hand-craft a query for my information need?
• Can you include data from this spreadsheet in the db?
• May take weeks to respond
• Takes several years to master data stores and user
needs
30–70% of domain expert time spent looking for and assessing
the quality of the data found
5 / 28
7. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
The Problem of Data Access
Engineer
uniform sources
Application
predefined queries
answers
6 / 28
8. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
The Problem of Data Access
Engineer
disparate sources
Application
IT-expert
information need specialised
query
answers
6 / 28
9. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Data Access: The Optique Solution
Engineer
disparate sources
Application
ontology-based
query
translated
query
answers
7 / 28
10. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Data Access: The Optique Solution
Engineer
disparate sources
Application
ontology-based
query
translated
query
answers
Onto-
logy
Map-
pings
7 / 28
11. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Ontology-based Data Access
• Capture End-user vocabulary in an “Ontology”
• ≈ Domain model
• Classes and relations known to end-users
• Some minimal domain knowledge
• Mappings that relate Ontology with data sources
• ‘Column “Type” is “T” in row x of table “Sensors” if sensor Nr. x is a Temperature Sensor’
• Automatically translate queries in End-user language to queries over data sources.
In: ‘List all temperature sensors.’
Out: ‘Print “Sensor Nr. x” for all rows x in “Sensors” table where “Type” column is “T.”’
8 / 28
12. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
OBDA: Example
engineer
Generators with
a turbine fault?
Based on slides by
Ian Horrocks
Generator(g1)
hasFault(g1, f1)
CondenserFault(f1)
9 / 28
13. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
OBDA: Example
engineer
Generators with
a turbine fault?
Based on slides by
Ian Horrocks
Generator(g1)
hasFault(g1, f1)
CondenserFault(f1)
9 / 28
14. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
OBDA: Example
engineer
Generators with
a turbine fault?
Based on slides by
Ian Horrocks
Generator(g1)
hasFault(g1, f1)
CondenserFault(f1)
∅
9 / 28
15. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
OBDA: Example
engineer
Generators with
a turbine fault?
Based on slides by
Ian Horrocks
Generator(g1)
hasFault(g1, f1)
CondenserFault(f1)
Condenser ⊑ CoolingDevice ⊓
∃isPartOf.Turbine
CondenserFault ≡ Fault ⊓
∃affects.Condenser
TurbineFault ≡ Fault ⊓ ∃affects.(
∃isPartOf.Turbine)
9 / 28
16. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
OBDA: Example
engineer
Generators with
a turbine fault?
Based on slides by
Ian Horrocks
Generator(g1)
hasFault(g1, f1)
CondenserFault(f1)
Condenser ⊑ CoolingDevice ⊓
∃isPartOf.Turbine
CondenserFault ≡ Fault ⊓
∃affects.Condenser
TurbineFault ≡ Fault ⊓ ∃affects.(
∃isPartOf.Turbine)
9 / 28
17. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
OBDA: Example
engineer
Generators with
a turbine fault?
Based on slides by
Ian Horrocks
Generator(g1)
hasFault(g1, f1)
CondenserFault(f1)
Condenser ⊑ CoolingDevice ⊓
∃isPartOf.Turbine
CondenserFault ≡ Fault ⊓
∃affects.Condenser
TurbineFault ≡ Fault ⊓ ∃affects.(
∃isPartOf.Turbine)
g1
9 / 28
18. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Certain Answers
Given:
• T – ontology: smallish number of complex formulae
• A – database: large number of atomic formulae
• q – query: some formula with free variables
We define the certain answers of q
ans(q, (T , A)) := {σ | (T , A) |= σq}
Example:
q = ∃ f (Generator(g) ∧ hasFault(g,f) ∧ TurbineFault(f))
(T , A) |= Generator(g1) ∧ hasFault(g1,f1) ∧ TurbineFault(f1)
⇒ [g → g1] ∈ ans(q, (T , A))
10 / 28
19. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Query Rewriting
• Certain answers ans(q, (T , A)) are expensive to compute
• T will conatain hundreds or thousands of formulae
• A will conatain millions of atoms
• In certain cases possible by rewriting:
q′
:= rewrite(q, T )
such that
ans(q′
, (∅, A)) = ans(q, (T , A))
• Query answering with empty ontology is cheap (same as SQL)
• Possible e.g. if
• T is in the DL-Lite description logic
• q is a disjunction of ∃-quantified conjunctions of atoms
11 / 28
20. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Rewriting Example
A :
Generator(g1)
hasFault(g1, f1)
CondenserFault(f1)
T :
Condenser ⊑ CoolingDevice ⊓
∃isPartOf.Turbine
CondenserFault ≡ Fault ⊓
∃affects.Condenser
TurbineFault ≡ Fault ⊓ ∃affects.(
∃isPartOf.Turbine)
q = ∃f. Generator(g) ∧
hasFault(g, f) ∧
TurbineFault(f)
Rewrite with T :
rewrite(q, T ) =
q′
= ∃f. Generator(g) ∧
hasFault(g, f) ∧
CondenserFault(f)
∨ · · ·
Answers from q′
:
ans(q′
, (∅, A)) = {[g → g1] . . .}
12 / 28
21. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Optique Focus Areas
Basic principles of OBDA predate Optique
Optique focussed on practical issues:
• Usability
• How do end-users formulate queries? In first-order logic?
• Need a user interface for ‘query formulation’
• Scope
• What about queries with time? Or geology? Or chemistry?
• Extended bare-bones query rewriting with time and streams
• Prerequisites
• Where do the ontology and mappings come from?
• How do you maintain them?
• Efficiency
• SQL databases not good at queries from OBDA
• Big Data is maybe not best stored in an SQL database
• Optimize rewritten queries and storage layer
13 / 28
22. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Optique Architecture
End-user IT-expert
Data models
Std. ontologies
…
Visualisation
& Analysis
Query
Formulation
Ontology & Mapping
Management
Ontology MappingsQueries
Query Transformation
Query Planning
Query Execution Query Execution Query Execution
· · · · · ·
results
streaming data temporal data static data
central repository
14 / 28
23. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Ontop: Query Transformation
Query Tansformation
ON-LINE OFF-LINE
Reasoner
Ontology
Mapping-
Optimiser
Mappings
DB Integrity Constraints
Classified
Ontology
T -mapping
Query
Query Rewriter
SQL query
SPARQL to SQL
Translator
D. Calvanese, B. Cogrel, S. Komla-Ebri, R. Kontchakov, D. Lanti, M. Rezk, M. Rodriguez-Muro, and G. Xiao. Ontop:
Answering SPARQL queries over relational databases. Semantic Web, 8(3):471–487, 2017.
15 / 28
24. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Ontop: Query Transformation
Query Tansformation
ON-LINE OFF-LINE
Reasoner
Ontology
Mapping-
Optimiser
Mappings
DB Integrity Constraints
Classified
Ontology
T -mapping
QuerySPARQL!
Query Rewriter
SQL query
SPARQL to SQL
Translator
D. Calvanese, B. Cogrel, S. Komla-Ebri, R. Kontchakov, D. Lanti, M. Rezk, M. Rodriguez-Muro, and G. Xiao. Ontop:
Answering SPARQL queries over relational databases. Semantic Web, 8(3):471–487, 2017.
15 / 28
25. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Ontop: Query Transformation
Query Tansformation
ON-LINE OFF-LINE
Reasoner
OntologyOWL2 QL!
Mapping-
Optimiser
Mappings
DB Integrity Constraints
Classified
Ontology
T -mapping
QuerySPARQL!
Query Rewriter
SQL query
SPARQL to SQL
Translator
D. Calvanese, B. Cogrel, S. Komla-Ebri, R. Kontchakov, D. Lanti, M. Rezk, M. Rodriguez-Muro, and G. Xiao. Ontop:
Answering SPARQL queries over relational databases. Semantic Web, 8(3):471–487, 2017.
15 / 28
26. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Ontop: Query Transformation
Query Tansformation
ON-LINE OFF-LINE
Reasoner
OntologyOWL2 QL!
Mapping-
Optimiser
MappingsR2RML!
DB Integrity Constraints
Classified
Ontology
T -mapping
QuerySPARQL!
Query Rewriter
SQL query
SPARQL to SQL
Translator
D. Calvanese, B. Cogrel, S. Komla-Ebri, R. Kontchakov, D. Lanti, M. Rezk, M. Rodriguez-Muro, and G. Xiao. Ontop:
Answering SPARQL queries over relational databases. Semantic Web, 8(3):471–487, 2017.
15 / 28
27. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
How to build an Interface
• Assume you want a graphical user interface…
16 / 28
28. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
How to build an Interface
• Assume you want a graphical user interface…
• It needs to output SPARQL…
16 / 28
29. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
How to build an Interface
• Assume you want a graphical user interface…
• It needs to output SPARQL…
• Basically conjunctions of atoms
16 / 28
30. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
How to build an Interface
• Assume you want a graphical user interface…
• It needs to output SPARQL…
• Basically conjunctions of atoms
• Negation as failure
16 / 28
31. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
How to build an Interface
• Assume you want a graphical user interface…
• It needs to output SPARQL…
• Basically conjunctions of atoms
• Negation as failure
• Full first order expressivity
16 / 28
32. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
How to build an Interface
• Assume you want a graphical user interface…
• It needs to output SPARQL…
• Basically conjunctions of atoms
• Negation as failure
• Full first order expressivity
• Aggregation
16 / 28
33. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
How to build an Interface
• Assume you want a graphical user interface…
• It needs to output SPARQL…
• Basically conjunctions of atoms
• Negation as failure
• Full first order expressivity
• Aggregation
• and then some…
16 / 28
34. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
How to build an Interface
• Assume you want a graphical user interface…
• It needs to output SPARQL…
• Basically conjunctions of atoms
• Negation as failure
• Full first order expressivity
• Aggregation
• and then some…
• You have unlimited access to PhD students and PostDocs with a logic background
16 / 28
35. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
How to build an Interface
• Assume you want a graphical user interface…
• It needs to output SPARQL…
• Basically conjunctions of atoms
• Negation as failure
• Full first order expressivity
• Aggregation
• and then some…
• You have unlimited access to PhD students and PostDocs with a logic background
• You will get: an insanely powerful SPARQL editor…
16 / 28
36. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
How to build an Interface
• Assume you want a graphical user interface…
• It needs to output SPARQL…
• Basically conjunctions of atoms
• Negation as failure
• Full first order expressivity
• Aggregation
• and then some…
• You have unlimited access to PhD students and PostDocs with a logic background
• You will get: an insanely powerful SPARQL editor…
• Most of which will totally confuse your end users.
16 / 28
37. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
How to build an Interface
• Assume you want a graphical user interface…
• It needs to output SPARQL…
• Basically conjunctions of atoms
• Negation as failure
• Full first order expressivity
• Aggregation
• and then some…
• You have unlimited access to PhD students and PostDocs with a logic background
• You will get: an insanely powerful SPARQL editor…
• Most of which will totally confuse your end users.
• Take some HCI people instead…
16 / 28
38. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
How to build an Interface
• Assume you want a graphical user interface…
• It needs to output SPARQL…
• Basically conjunctions of atoms
• Negation as failure
• Full first order expressivity
• Aggregation
• and then some…
• You have unlimited access to PhD students and PostDocs with a logic background
• You will get: an insanely powerful SPARQL editor…
• Most of which will totally confuse your end users.
• Take some HCI people instead…
• …and you will get a very nice interface…
16 / 28
39. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
How to build an Interface
• Assume you want a graphical user interface…
• It needs to output SPARQL…
• Basically conjunctions of atoms
• Negation as failure
• Full first order expressivity
• Aggregation
• and then some…
• You have unlimited access to PhD students and PostDocs with a logic background
• You will get: an insanely powerful SPARQL editor…
• Most of which will totally confuse your end users.
• Take some HCI people instead…
• …and you will get a very nice interface…
• …that is not consistent with what the ontology means…
16 / 28
40. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
How to build an Interface
• Assume you want a graphical user interface…
• It needs to output SPARQL…
• Basically conjunctions of atoms
• Negation as failure
• Full first order expressivity
• Aggregation
• and then some…
• You have unlimited access to PhD students and PostDocs with a logic background
• You will get: an insanely powerful SPARQL editor…
• Most of which will totally confuse your end users.
• Take some HCI people instead…
• …and you will get a very nice interface…
• …that is not consistent with what the ontology means…
• which will also confuse your end users.
16 / 28
41. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
How to Run a Project
• Talking to end users, compiled a Query Catalog
• Information Need; Expert SQL query; SPARQL formulations
• Iteratively, with demos to use case owners
• Useful for setting expectations
• Sorry, there will be no “where should we drill” button!
• Useful for thinking about expressivity
• E.g. 2 of 90 queries have negation
• Useful for modeling
• Make an ontology that describes what is needed for the queries
• Many ontology-based projects fail by trying to model the world
• Useful for benchmarking
• Went from handling 50% of queries to handling 95% of queries
• Useful for prioritising
17 / 28
42. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Query Formulation
Split the task in two:
• Oxford makes back-end. Ontology, reasoning, etc.
• Oslo makes front-end. Interaction paradigms, etc.
• Use a lot of time to communicate!
18 / 28
43. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Reasoning in Query Formulation
• This is about Big Data!
• Can’t expect to consult data in the user interface.
• VQS controlled by (small) Ontology instead of (big) data.
• Reasoning task:
• Given an ontology T and C(x)
• Which atoms R(x, y) should be allowed to add, i.e.
T , C(x) ∧ R(x, y) ̸|= ⊥
• Which are most “sensible”? Ranking, based on…
• Ontology (all wells have a depth)
• Data (preprocessed…)
• Log of previous queries (users always ask about this)
• Expert intervention (never show this internal ID)
A. Soylu, E. Kharlamov, D. Zheleznyakov, E. Jimenez Ruiz, M. Giese, M. G. Skjæveland, D. Hovland, R. Schlatte, S. Brandt, H.
Lie, and I. Horrocks. OptiqueVQS: a visual query system over ontologies for industry. Semantic Web, (in press), 2017.
19 / 28
44. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Value Suggestion
• Given an unfinished query:
q = OilField(x) ∧ opBy(x, y) ∧ Company(y)
• Add that the name of the company is “Equinor”
OilField(x) ∧ opBy(x, y) ∧ Company(y) ∧ name(y, “Equinor”)
• What values for the name should VQS suggest?
• Given only OilField(x) all company names, also non-operators.
• Find a with ans(q ∧ name(y, a), (T , A)) ̸= ∅
• Have prototype using A
• What can be done with T ?
V. Klungre, M. Giese. Approximating Faceted Search for Graph Queries. Intl. Workshop on Scalable Semantic Web
Knowledge Base Systems, 2018.
20 / 28
45. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Combining Datasources
Querying many data sources as one…
1. Ontology Alignment (Schema Alignment)
• “Person” in source A is the same as “Human” in source B
• “Town” in Source A is the same as “Settlement” in source B if over 10000 inhabitants
2. Using ontology alignment in query transformation
• Axioms expressing connection may not be in supported logic
3. Entity alignment
• “Martin” in source A is the same as “Agent999” in source B
4. Using entity alignment in query transformation
• …see next slide…
5. Distributing transformed (SQL) queries to data sources
• “Query Federation”
21 / 28
46. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Alignment Example
A:Person
A:ID A:livesIn
Martin Siggerud
Arild Hvalstad
B:Human
B:ID B:worksIn
Agent007 London
Agent999 Oslo
22 / 28
47. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Alignment Example
A:Person
A:ID A:livesIn
Martin Siggerud
Arild Hvalstad
B:Human
B:ID B:worksIn
Agent007 London
Agent999 Oslo
• Ontology Alignment: Agent ≡ Human
22 / 28
48. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Alignment Example
A:Person
A:ID A:livesIn
Martin Siggerud
Arild Hvalstad
B:Human
B:ID B:worksIn
Agent007 London
Agent999 Oslo
• Ontology Alignment: Agent ≡ Human
• Entity Alignment: A:Martin ≡ B:Agent999
22 / 28
49. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Alignment Example
A:Person
A:ID A:livesIn
Martin Siggerud
Arild Hvalstad
B:Human
B:ID B:worksIn
Agent007 London
Agent999 Oslo
• Ontology Alignment: Agent ≡ Human
• Entity Alignment: A:Martin ≡ B:Agent999
• Query: Find all persons x who live in y and work in z
22 / 28
50. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Alignment Example
A:Person
A:ID A:livesIn
Martin Siggerud
Arild Hvalstad
B:Human
B:ID B:worksIn
Agent007 London
Agent999 Oslo
• Ontology Alignment: Agent ≡ Human
• Entity Alignment: A:Martin ≡ B:Agent999
• Query: Find all persons x who live in y and work in z
• Person(x) ∧ livesIn(x, y) ∧ worksIn(x, z)
22 / 28
51. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Alignment Example
A:Person
A:ID A:livesIn
Martin Siggerud
Arild Hvalstad
B:Human
B:ID B:worksIn
Agent007 London
Agent999 Oslo
• Ontology Alignment: Agent ≡ Human
• Entity Alignment: A:Martin ≡ B:Agent999
• Query: Find all persons x who live in y and work in z
• Person(x) ∧ livesIn(x, y) ∧ worksIn(x, z)
• “Classical” OBDA approach: map to common identifier
A:Martin → http://xyz/MG B:Agent999 → http://xyz/MG
22 / 28
52. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Alignment Example
A:Person
A:ID A:livesIn
Martin Siggerud
Arild Hvalstad
B:Human
B:ID B:worksIn
Agent007 London
Agent999 Oslo
• Ontology Alignment: Agent ≡ Human
• Entity Alignment: A:Martin ≡ B:Agent999
• Query: Find all persons x who live in y and work in z
• Person(x) ∧ livesIn(x, y) ∧ worksIn(x, z)
• “Classical” OBDA approach: map to common identifier
A:Martin → http://xyz/MG B:Agent999 → http://xyz/MG
• Hard to maintain a URI scheme :-(
22 / 28
53. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Alignment Example
A:Person
A:ID A:livesIn
Martin Siggerud
Arild Hvalstad
B:Human
B:ID B:worksIn
Agent007 London
Agent999 Oslo
• Ontology Alignment: Agent ≡ Human
• Entity Alignment: A:Martin ≡ B:Agent999
• Query: Find all persons x who live in y and work in z
• Person(x) ∧ livesIn(x, y) ∧ worksIn(x, z)
• “Classical” OBDA approach: map to common identifier
A:Martin → http://xyz/MG B:Agent999 → http://xyz/MG
• Hard to maintain a URI scheme :-(
• Joins through URIs only efficient if based on primary key :-(
22 / 28
54. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Alignment Example
A:Person
A:ID A:livesIn
Martin Siggerud
Arild Hvalstad
B:Human
B:ID B:worksIn
Agent007 London
Agent999 Oslo
• Ontology Alignment: Agent ≡ Human
• Entity Alignment: A:Martin ≡ B:Agent999
• Query: Find all persons x who live in y and work in z
• Person(x) ∧ livesIn(x, y) ∧ worksIn(x, z)
23 / 28
55. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Alignment Example
A:Person
A:ID A:livesIn
Martin Siggerud
Arild Hvalstad
B:Human
B:ID B:worksIn
Agent007 London
Agent999 Oslo
Alignment
A:ID B:ID
Martin Agent999
• Ontology Alignment: Agent ≡ Human
• Entity Alignment: A:Martin ≡ B:Agent999
• Query: Find all persons x who live in y and work in z
• Person(x) ∧ livesIn(x, y) ∧ worksIn(x, z)
• Mapping alignment table to a relation “same”:
Person(x) ∧ livesIn(x, y) ∧ same(x, x′) ∧ worksIn(x′, z)
23 / 28
56. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Alignment Example
A:Person
A:ID A:livesIn
Martin Siggerud
Arild Hvalstad
B:Human
B:ID B:worksIn
Agent007 London
Agent999 Oslo
Alignment
A:ID B:ID
Martin Agent999
• Ontology Alignment: Agent ≡ Human
• Entity Alignment: A:Martin ≡ B:Agent999
• Query: Find all persons x who live in y and work in z
• Person(x) ∧ livesIn(x, y) ∧ worksIn(x, z)
• Mapping alignment table to a relation “same”:
Person(x) ∧ livesIn(x, y) ∧ same(x, x′) ∧ worksIn(x′, z)
• Who inserts that “same” into the query? A geologist!?
23 / 28
57. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Alignment Example
A:Person
A:ID A:livesIn
Martin Siggerud
Arild Hvalstad
B:Human
B:ID B:worksIn
Agent007 London
Agent999 Oslo
Alignment
A:ID B:ID
Martin Agent999
• Ontology Alignment: Agent ≡ Human
• Entity Alignment: A:Martin ≡ B:Agent999
• Query: Find all persons x who live in y and work in z
• Person(x) ∧ livesIn(x, y) ∧ worksIn(x, z)
• Mapping alignment table to a relation “same”:
Person(x) ∧ livesIn(x, y) ∧ same(x, x′) ∧ worksIn(x′, z)
• Who inserts that “same” into the query? A geologist!?
• Not like querying a single data source.
23 / 28
58. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Alignment Example
A:Person
A:ID A:livesIn
Martin Siggerud
Arild Hvalstad
B:Human
B:ID B:worksIn
Agent007 London
Agent999 Oslo
Alignment
A:ID B:ID
Martin Agent999
• Ontology Alignment: Agent ≡ Human
• Entity Alignment: A:Martin ≡ B:Agent999
• Query: Find all persons x who live in y and work in z
• Person(x) ∧ livesIn(x, y) ∧ worksIn(x, z)
24 / 28
59. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Alignment Example
A:Person
A:ID A:livesIn
Martin Siggerud
Arild Hvalstad
B:Human
B:ID B:worksIn
Agent007 London
Agent999 Oslo
Alignment
A:ID B:ID
Martin Agent999
• Ontology Alignment: Agent ≡ Human
• Entity Alignment: A:Martin ≡ B:Agent999
• Query: Find all persons x who live in y and work in z
• Person(x) ∧ livesIn(x, y) ∧ worksIn(x, z)
• Mapping alignment table to equality: =
24 / 28
60. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Alignment Example
A:Person
A:ID A:livesIn
Martin Siggerud
Arild Hvalstad
B:Human
B:ID B:worksIn
Agent007 London
Agent999 Oslo
Alignment
A:ID B:ID
Martin Agent999
• Ontology Alignment: Agent ≡ Human
• Entity Alignment: A:Martin ≡ B:Agent999
• Query: Find all persons x who live in y and work in z
• Person(x) ∧ livesIn(x, y) ∧ worksIn(x, z)
• Mapping alignment table to equality: =
• Which is transitive
24 / 28
61. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Alignment Example
A:Person
A:ID A:livesIn
Martin Siggerud
Arild Hvalstad
B:Human
B:ID B:worksIn
Agent007 London
Agent999 Oslo
Alignment
A:ID B:ID
Martin Agent999
• Ontology Alignment: Agent ≡ Human
• Entity Alignment: A:Martin ≡ B:Agent999
• Query: Find all persons x who live in y and work in z
• Person(x) ∧ livesIn(x, y) ∧ worksIn(x, z)
• Mapping alignment table to equality: =
• Which is transitive
• Need to build that into query transformation or execution
24 / 28
62. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Alignment Example
A:Person
A:ID A:livesIn
Martin Siggerud
Arild Hvalstad
B:Human
B:ID B:worksIn
Agent007 London
Agent999 Oslo
Alignment
A:ID B:ID
Martin Agent999
• Ontology Alignment: Agent ≡ Human
• Entity Alignment: A:Martin ≡ B:Agent999
• Query: Find all persons x who live in y and work in z
• Person(x) ∧ livesIn(x, y) ∧ worksIn(x, z)
• Mapping alignment table to equality: =
• Which is transitive
• Need to build that into query transformation or execution
• Would require recusive SQL, raising complexity
24 / 28
63. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Alignment Solution
Observation:
• Every entity has only one ID per data source
• Otherwise: clean up first!
• Need to consider only equality between data sources
• Only have finitely many data sources
• Only need fixed finite length of equality application chains!
Equality handling can be part of query rewriting!
Input: Person(x) ∧ livesIn(x, y) ∧ worksIn(x, z)
Output: Person(x) ∧ livesIn(x, y) ∧ same(x, x′) ∧ worksIn(x′, z)
SQL federation: Exareme (http://www.exareme.org/) from Athens
G. Xiao, D. Hovland, D. Bilidas, M. Rezk, M. Giese, D. Calvanese. «Efficient Ontology-Based Data
Integration with canonical IRIs», ESWC 2018.
25 / 28
64. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
More Reasoning?
• Query transformation and execution
• Query containment: entailment between formulae
• Interesting with datatypes!
• Ontology and Mapping management
• Projection of ontologies from more to less expressive language
• Mappings contradicting the ontology
• Ontology Templates (ongoing)
• Like a macro language
• What semantic relations between templates are useful?
• Query rewriting vs. Materialisation (ongoing)
• Up-front forward reasoning on part of the data
• How to combine this with query rewriting for the rest?
26 / 28
65. Centre For Scalable Data
Access In The Oil & Gas Industry
sirius-labs.no
Status at end of project
1000 queries
on
1000 sensors
10TB/day
Online
training
modules
Querying 4TB
federated over
five databases
First paying
customer
Verified
usability
All
components
integrated
M. Giese, A. Soylu, G. Vega-Gorgojo, A. Waaler, P. Haase, E. Jiménez-Ruiz, D. Lanti, M. Rezk, G. Xiao, Ö. Özçep, and R.
Rosati. Optique: Zooming in on Big Data. IEEE Computer, 48(3), 2015.
27 / 28