SlideShare a Scribd company logo
1 of 72
Download to read offline
Ontologies Ontop Databases 
Martin Rezk (PhD) 
Faculty of Computer Science 
Free University of Bozen-Bolzano, Italy 
Semantic Web Applications and Tools for Life Sciences 
9/12/MMXIV 
1 / 61
Who am I? 
Postdoc at: Free University of Bozen-Bolzano, Bolzano, Italy. 
Research topics: Ontology-based Data Access (OBDA), 
Efficient Query Answering, Query rewriting, Data integration. 
Leader of the -ontop- project. 
2 / 61
What are we going to learn today? 
How to organize and access your data using ontologies. 
How to do it with our system: –ontop–. 
How to use this approach for data integration and consistency 
checking. 
3 / 61
Declaimer 
Part of the material here has been taken from a tutorial by 
Mariano Rodriguez (2011) 
http://www.slideshare.net/marianomx/ontop-a-tutorial 
4 / 61
Declaimer: License 
License 
This work is supported by the Optique Project and is licensed 
under the Creative Commons Attribution-Share Alike 3.0 License 
http://creativecommons.org/licenses/by-sa/3.0/ 
5 / 61
Overview 
Introduction: Optique and Ontop 
Ontology Based Data Access 
The Database: 
Ontologies 
Mappings 
Virtual Graph 
Querying 
Data Integration 
Checking Consistency 
Conclusions 
6 / 61
Outline 
Introduction: Optique and Ontop 
Ontology Based Data Access 
The Database: 
Ontologies 
Mappings 
Virtual Graph 
Querying 
Data Integration 
Checking Consistency 
Conclusions 
7 / 61
Ontop and Optique: Why? 
8 / 61
When does this Go Wrong? 
I’m sure the information is there but there are so many 
concepts involved that I can’t find it in the application. 
I can’t say what I want to the application. 
The application doesn’t know about this data. 
They Get Lost !! 
9 / 61
How they Tackle This Problem 
Engineer 
information need specialised 
Application 
IT-expert 
query 
answers 
10 / 61
How they Tackle This Problem 
May take weeks to respond. 
Takes several years to master data stores and user 
needs. 
Siemens loses up to 50.000.000 euro per year because of this!! 
11 / 61
Data Access Bottleneck 
12 / 61
Optique 
Provides a semantic end-to-end connection between users and 
data sources; 
Enables users to rapidly formulate intuitive queries using 
familiar vocabularies and conceptualisations. 
13 / 61
How Optique Tackles the Problem 
Velocity Volume 
BIG DATA 
Variety Complexity 
(i) Providing scalable big/fast data infrastructures. 
(ii) Providing the ability to cope with variety and complexity in the 
data. 
(iii) Providing a good understanding of the data. 
14 / 61
How Optique Tackles the Problem 
Velocity Volume 
BIG DATA 
Variety Complexity 
(i) Providing scalable big/fast data infrastructures. 
(ii) Providing the ability to cope with variety and complexity in the 
data. (OBDA) 
(iii) Providing a good understanding of the data. (OBDA) 
14 / 61
-ontop-: Our OBDA System 
-ontop- is a platform to query databases as virtual RDF Graphs 
using SPARQL, and exploiting OWL 2 QL ontologies and 
mappings 
It is currently being developed in the context of the Optique 
project. 
Development of -ontop- started 4.5 years ago. 
-ontop- is already well established 
website has +-500 visits per month 
got 500 new registrations in the past year 
-ontop- is open-source and released under Apache license. 
We support all major relational DBs (Oracle, DB2, Postgres, 
MySQL, etc.) 
15 / 61
Outline 
Introduction: Optique and Ontop 
Ontology Based Data Access 
The Database: 
Ontologies 
Mappings 
Virtual Graph 
Querying 
Data Integration 
Checking Consistency 
Conclusions 
16 / 61
Ontology-based data access (OBDA) 
Ontology-based 
Data Access 
Ontology 
Source Source 
Source 
Mapping 
Queries 
Data source(s): are external and independent (possibly multiple 
and heterogeneous). 
Ontology: provides a unified common vocabulary, and a 
conceptual view of the data. 
Mappings: relate each term in the ontology to a set of (SQL) 
views over (possibly federated) data. 
17 / 61
Simple Life Science Running Example 
Data source(s): Hospital Databases with cancer patients (First 
1, then 2) 
Ontology: A common domain vocabulary defining Patient, 
Cancer, LungCancer, etc. 
Mappings: Relating the vocabulary and the databases. 
18 / 61
Pre-requisites 
Before we start you need: 
Java 1.7 (check typing: java -version) 
The material online: 
https://www.dropbox.com/sh/316cavgoavjtnu3/ 
AAADoau5NuNGq3zXO4JJQ6rya?dl=0 
19 / 61
Outline 
Introduction: Optique and Ontop 
Ontology Based Data Access 
The Database: 
Ontologies 
Mappings 
Virtual Graph 
Querying 
Data Integration 
Checking Consistency 
Conclusions 
20 / 61
DB Engines and SQL 
Standard way to store LARGE volumes of data: Mature, 
Robust and FAST. 
Domain is structured as tables, data becomes rows in these 
tables. 
Powerful query language (SQL) to retrieve this data. 
Major companies developed SQL DBs for the last 30 years 
(IBM, Microsoft, Oracle).. 
21 / 61
Data Source 
Cancer Patient Database 1 
Table: tbl_patient 
PatientId Name Type Stage 
1 Mary false 4 
2 John true 7 
22 / 61
Data Source 
Cancer Patient Database 1 
Table: tbl_patient 
PatientId Name Type Stage 
1 Mary false 4 
2 John true 7 
Type is: 
false for Non-Small Cell Lung Cancer (NSCLC) 
true for Small Cell Lung Cancer (SCLC) 
Stage is: 
1-6 for NSCLC stages: I,II,III,IIIa,IIIb,IV 
7-8 for SCLC stages: Limited,Extensive 
22 / 61
Creating the DB in H2 
H2 is a pure java SQL database 
Just unzip the downloaded package 
Easy to run, just run the scripts: 
Open a terminal (in mac Terminal.app, in windows run cmd.exe) 
Move to the H2 folder (e.g., cd h2) 
Start H2 using the h2 scripts 
sh h2.sh (in mac/linux - You might need “chmod u+x h2.sh ”) 
h2w.bat (in windows) 
23 / 61
How it looks: 
jdbc:h2:tcp: = protocol information 
locahost = server location 
helloworld= database name 24 / 61
How to access it from the web 
25 / 61
Creating the table 
You can use the files create.sql and insert.sql 
CREATE TABLE "tbl_patient" ( 
patientid INT NOT NULL PRIMARY KEY, 
name VARCHAR(40), 
type BOOLEAN, 
stage TINYINT 
) 
Adding Data: 
INSERT INTO "tbl_patient" 
(patientid,name,type,stage) 
VALUES 
(1,'Mary',false,4), 
(2,'John',true,7); 
26 / 61
Example SQL Query 
Patients with type false and stage IIIa or above (select.sql) 
SELECT patientid 
FROM "tbl_patient" 
WHERE 
TYPE = false AND stage >= 4 
27 / 61
The More Meaningful Question 
Give me the id and the name of the patients with a tumor at stage 
IIIa 
28 / 61
Outline 
Introduction: Optique and Ontop 
Ontology Based Data Access 
The Database: 
Ontologies 
Mappings 
Virtual Graph 
Querying 
Data Integration 
Checking Consistency 
Conclusions 
29 / 61
Ontologies: Conceptual view 
Definition 
An artifact that contains a vocabulary, relations between 
the terms in the vocabulary, and that is expressed in a 
language whose syntax and semantics (meaning of the 
syntax) are shared and agreed upon. 
Mike type Patient 
NSCLC subClassOf LungCancer 
LungCancer subClassOf Cancer 
30 / 61
Ontologies: Protege 
Go to the protégé-ontop folder from your material. This is a 
Protégé 4.3 package that includes the ontop plugin 
Run Protégé from the console using the run.bat or run.sh 
scripts. That is, execute: 
cd Protege_4.3/; run.sh 
31 / 61
The ontology: Creating Concepts and 
Properties 
Add the concept: Patient. 
(See PatientOnto.owl) 
32 / 61
The ontology: Creating Concepts and 
Properties 
Add these object properties: 
(See PatientOnto.owl) 
32 / 61
The ontology: Creating Concepts and 
Properties 
Add these data properties: 
(See PatientOnto.owl) 
32 / 61
Outline 
Introduction: Optique and Ontop 
Ontology Based Data Access 
The Database: 
Ontologies 
Mappings 
Virtual Graph 
Querying 
Data Integration 
Checking Consistency 
Conclusions 
33 / 61
Mappings 
We have the vocabulary, the database, now we need to link 
those two. 
Mappings define triples (subject, property, object) out of SQL 
queries. 
These triples is accessible during query time (the on-the-fly 
approach) or can be imported into the OWL ontology (the ETL 
approach) 
Definition (Intuition) 
A mapping have the following form: 
TripleTemplate   SQL Query to build the triples 
represents triples constructed from each result row 
returned by the SQL query in the mapping. 
34 / 61
The Mappings 
p.Id Name Type Stage 
1 Mary false 2 
(:db1/{p.id},type, :Patient)   Select p.id From tbl_patient 
(:db1/{p.id},:hasName, {name})   Select p.id,name From tbl_patient 
(:db1/{p.id},:hasNeoplasm, :db1/neoplasm/{p.id})   
Select p.id From tbl_patient 
(:db1/neoplasm/{p.id},:hasStage, :stage-IIIa)   
Select p.id From tbl_patient where stage=4 
35 / 61
The Mappings 
p.Id Name Type Stage 
1 Mary false 2 
(:db1/{p.id},type, :Patient)   Select p.id From tbl_patient 
(:db1/{p.id},:hasName, {name})   Select p.id,name From tbl_patient 
(:db1/{p.id},:hasNeoplasm, :db1/neoplasm/{p.id})   
Select p.id From tbl_patient 
(:db1/neoplasm/{p.id},:hasStage, :stage-IIIa)   
Select p.id From tbl_patient where stage=4 
35 / 61
The Mappings 
Using the Ontop Mapping tab, we now need to define the 
connection parameters to our lung cancer database 
Steps: 
1. Switch to the Ontop Mapping tab 
2. Add a new data source (give it a name, e.g., PatientDB) 
3. Define the connection parameters as follows: 
Connection URL: jdbc:h2:tcp://localhost/helloworld 
Username: sa 
Password: (leave empty) 
Driver class: org.h2.Driver (choose it from the drop down menu) 
4. Test the connection using the “Test Connection” button 
36 / 61
The Mappings 
37 / 61
The Mappings 
Switch to the “Mapping Manager” tab in the ontop mappings 
tab. 
Select your datasource 
click Create: 
target: :db1/{patientid} a :Patient . 
source: SELECT patientid FROM "tbl_patient" 
target: :db1/{patientid} :hasName {name} . 
source: Select patientid,name FROM "tbl_patient" 
target: :db1/{patientid} :hasNeoplasm :db1/neoplasm/{patientid}. 
source: SELECT patientid FROM "tbl_patient" 
target: :db1/neoplasm/{patientid} :hasStage :stage-IIIa . 
source: SELECT patientid FROM "tbl_patient" where stage=4 
38 / 61
The Mappings 
Now we classify the neoplasm individual using our knowledge of 
the database. 
We know that “false” in the table patient indicates a “Non 
Small Cell Lung Cancer”, so we classify the patients as a 
:NSCLC. 
39 / 61
Outline 
Introduction: Optique and Ontop 
Ontology Based Data Access 
The Database: 
Ontologies 
Mappings 
Virtual Graph 
Querying 
Data Integration 
Checking Consistency 
Conclusions 
40 / 61
Virtual Graph 
Data: 
The vocabulary is more domain oriented and independent from 
the DB. 
No more values to encode types or stages. 
Later, this will allow us to easily integrate new data or domain 
information (e.g., an ontology). 
Our data sources are now documented!. 
41 / 61
Virtual Graph 
Data and Inference: 
There is a new individual :db1/neoplasm/1 that stands for the 
cancer (tumor) of Mary. This allows the user to query specific 
properties of the tumor independently of the patient. 
We get extra information as shown above. 
41 / 61
Outline 
Introduction: Optique and Ontop 
Ontology Based Data Access 
The Database: 
Ontologies 
Mappings 
Virtual Graph 
Querying 
Data Integration 
Checking Consistency 
Conclusions 
42 / 61
On-the-fly access to the DB 
Recall our information need: Give me the id and the name of 
the patients with a tumor at stage IIIa. 
Enable Ontop in the “Reasoner” menu 
43 / 61
On-the-fly access to the DB 
In the ontop SPARQL tab add all the prefixes 
44 / 61
On-the-fly access to the DB 
Write the SPARQL Query 
SELECT ?p ?name WHERE 
{ ?p rdf:type :Patient . 
?p :hasName ?name . 
?p :hasNeoplasm ?tumor . 
?tumor :hasStage :stage-IIIa .} 
Click execute 
This is the main way to access data in ontop and its done by 
querying ontop with SPARQL. 
45 / 61
How we do Inference... 
We embed inference into the query 
We do not need to reason with the (Big) Data 
46 / 61
Standards 
Ontology languages: RDF, RDFS, OWL (W3C 
recommendations) 
Query: SQL, SPARQL 
Mappings: R2RML (W3C recommendation) 
47 / 61
Outline 
Introduction: Optique and Ontop 
Ontology Based Data Access 
The Database: 
Ontologies 
Mappings 
Virtual Graph 
Querying 
Data Integration 
Checking Consistency 
Conclusions 
48 / 61
What about Data Integration? 
Cancer Patient Database 2 
T_Name 
PId Nombre 
1 Anna 
2 Mike DB information is distributed in multiple tables. 
The IDs of the two DBs overlap. 
T_NSCLC 
Id hosp Stge 
1 X two 
2 Y one Information is encoded differently. E.g. Stage of 
cancer is text (one, two...) 
T_SCLC 
key hosp St 
1 XXX 
2 YYY 49 / 61
New Mappings 
50 / 61
New Mappings 
50 / 61
New Mappings 
The URI’s for the new individuals differentiate the data sources 
(db2 vs. db1) 
Being an instance of NSCLC and SCLC depends now on the 
table, not a column value 
50 / 61
Outline 
Introduction: Optique and Ontop 
Ontology Based Data Access 
The Database: 
Ontologies 
Mappings 
Virtual Graph 
Querying 
Data Integration 
Checking Consistency 
Conclusions 
51 / 61
Consistency 
A logic based ontology language, such as OWL, allows 
ontologies to be specified as logical theories, this implies that 
it is possible to constrain the relationships between concepts, 
properties, and data. 
In OBDA inconsistencies arise when your mappings violate the 
constraints imposed by the ontology. 
In OBDA we have two types of constraints: 
Disjointness: The intersection between classes Patient and 
Employee should be empty. There can also be disjoint 
properties. 
Functional Properties: Every patient has at most one name. 
52 / 61
Consistency: Setting up a Constraint 
53 / 61
Consistency: Building a wrong mapping 
54 / 61
Consistency: Checking Inconsistency 
55 / 61
Consistency: Finding out the Problem 
56 / 61
Outline 
Introduction: Optique and Ontop 
Ontology Based Data Access 
The Database: 
Ontologies 
Mappings 
Virtual Graph 
Querying 
Data Integration 
Checking Consistency 
Conclusions 
57 / 61
Conclusions 
Ontologies give you a common vocabulary to formulate the 
queries, and mappings to find the answers. 
Ontologies and Semantic Technology can help to handle the 
problem of accessing Big Data 
Diversity: 
Using ontologies describing particular domains allows to hide 
the storage complexity. 
Agreement on data identifiers allows for integration of datasets. 
Understanding: Agreement on vocabulary allow to better define 
your data and allows for easy information exchange. 
There is no need of computationally expensive ETL processes. 
Reasoning is scalable because we reason at the query level. 
You do not need to have everything ready to use it! 
58 / 61
What I left outside this talk... 
Semantic Query Optimisation 
SWRL and Recursion 
Performance Evaluation 
Aggregates and bag semantics 
Give out about Database Engines 
Tons of theory 
etc. etc. etc... 
59 / 61
Thanks!!! 
THANKS!!! 
http://ontop.inf.unibz.it 
www.optique-project.eu
Extra: Where Reasoning takes Place 
If we pose the query asking for all the instances of the class 
Neoplasm: 
SELECT ?x WHERE { ?x rdf:type :Noeplasms). } 
61 / 61
Extra: Where Reasoning takes Place 
If we pose the query asking for all the instances of the class 
Neoplasm: 
SELECT ?x WHERE { ?x rdf:type :Noeplasms). } 
(Intuitively) -ontop- will translate it into: 
SELECT ?x WHERE { { ?x rdf:type :Neoplasms. } 
UNION 
{ ?x rdf:type :BenignNeoplasms. } 
UNION 
{ ?x rdf:type :MalignantNeoplasm. } 
UNION 
... 
{ ?x rdf:type :NSCLC). } 
UNION 
{ ?x rdf:type :SCLC). } } 
61 / 61
Extra: Where Reasoning takes Place 
If we pose the query asking for all the instances of the class 
Neoplasm: 
SELECT ?x WHERE { ?x rdf:type :Noeplasms). } 
(Intuitively) -ontop- will translate it into: 
SELECT ?x WHERE { {?x rdf:type :Neoplasms.} 
UNION 
{ ?x rdf:type :BenignNeoplasms. } 
UNION 
{ ?x rdf:type :MalignantNeoplasm . } 
UNION 
... 
{ ?x rdf:type :NSCLC). } 
UNION 
{ ?x rdf:type :SCLC). } } 
61 / 61
Extra: Where Reasoning takes Place 
If we pose the query asking for all the instances of the class 
Neoplasm: 
SELECT ?x WHERE { ?x rdf:type :Noeplasms). } 
(Intuitively) -ontop- will translate it into: 
SELECT Concat(:db1/neoplasm/, TBL.PATIENT.id) AS ?x 
FROM TBL.PATIENT 
61 / 61

More Related Content

What's hot

Model-based Analysis of Large Scale Software Repositories
Model-based Analysis of Large Scale Software RepositoriesModel-based Analysis of Large Scale Software Repositories
Model-based Analysis of Large Scale Software Repositories
Markus Scheidgen
 
LiveLinkedData - TransWebData - Nantes 2013
LiveLinkedData - TransWebData - Nantes 2013LiveLinkedData - TransWebData - Nantes 2013
LiveLinkedData - TransWebData - Nantes 2013
Luis Daniel Ibáñez
 
Deep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictions
Deep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictionsDeep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictions
Deep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictions
Valery Tkachenko
 

What's hot (20)

Analysis of the “KDD Cup-1999” Datasets
Analysis of the  “KDD Cup-1999”  DatasetsAnalysis of the  “KDD Cup-1999”  Datasets
Analysis of the “KDD Cup-1999” Datasets
 
10 xrd-software
10 xrd-software10 xrd-software
10 xrd-software
 
IO Streams, Serialization, de-serialization, autoboxing
IO Streams, Serialization, de-serialization, autoboxingIO Streams, Serialization, de-serialization, autoboxing
IO Streams, Serialization, de-serialization, autoboxing
 
SQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail Science
SQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail ScienceSQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail Science
SQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail Science
 
sqlmap internals
sqlmap internalssqlmap internals
sqlmap internals
 
Model-based Analysis of Large Scale Software Repositories
Model-based Analysis of Large Scale Software RepositoriesModel-based Analysis of Large Scale Software Repositories
Model-based Analysis of Large Scale Software Repositories
 
Using publicly available resources to build a comprehensive knowledgebase of ...
Using publicly available resources to build a comprehensive knowledgebase of ...Using publicly available resources to build a comprehensive knowledgebase of ...
Using publicly available resources to build a comprehensive knowledgebase of ...
 
sqlmap internals
sqlmap internalssqlmap internals
sqlmap internals
 
Reference Representation in Large Metamodel-based Datasets
Reference Representation in Large Metamodel-based DatasetsReference Representation in Large Metamodel-based Datasets
Reference Representation in Large Metamodel-based Datasets
 
Building Search & Recommendation Engines
Building Search & Recommendation EnginesBuilding Search & Recommendation Engines
Building Search & Recommendation Engines
 
LiveLinkedData - TransWebData - Nantes 2013
LiveLinkedData - TransWebData - Nantes 2013LiveLinkedData - TransWebData - Nantes 2013
LiveLinkedData - TransWebData - Nantes 2013
 
A Workshop on R
A Workshop on RA Workshop on R
A Workshop on R
 
Limits of RDBMS and Need for NoSQL in Bioinformatics
Limits of RDBMS and Need for NoSQL in BioinformaticsLimits of RDBMS and Need for NoSQL in Bioinformatics
Limits of RDBMS and Need for NoSQL in Bioinformatics
 
How to Build a Semantic Search System
How to Build a Semantic Search SystemHow to Build a Semantic Search System
How to Build a Semantic Search System
 
Deep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictions
Deep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictionsDeep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictions
Deep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictions
 
OntoMaven Repositories and OMG API4KP
OntoMaven Repositories and OMG API4KPOntoMaven Repositories and OMG API4KP
OntoMaven Repositories and OMG API4KP
 
Linking data without common identifiers
Linking data without common identifiersLinking data without common identifiers
Linking data without common identifiers
 
Text mining meets neural nets
Text mining meets neural netsText mining meets neural nets
Text mining meets neural nets
 
14. Files - Data Structures using C++ by Varsha Patil
14. Files - Data Structures using C++ by Varsha Patil14. Files - Data Structures using C++ by Varsha Patil
14. Files - Data Structures using C++ by Varsha Patil
 
Implementing dual model systems
Implementing dual model systemsImplementing dual model systems
Implementing dual model systems
 

Viewers also liked

Viewers also liked (15)

Wegwijs in zedelgem_2015-infogids
Wegwijs in zedelgem_2015-infogidsWegwijs in zedelgem_2015-infogids
Wegwijs in zedelgem_2015-infogids
 
Top Indian Mobile App to pay Bar and restaurant bills.
Top Indian Mobile App to pay Bar and restaurant bills.Top Indian Mobile App to pay Bar and restaurant bills.
Top Indian Mobile App to pay Bar and restaurant bills.
 
CA External WAAE Roadmap - UK User Group - CA Workload Automation Technology ...
CA External WAAE Roadmap - UK User Group - CA Workload Automation Technology ...CA External WAAE Roadmap - UK User Group - CA Workload Automation Technology ...
CA External WAAE Roadmap - UK User Group - CA Workload Automation Technology ...
 
Bottled water canmixing case-study-ihwf india
Bottled water canmixing case-study-ihwf indiaBottled water canmixing case-study-ihwf india
Bottled water canmixing case-study-ihwf india
 
Question 2
Question 2Question 2
Question 2
 
Wegwijs in moorslede_informatiegids2013-2015
Wegwijs in moorslede_informatiegids2013-2015Wegwijs in moorslede_informatiegids2013-2015
Wegwijs in moorslede_informatiegids2013-2015
 
Endüstriyel Soğutma Sistemleri / Industrial Cooling Solutions
Endüstriyel Soğutma Sistemleri / Industrial Cooling SolutionsEndüstriyel Soğutma Sistemleri / Industrial Cooling Solutions
Endüstriyel Soğutma Sistemleri / Industrial Cooling Solutions
 
Double page spread audience feedback
Double page spread audience feedbackDouble page spread audience feedback
Double page spread audience feedback
 
Deepstep Rocks the Country Club - Atlanta Country and Western Bars
Deepstep Rocks the Country Club - Atlanta Country and Western BarsDeepstep Rocks the Country Club - Atlanta Country and Western Bars
Deepstep Rocks the Country Club - Atlanta Country and Western Bars
 
Abdul rahman cv
Abdul rahman cvAbdul rahman cv
Abdul rahman cv
 
Death of Karen Carpenter - InfoBarrel
Death of Karen Carpenter - InfoBarrelDeath of Karen Carpenter - InfoBarrel
Death of Karen Carpenter - InfoBarrel
 
Stephen Kazman - School Trustee Ward 5
Stephen Kazman - School Trustee Ward 5Stephen Kazman - School Trustee Ward 5
Stephen Kazman - School Trustee Ward 5
 
De haan wegwijs-2014
De haan wegwijs-2014De haan wegwijs-2014
De haan wegwijs-2014
 
Contents audience feedback
Contents audience feedbackContents audience feedback
Contents audience feedback
 
Neuer produkt katalog_2016
Neuer produkt katalog_2016Neuer produkt katalog_2016
Neuer produkt katalog_2016
 

Similar to Ontologies Ontop Databases

Similar to Ontologies Ontop Databases (20)

Computation and Knowledge
Computation and KnowledgeComputation and Knowledge
Computation and Knowledge
 
BDE SC1 Workshop 3 - Open PHACTS Pilot (Kiera McNeice)
BDE SC1 Workshop 3 - Open PHACTS Pilot (Kiera McNeice)BDE SC1 Workshop 3 - Open PHACTS Pilot (Kiera McNeice)
BDE SC1 Workshop 3 - Open PHACTS Pilot (Kiera McNeice)
 
Pine education-platform
Pine education-platformPine education-platform
Pine education-platform
 
Data mining weka
Data mining wekaData mining weka
Data mining weka
 
Poster (1)
Poster (1)Poster (1)
Poster (1)
 
SFSCON23 - Michele Finelli - Management of large genomic data with free software
SFSCON23 - Michele Finelli - Management of large genomic data with free softwareSFSCON23 - Michele Finelli - Management of large genomic data with free software
SFSCON23 - Michele Finelli - Management of large genomic data with free software
 
Computational Resources In Infectious Disease
Computational Resources In Infectious DiseaseComputational Resources In Infectious Disease
Computational Resources In Infectious Disease
 
ICWE2017 BigDataEurope
ICWE2017 BigDataEuropeICWE2017 BigDataEurope
ICWE2017 BigDataEurope
 
1 1 anatomy of an app
1 1 anatomy of an app1 1 anatomy of an app
1 1 anatomy of an app
 
A consistent and efficient graphical User Interface Design and Querying Organ...
A consistent and efficient graphical User Interface Design and Querying Organ...A consistent and efficient graphical User Interface Design and Querying Organ...
A consistent and efficient graphical User Interface Design and Querying Organ...
 
Software Pipelines: The Good, The Bad and The Ugly
Software Pipelines: The Good, The Bad and The UglySoftware Pipelines: The Good, The Bad and The Ugly
Software Pipelines: The Good, The Bad and The Ugly
 
2015 genome-center
2015 genome-center2015 genome-center
2015 genome-center
 
A scalable ontology reasoner via incremental materialization
A scalable ontology reasoner via incremental materializationA scalable ontology reasoner via incremental materialization
A scalable ontology reasoner via incremental materialization
 
Reproducible Science and Deep Software Variability
Reproducible Science and Deep Software VariabilityReproducible Science and Deep Software Variability
Reproducible Science and Deep Software Variability
 
Driving Deep Semantics in Middleware and Networks: What, why and how?
Driving Deep Semantics in Middleware and Networks: What, why and how?Driving Deep Semantics in Middleware and Networks: What, why and how?
Driving Deep Semantics in Middleware and Networks: What, why and how?
 
Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021 Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021
 
LO-PHI: Low-Observable Physical Host Instrumentation for Malware Analysis
LO-PHI: Low-Observable Physical Host Instrumentation for Malware AnalysisLO-PHI: Low-Observable Physical Host Instrumentation for Malware Analysis
LO-PHI: Low-Observable Physical Host Instrumentation for Malware Analysis
 
Reproducibility: 10 Simple Rules
Reproducibility: 10 Simple RulesReproducibility: 10 Simple Rules
Reproducibility: 10 Simple Rules
 
Facilitating Data Curation: a Solution Developed in the Toxicology Domain
Facilitating Data Curation: a Solution Developed in the Toxicology DomainFacilitating Data Curation: a Solution Developed in the Toxicology Domain
Facilitating Data Curation: a Solution Developed in the Toxicology Domain
 
2015-02-10 The Open PHACTS Discovery Platform: Semantic Data Integration for ...
2015-02-10 The Open PHACTS Discovery Platform: Semantic Data Integration for ...2015-02-10 The Open PHACTS Discovery Platform: Semantic Data Integration for ...
2015-02-10 The Open PHACTS Discovery Platform: Semantic Data Integration for ...
 

Recently uploaded

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Recently uploaded (20)

DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 

Ontologies Ontop Databases

  • 1. Ontologies Ontop Databases Martin Rezk (PhD) Faculty of Computer Science Free University of Bozen-Bolzano, Italy Semantic Web Applications and Tools for Life Sciences 9/12/MMXIV 1 / 61
  • 2. Who am I? Postdoc at: Free University of Bozen-Bolzano, Bolzano, Italy. Research topics: Ontology-based Data Access (OBDA), Efficient Query Answering, Query rewriting, Data integration. Leader of the -ontop- project. 2 / 61
  • 3. What are we going to learn today? How to organize and access your data using ontologies. How to do it with our system: –ontop–. How to use this approach for data integration and consistency checking. 3 / 61
  • 4. Declaimer Part of the material here has been taken from a tutorial by Mariano Rodriguez (2011) http://www.slideshare.net/marianomx/ontop-a-tutorial 4 / 61
  • 5. Declaimer: License License This work is supported by the Optique Project and is licensed under the Creative Commons Attribution-Share Alike 3.0 License http://creativecommons.org/licenses/by-sa/3.0/ 5 / 61
  • 6. Overview Introduction: Optique and Ontop Ontology Based Data Access The Database: Ontologies Mappings Virtual Graph Querying Data Integration Checking Consistency Conclusions 6 / 61
  • 7. Outline Introduction: Optique and Ontop Ontology Based Data Access The Database: Ontologies Mappings Virtual Graph Querying Data Integration Checking Consistency Conclusions 7 / 61
  • 8. Ontop and Optique: Why? 8 / 61
  • 9. When does this Go Wrong? I’m sure the information is there but there are so many concepts involved that I can’t find it in the application. I can’t say what I want to the application. The application doesn’t know about this data. They Get Lost !! 9 / 61
  • 10. How they Tackle This Problem Engineer information need specialised Application IT-expert query answers 10 / 61
  • 11. How they Tackle This Problem May take weeks to respond. Takes several years to master data stores and user needs. Siemens loses up to 50.000.000 euro per year because of this!! 11 / 61
  • 13. Optique Provides a semantic end-to-end connection between users and data sources; Enables users to rapidly formulate intuitive queries using familiar vocabularies and conceptualisations. 13 / 61
  • 14. How Optique Tackles the Problem Velocity Volume BIG DATA Variety Complexity (i) Providing scalable big/fast data infrastructures. (ii) Providing the ability to cope with variety and complexity in the data. (iii) Providing a good understanding of the data. 14 / 61
  • 15. How Optique Tackles the Problem Velocity Volume BIG DATA Variety Complexity (i) Providing scalable big/fast data infrastructures. (ii) Providing the ability to cope with variety and complexity in the data. (OBDA) (iii) Providing a good understanding of the data. (OBDA) 14 / 61
  • 16. -ontop-: Our OBDA System -ontop- is a platform to query databases as virtual RDF Graphs using SPARQL, and exploiting OWL 2 QL ontologies and mappings It is currently being developed in the context of the Optique project. Development of -ontop- started 4.5 years ago. -ontop- is already well established website has +-500 visits per month got 500 new registrations in the past year -ontop- is open-source and released under Apache license. We support all major relational DBs (Oracle, DB2, Postgres, MySQL, etc.) 15 / 61
  • 17. Outline Introduction: Optique and Ontop Ontology Based Data Access The Database: Ontologies Mappings Virtual Graph Querying Data Integration Checking Consistency Conclusions 16 / 61
  • 18. Ontology-based data access (OBDA) Ontology-based Data Access Ontology Source Source Source Mapping Queries Data source(s): are external and independent (possibly multiple and heterogeneous). Ontology: provides a unified common vocabulary, and a conceptual view of the data. Mappings: relate each term in the ontology to a set of (SQL) views over (possibly federated) data. 17 / 61
  • 19. Simple Life Science Running Example Data source(s): Hospital Databases with cancer patients (First 1, then 2) Ontology: A common domain vocabulary defining Patient, Cancer, LungCancer, etc. Mappings: Relating the vocabulary and the databases. 18 / 61
  • 20. Pre-requisites Before we start you need: Java 1.7 (check typing: java -version) The material online: https://www.dropbox.com/sh/316cavgoavjtnu3/ AAADoau5NuNGq3zXO4JJQ6rya?dl=0 19 / 61
  • 21. Outline Introduction: Optique and Ontop Ontology Based Data Access The Database: Ontologies Mappings Virtual Graph Querying Data Integration Checking Consistency Conclusions 20 / 61
  • 22. DB Engines and SQL Standard way to store LARGE volumes of data: Mature, Robust and FAST. Domain is structured as tables, data becomes rows in these tables. Powerful query language (SQL) to retrieve this data. Major companies developed SQL DBs for the last 30 years (IBM, Microsoft, Oracle).. 21 / 61
  • 23. Data Source Cancer Patient Database 1 Table: tbl_patient PatientId Name Type Stage 1 Mary false 4 2 John true 7 22 / 61
  • 24. Data Source Cancer Patient Database 1 Table: tbl_patient PatientId Name Type Stage 1 Mary false 4 2 John true 7 Type is: false for Non-Small Cell Lung Cancer (NSCLC) true for Small Cell Lung Cancer (SCLC) Stage is: 1-6 for NSCLC stages: I,II,III,IIIa,IIIb,IV 7-8 for SCLC stages: Limited,Extensive 22 / 61
  • 25. Creating the DB in H2 H2 is a pure java SQL database Just unzip the downloaded package Easy to run, just run the scripts: Open a terminal (in mac Terminal.app, in windows run cmd.exe) Move to the H2 folder (e.g., cd h2) Start H2 using the h2 scripts sh h2.sh (in mac/linux - You might need “chmod u+x h2.sh ”) h2w.bat (in windows) 23 / 61
  • 26. How it looks: jdbc:h2:tcp: = protocol information locahost = server location helloworld= database name 24 / 61
  • 27. How to access it from the web 25 / 61
  • 28. Creating the table You can use the files create.sql and insert.sql CREATE TABLE "tbl_patient" ( patientid INT NOT NULL PRIMARY KEY, name VARCHAR(40), type BOOLEAN, stage TINYINT ) Adding Data: INSERT INTO "tbl_patient" (patientid,name,type,stage) VALUES (1,'Mary',false,4), (2,'John',true,7); 26 / 61
  • 29. Example SQL Query Patients with type false and stage IIIa or above (select.sql) SELECT patientid FROM "tbl_patient" WHERE TYPE = false AND stage >= 4 27 / 61
  • 30. The More Meaningful Question Give me the id and the name of the patients with a tumor at stage IIIa 28 / 61
  • 31. Outline Introduction: Optique and Ontop Ontology Based Data Access The Database: Ontologies Mappings Virtual Graph Querying Data Integration Checking Consistency Conclusions 29 / 61
  • 32. Ontologies: Conceptual view Definition An artifact that contains a vocabulary, relations between the terms in the vocabulary, and that is expressed in a language whose syntax and semantics (meaning of the syntax) are shared and agreed upon. Mike type Patient NSCLC subClassOf LungCancer LungCancer subClassOf Cancer 30 / 61
  • 33. Ontologies: Protege Go to the protégé-ontop folder from your material. This is a Protégé 4.3 package that includes the ontop plugin Run Protégé from the console using the run.bat or run.sh scripts. That is, execute: cd Protege_4.3/; run.sh 31 / 61
  • 34. The ontology: Creating Concepts and Properties Add the concept: Patient. (See PatientOnto.owl) 32 / 61
  • 35. The ontology: Creating Concepts and Properties Add these object properties: (See PatientOnto.owl) 32 / 61
  • 36. The ontology: Creating Concepts and Properties Add these data properties: (See PatientOnto.owl) 32 / 61
  • 37. Outline Introduction: Optique and Ontop Ontology Based Data Access The Database: Ontologies Mappings Virtual Graph Querying Data Integration Checking Consistency Conclusions 33 / 61
  • 38. Mappings We have the vocabulary, the database, now we need to link those two. Mappings define triples (subject, property, object) out of SQL queries. These triples is accessible during query time (the on-the-fly approach) or can be imported into the OWL ontology (the ETL approach) Definition (Intuition) A mapping have the following form: TripleTemplate SQL Query to build the triples represents triples constructed from each result row returned by the SQL query in the mapping. 34 / 61
  • 39. The Mappings p.Id Name Type Stage 1 Mary false 2 (:db1/{p.id},type, :Patient) Select p.id From tbl_patient (:db1/{p.id},:hasName, {name}) Select p.id,name From tbl_patient (:db1/{p.id},:hasNeoplasm, :db1/neoplasm/{p.id}) Select p.id From tbl_patient (:db1/neoplasm/{p.id},:hasStage, :stage-IIIa) Select p.id From tbl_patient where stage=4 35 / 61
  • 40. The Mappings p.Id Name Type Stage 1 Mary false 2 (:db1/{p.id},type, :Patient) Select p.id From tbl_patient (:db1/{p.id},:hasName, {name}) Select p.id,name From tbl_patient (:db1/{p.id},:hasNeoplasm, :db1/neoplasm/{p.id}) Select p.id From tbl_patient (:db1/neoplasm/{p.id},:hasStage, :stage-IIIa) Select p.id From tbl_patient where stage=4 35 / 61
  • 41. The Mappings Using the Ontop Mapping tab, we now need to define the connection parameters to our lung cancer database Steps: 1. Switch to the Ontop Mapping tab 2. Add a new data source (give it a name, e.g., PatientDB) 3. Define the connection parameters as follows: Connection URL: jdbc:h2:tcp://localhost/helloworld Username: sa Password: (leave empty) Driver class: org.h2.Driver (choose it from the drop down menu) 4. Test the connection using the “Test Connection” button 36 / 61
  • 43. The Mappings Switch to the “Mapping Manager” tab in the ontop mappings tab. Select your datasource click Create: target: :db1/{patientid} a :Patient . source: SELECT patientid FROM "tbl_patient" target: :db1/{patientid} :hasName {name} . source: Select patientid,name FROM "tbl_patient" target: :db1/{patientid} :hasNeoplasm :db1/neoplasm/{patientid}. source: SELECT patientid FROM "tbl_patient" target: :db1/neoplasm/{patientid} :hasStage :stage-IIIa . source: SELECT patientid FROM "tbl_patient" where stage=4 38 / 61
  • 44. The Mappings Now we classify the neoplasm individual using our knowledge of the database. We know that “false” in the table patient indicates a “Non Small Cell Lung Cancer”, so we classify the patients as a :NSCLC. 39 / 61
  • 45. Outline Introduction: Optique and Ontop Ontology Based Data Access The Database: Ontologies Mappings Virtual Graph Querying Data Integration Checking Consistency Conclusions 40 / 61
  • 46. Virtual Graph Data: The vocabulary is more domain oriented and independent from the DB. No more values to encode types or stages. Later, this will allow us to easily integrate new data or domain information (e.g., an ontology). Our data sources are now documented!. 41 / 61
  • 47. Virtual Graph Data and Inference: There is a new individual :db1/neoplasm/1 that stands for the cancer (tumor) of Mary. This allows the user to query specific properties of the tumor independently of the patient. We get extra information as shown above. 41 / 61
  • 48. Outline Introduction: Optique and Ontop Ontology Based Data Access The Database: Ontologies Mappings Virtual Graph Querying Data Integration Checking Consistency Conclusions 42 / 61
  • 49. On-the-fly access to the DB Recall our information need: Give me the id and the name of the patients with a tumor at stage IIIa. Enable Ontop in the “Reasoner” menu 43 / 61
  • 50. On-the-fly access to the DB In the ontop SPARQL tab add all the prefixes 44 / 61
  • 51. On-the-fly access to the DB Write the SPARQL Query SELECT ?p ?name WHERE { ?p rdf:type :Patient . ?p :hasName ?name . ?p :hasNeoplasm ?tumor . ?tumor :hasStage :stage-IIIa .} Click execute This is the main way to access data in ontop and its done by querying ontop with SPARQL. 45 / 61
  • 52. How we do Inference... We embed inference into the query We do not need to reason with the (Big) Data 46 / 61
  • 53. Standards Ontology languages: RDF, RDFS, OWL (W3C recommendations) Query: SQL, SPARQL Mappings: R2RML (W3C recommendation) 47 / 61
  • 54. Outline Introduction: Optique and Ontop Ontology Based Data Access The Database: Ontologies Mappings Virtual Graph Querying Data Integration Checking Consistency Conclusions 48 / 61
  • 55. What about Data Integration? Cancer Patient Database 2 T_Name PId Nombre 1 Anna 2 Mike DB information is distributed in multiple tables. The IDs of the two DBs overlap. T_NSCLC Id hosp Stge 1 X two 2 Y one Information is encoded differently. E.g. Stage of cancer is text (one, two...) T_SCLC key hosp St 1 XXX 2 YYY 49 / 61
  • 58. New Mappings The URI’s for the new individuals differentiate the data sources (db2 vs. db1) Being an instance of NSCLC and SCLC depends now on the table, not a column value 50 / 61
  • 59. Outline Introduction: Optique and Ontop Ontology Based Data Access The Database: Ontologies Mappings Virtual Graph Querying Data Integration Checking Consistency Conclusions 51 / 61
  • 60. Consistency A logic based ontology language, such as OWL, allows ontologies to be specified as logical theories, this implies that it is possible to constrain the relationships between concepts, properties, and data. In OBDA inconsistencies arise when your mappings violate the constraints imposed by the ontology. In OBDA we have two types of constraints: Disjointness: The intersection between classes Patient and Employee should be empty. There can also be disjoint properties. Functional Properties: Every patient has at most one name. 52 / 61
  • 61. Consistency: Setting up a Constraint 53 / 61
  • 62. Consistency: Building a wrong mapping 54 / 61
  • 64. Consistency: Finding out the Problem 56 / 61
  • 65. Outline Introduction: Optique and Ontop Ontology Based Data Access The Database: Ontologies Mappings Virtual Graph Querying Data Integration Checking Consistency Conclusions 57 / 61
  • 66. Conclusions Ontologies give you a common vocabulary to formulate the queries, and mappings to find the answers. Ontologies and Semantic Technology can help to handle the problem of accessing Big Data Diversity: Using ontologies describing particular domains allows to hide the storage complexity. Agreement on data identifiers allows for integration of datasets. Understanding: Agreement on vocabulary allow to better define your data and allows for easy information exchange. There is no need of computationally expensive ETL processes. Reasoning is scalable because we reason at the query level. You do not need to have everything ready to use it! 58 / 61
  • 67. What I left outside this talk... Semantic Query Optimisation SWRL and Recursion Performance Evaluation Aggregates and bag semantics Give out about Database Engines Tons of theory etc. etc. etc... 59 / 61
  • 69. Extra: Where Reasoning takes Place If we pose the query asking for all the instances of the class Neoplasm: SELECT ?x WHERE { ?x rdf:type :Noeplasms). } 61 / 61
  • 70. Extra: Where Reasoning takes Place If we pose the query asking for all the instances of the class Neoplasm: SELECT ?x WHERE { ?x rdf:type :Noeplasms). } (Intuitively) -ontop- will translate it into: SELECT ?x WHERE { { ?x rdf:type :Neoplasms. } UNION { ?x rdf:type :BenignNeoplasms. } UNION { ?x rdf:type :MalignantNeoplasm. } UNION ... { ?x rdf:type :NSCLC). } UNION { ?x rdf:type :SCLC). } } 61 / 61
  • 71. Extra: Where Reasoning takes Place If we pose the query asking for all the instances of the class Neoplasm: SELECT ?x WHERE { ?x rdf:type :Noeplasms). } (Intuitively) -ontop- will translate it into: SELECT ?x WHERE { {?x rdf:type :Neoplasms.} UNION { ?x rdf:type :BenignNeoplasms. } UNION { ?x rdf:type :MalignantNeoplasm . } UNION ... { ?x rdf:type :NSCLC). } UNION { ?x rdf:type :SCLC). } } 61 / 61
  • 72. Extra: Where Reasoning takes Place If we pose the query asking for all the instances of the class Neoplasm: SELECT ?x WHERE { ?x rdf:type :Noeplasms). } (Intuitively) -ontop- will translate it into: SELECT Concat(:db1/neoplasm/, TBL.PATIENT.id) AS ?x FROM TBL.PATIENT 61 / 61