Ontology-Based Data
Access: Why It is So Cool!
Josef Hardi
josef.hardi@stanford.edu
September 4, 2015
Ontology-Based Data Access is a concept developed by Diego Calvanese and
Mariano Rodriguez-Muro at the KRDB Research Centre, Free University of
Bozen-Bolzano
Outline
● What is Ontology-based Data Access, or OBDA?
○ Motivation
○ System Black Box
○ Process Illustration
● Project -ontop- and Quest
● Experiment
○ Query Answering Performance
○ -ontop- vs Semantika
● Conclusion
● Q&A
Acknowledgement
Parts of the slides in this presentation are taken from
tutorial or lecture slides by:
Diego Calvanese,
Mariano Rodriguez-Muro, and
Martin Rezk
What is….
Ontology-based Data Access?
Think of a scenario
Data Layer
Data Service
conceptual view
Image source: (various sources)
What is Ontology-based Data Access?
Data Access Bottleneck
Image source: Rezk, Martin. Ontologies Ontop Databases http://www.slideshare.net/MartnRezk/slides-swat4-ls
What is Ontology-based Data Access?
Query Answering
tbl_patient+2015
PatientId  Name  Cell_type  cStage
1          Mary  true       7
2          John  false      6
3          Bill  false      4
Cancer type is:
● NSCLC is when Cell_type is
false,
● SCLC is when Cell_type is
true.
Cancer stage is:
● I, II, III, IIIa, IIIb, IV for
NSCLC, corr. cStage: 1 - 6,
● Limited and Extensive for
SCLC, corr. cStage: 7 and 8.
There is “hidden logic” inside
the table that is specifically
used by the application. Not
for querying the data!
Query Answering
tbl_patient+2015
PatientId  Name  Cell_type  cStage
1          Mary  true       7
2          John  false      6
3          Bill  false      4
RESULT
Name  cStage
John  6
Bill  4
select Name, cStage
from tbl_patient+2015
where Cell_type = false
and cStage >= 4;
Can we do it better?
Show me the name and stage of all patients
who have a large tumor of at least
stage IIIa.
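The query on this slide can be tried out with Python's built-in sqlite3 module. This is only a minimal sketch under a few assumptions: an in-memory SQLite database stands in for the hospital's RDBMS, the Boolean Cell_type column is stored as 0/1, and the table name is quoted because it contains a "+".

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "tbl_patient+2015" ('
    "PatientId INTEGER PRIMARY KEY, Name TEXT, Cell_type INTEGER, cStage INTEGER)"
)
# Rows copied from the slide; true/false are stored as 1/0.
conn.executemany(
    'INSERT INTO "tbl_patient+2015" VALUES (?, ?, ?, ?)',
    [(1, "Mary", 1, 7), (2, "John", 0, 6), (3, "Bill", 0, 4)],
)

# The "hidden logic" is hard-coded here: Cell_type = 0 means NSCLC,
# and cStage >= 4 means stage IIIa or later.
rows = conn.execute(
    'SELECT Name, cStage FROM "tbl_patient+2015" '
    "WHERE Cell_type = 0 AND cStage >= 4 ORDER BY PatientId"
).fetchall()
print(rows)
```

Whoever writes this query has to know what 0 and 4 mean. That knowledge lives in the application, not in the data.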
Query Answering
Bridge the semantics
tbl_patient+2015
PatientId  Name  Cell_type  cStage
1          Mary  true       7
2          John  false      6
3          Bill  false      4
Cancer type is:
● NSCLC is when Cell_type is
false,
● SCLC is when Cell_type is
true.
Cancer stage is:
● I, II, III, IIIa, IIIb, IV for
NSCLC,
● Limited and Extensive for
SCLC.
[Diagram: ontology fragment based on SNOMED-CT, with the relations name, hasStage, hasNeoplasm and ISA]
*SCLC = Small Cell Lung Cancer, NSCLC = Non-Small Cell Lung Cancer
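Bridging the semantics means making the hidden codes explicit. A small sketch of that decoding step is shown here. The function name and the two lookup tables are hypothetical, and the one-to-one order of the cStage codes is an assumption read off the slide ("1 - 6" for the NSCLC stages, "7 and 8" for the SCLC stages).

```python
# Assumed code tables: the slide gives the ranges, the exact order is a guess.
NSCLC_STAGES = {1: "I", 2: "II", 3: "III", 4: "IIIa", 5: "IIIb", 6: "IV"}
SCLC_STAGES = {7: "Limited", 8: "Extensive"}


def describe(cell_type, c_stage):
    """Translate the raw (Cell_type, cStage) pair into domain vocabulary."""
    if cell_type:  # Cell_type = true means SCLC
        return "SCLC", SCLC_STAGES[c_stage]
    return "NSCLC", NSCLC_STAGES[c_stage]


# Rows from tbl_patient+2015 on the slide.
patients = [("Mary", True, 7), ("John", False, 6), ("Bill", False, 4)]
decoded = {name: describe(cell, stage) for name, cell, stage in patients}
print(decoded)
```

In an OBDA system this translation is not written as application code. It is declared once in the mappings, so that queries can use the ontology terms directly.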
Query Answering
OBDA Answering
● (Data) Sources: represent the external and independent
resources, i.e. the existing assets of the organization.
● Ontology: provides a unified common vocabulary. The
conceptual view of the underlying data.
● Mappings: relate the terms in the ontology to a set of SQL
views.
Image source: Rezk, Martin. Ontologies Ontop Databases http://www.slideshare.net/MartnRezk/slides-swat4-ls
Query Answering
OBDA Answering Black Box
● Rewriting: Create a new query which is the expanded
version of the original query, using all the defined
inclusion assertions in the ontology.
● Unfolding: Substitute each part in the expanded query
with corresponding SQL views from the given mappings.
● Evaluation: Execute the complete SQL against the target RDBMS.
Image source: Kontchakov, Roman, et al. Ontology-based Data Access: Ontop of Databases. http://www.dcs.bbk.ac.uk/~roman/papers/ISWC13.pdf
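The first two steps can be sketched in a few lines of Python. This is a toy illustration, not how -ontop- or Quest implement them: the inclusion assertions, the mapping dictionary and the function names are all hypothetical, only class queries are handled, and the ontology is assumed to be free of cycles.

```python
# Hypothetical inclusion assertions: (subclass, superclass).
INCLUSIONS = [("Nurse", "Person"), ("Doctor", "Person"), ("Patient", "Person")]

# Hypothetical mappings from ontology classes to SQL views.
MAPPINGS = {
    "Nurse": "select NurseId from tbl_nurse",
    "Doctor": "select doc_id from tbl_doctor",
    "Patient": "select pid from tbl_patient",
}


def rewrite(cls):
    """Rewriting: expand a class into itself plus all classes it subsumes."""
    result = [cls]
    for sub, sup in INCLUSIONS:
        if sup == cls:
            result.extend(c for c in rewrite(sub) if c not in result)
    return result


def unfold(classes):
    """Unfolding: replace each class by its SQL view; unmapped classes drop out."""
    views = [MAPPINGS[c] for c in classes if c in MAPPINGS]
    return "\nUNION\n".join(views)


expanded = rewrite("Person")
sql = unfold(expanded)
print(expanded)
print(sql)
```

Person itself has no mapping in this sketch, so it contributes nothing to the final SQL. The third step, evaluation, simply hands the resulting string to the database.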
Query Answering
OBDA Answering Illustration
Q: Show me all the Person in the hospital?
Q’: Show me
all the Person UNION
all the Nurse UNION
all the Doctor UNION
all the Patient UNION
anyone who has
Neoplasm in the hospital?
Rewritten
Look up where the source(s) are
(No source)
Q’: Show me
all the Person UNION
all the Nurse UNION
all the Doctor UNION
all the Patient UNION
anyone who has
Neoplasm
in the hospital?
Get the list from table Nurse
Get the list from table Doctor
Get the list from table Patient
Get the list from table Cancer Patient 2015
OBDA Answering Illustration
Substitute with SQL views
Q’: Show me
all the Person UNION
select NurseId from tbl_nurse UNION
select doc_id from tbl_doctor UNION
select pid from tbl_patient UNION
select PatientId from tbl_patient+2015
in the hospital?
OBDA Answering Illustration
Unfolded
Execute the SQL
select NurseId from tbl_nurse
UNION
select doc_id from tbl_doctor
UNION
select pid from tbl_patient
UNION
select PatientId from tbl_patient+2015
OBDA Answering Illustration
Evaluated
42!
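The evaluation step can be reproduced with sqlite3. The sketch below runs the unfolded SQL from the previous slide against an in-memory database. The table contents are made up for illustration, and the text identifiers (N1, D1, P1, ...) are an assumption chosen so that ids from different tables do not collide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; only the table and column names come from the slides.
conn.executescript("""
CREATE TABLE tbl_nurse (NurseId TEXT);
CREATE TABLE tbl_doctor (doc_id TEXT);
CREATE TABLE tbl_patient (pid TEXT);
CREATE TABLE "tbl_patient+2015" (PatientId TEXT);
INSERT INTO tbl_nurse VALUES ('N1'), ('N2');
INSERT INTO tbl_doctor VALUES ('D1');
INSERT INTO tbl_patient VALUES ('P1'), ('P2');
INSERT INTO "tbl_patient+2015" VALUES ('P2'), ('P3');
""")

UNFOLDED_SQL = """
SELECT NurseId FROM tbl_nurse
UNION
SELECT doc_id FROM tbl_doctor
UNION
SELECT pid FROM tbl_patient
UNION
SELECT PatientId FROM "tbl_patient+2015"
"""

# UNION removes duplicates, so patient P2 is reported only once.
persons = sorted(row[0] for row in conn.execute(UNFOLDED_SQL))
print(persons)
```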
(Computational) Price to Pay
Query answering in OBDA setting:
● PTIME in the size of the ontology (efficiently
tractable)
● AC0 in the size of the data (very efficiently
tractable)
● NP-complete in the size of the query
(exponential)
*Tractable problem: there exists an algorithm that terminates in a
reasonable amount of time and returns the result.
OBDA Answering Illustration
-ontop- Project
● A platform to query relational databases using
SPARQL language,
● The implementation started in 2010,
● Supports several database systems, like: MySQL,
PostgreSQL, H2, SQL Server, Oracle, IBM DB2.
● Distributed under open-source license.
● It is currently being developed within the context of
EU Optique project.
● Fantastic add-ons: Efficient rewriting, Query
optimization, Transitive query, Rules entailment,
Cross-linked datasets.
-ontop-
-ontop- for Protege
http://ontop.inf.unibz.it/
-ontop-
Experiment
Semantika Project
http://obidea.com/semantika/
Experiment
Berlin SPARQL Benchmark (BSBM)
● A benchmark suite built around e-commerce
domain.
○ A set of products is offered by different vendors and
customers are posting product reviews.
● Consists of 12 different queries, emulating
the search and navigation pattern of a
consumer looking for a product.
● A Query-Mix consists of 25 querying actions
that simulate a product search scenario.
● No inference.
Experiment
BSBM-100
● Dataset of 100 million triples,
● Transformed into relational db schema:
offer > 5.7 million rows
person > 147 thousand rows
producer > 5 thousand rows
product > 288 thousand rows
productfeature > 47 thousand rows
productfeatureproduct > 5.5 million rows
producttype > 2 thousand rows
producttypeproduct > 1.4 million rows
review > 2.8 million rows
vendor > 2 thousand rows
Experiment
Test Databases
● MySQL - v5.6
○ Vanilla
○ Optimized
■ CREATE INDEX
■ OPTIMIZE TABLE - ANALYZE
● PostgreSQL - v9.4.4
○ Vanilla
○ Optimized
■ CREATE INDEX
■ VACUUM TABLE - ANALYZE
Experiment
Test Machine
● MacBook Pro
○ OS X Yosemite 64-bit
○ Java 8 (build 1.8.0_51-b16)
○ Intel Core i7 3 GHz
○ Memory 16 GB
○ Flash storage
○ Direct connection - no network cost
Experiment
Benchmark Flow
for each obda-endpoint do:
    for each dbms do:
        for each dbms-variant do:
            start endpoint;
            start dbms;
            loop 2:
                run 'benchmark -runs 100 -w 10';
            stop dbms;
            stop endpoint;
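The flow above can be written out as a small Python sketch that only builds the list of steps. It does not start or stop anything: how the endpoints and databases were actually launched is not stated in the slides. The lists of endpoints, databases and variants are taken from the earlier slides, and the function name is hypothetical.

```python
from itertools import product

ENDPOINTS = ["-ontop-", "Semantika"]
DBMS = ["MySQL", "PostgreSQL"]
VARIANTS = ["vanilla", "optimized"]
LOOPS = 2  # "loop 2" in the flow above
COMMAND = "benchmark -runs 100 -w 10"


def plan():
    """Return the benchmark steps in execution order, as plain strings."""
    steps = []
    for endpoint, dbms, variant in product(ENDPOINTS, DBMS, VARIANTS):
        steps.append(f"start endpoint {endpoint}")
        steps.append(f"start dbms {dbms} ({variant})")
        steps.extend(f"run '{COMMAND}' (trial {i})" for i in range(1, LOOPS + 1))
        steps.append(f"stop dbms {dbms} ({variant})")
        steps.append(f"stop endpoint {endpoint}")
    return steps


steps = plan()
runs = [s for s in steps if s.startswith("run")]
print(len(steps), len(runs))
```

Two endpoints, two databases and two variants give eight configurations, each benchmarked twice.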
Experiment
Benchmark Result
Experiment
Conclusion
● OBDA offers a non-invasive way to give
existing (legacy) database systems a
better data access service.
● A lot of interesting topics can be harvested
from OBDA use case scenarios.
○ Health and clinical domain perhaps?
● OBDA performance relies heavily on the
efficiency of the underlying data
infrastructure (both HW and SW).
Thanks! Any Questions?
Appendix:
Query Answering and
Query Rewriting
Query Answering over Database
Image source: Calvanese, Diego. Ontology-Based Data Access and Integration. https://www.essi.upc.edu/docs/slides-obda-2010-02-08
An example
Query Answering over Ontology
Image source: Calvanese, Diego. Ontology-Based Data Access and Integration. https://www.essi.upc.edu/docs/slides-obda-2010-02-08
An example
Query Answering via Rewriting
Image source: Calvanese, Diego. Ontology-Based Data Access and Integration. https://www.essi.upc.edu/docs/slides-obda-2010-02-08
Query Rewriting
Appendix:
-ontop- Add-ons
-ontop- Black Box
Image source: Kontchakov, Roman, et al. Ontology-based Data Access: Ontop of Databases. http://www.dcs.bbk.ac.uk/~roman/papers/ISWC13.pdf
● Tree witness rewriting technique
● T-mapping optimization
● Semantic Query Optimization (SQO)
Rule Entailment
Image source: Xiao, Guohui, et al. Rules and Ontology-based Data Access. https://www.inf.unibz.it/~calvanese/papers/xiao-rezk-rodr-calv-RR-2014.pdf
● SWRL Rules to relational algebra, expressed in SQL’99
Common Table Expressions (CTEs)
● T-Mapping extension
Appendix:
Detailed Benchmark
Report
Query Mixes per Hour
                      -ontop-  Semantika  Native
MySQL                     807        831     436
MySQL optimized         1,471      1,630   2,371
PostgreSQL              2,198      2,286     418
PostgreSQL optimized    7,576      9,204  15,500
Queries per Second - MySQL
Vanilla
            Q1   Q2  Q3  Q4  Q5  Q7   Q8  Q9  Q10  Q11  Q12
-ontop-      1   95   1  --   1  88  100  --   75   --   --
Semantika    1  101   1  --   1  77  112  --   95   --   --
Optimized
            Q1   Q2  Q3  Q4  Q5  Q7   Q8  Q9  Q10  Q11  Q12
-ontop-     30   73  26  --   1  48   63  --   49   --   --
Semantika   58   99  46  --   1  95  108  --  102   --   --
Queries per Second - PostgreSQL
Vanilla
            Q1   Q2  Q3  Q4  Q5  Q7   Q8  Q9  Q10  Q11  Q12
-ontop-      4   89   4  --   2  73   77  --  100   --   --
Semantika    4   90   4  --   2  96  110  --  123   --   --
Optimized
            Q1   Q2  Q3  Q4  Q5  Q7   Q8  Q9  Q10  Q11  Q12
-ontop-     75   77  79  --   9  47   60  --   76   --   --
Semantika   88   81  82  --   9  94  110  --  119   --   --
Semantika does cache better
                      -ontop-                    Semantika
                      Trial 1  Trial 2  Delta%   Trial 1  Trial 2  Delta%
MySQL                     790      807     +2%       638      831    +30%
MySQL optimized          1424     1471     +3%       983     1630    +66%
PostgreSQL               1803     2198    +22%      1254     2286    +82%
PostgreSQL optimized     5678     7576    +33%      2028     9204   +354%
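The Delta% column is the relative change from Trial 1 to Trial 2, rounded to a whole percent. A short sketch of that arithmetic follows, using the figures from the table. The function and dictionary names are hypothetical.

```python
def delta_percent(trial1, trial2):
    """Relative change from trial 1 to trial 2, in whole percent."""
    return round((trial2 - trial1) / trial1 * 100)


# (Trial 1, Trial 2) pairs copied from the table above.
ONTOP = {
    "MySQL": (790, 807),
    "MySQL optimized": (1424, 1471),
    "PostgreSQL": (1803, 2198),
    "PostgreSQL optimized": (5678, 7576),
}
SEMANTIKA = {
    "MySQL": (638, 831),
    "MySQL optimized": (983, 1630),
    "PostgreSQL": (1254, 2286),
    "PostgreSQL optimized": (2028, 9204),
}

ontop_deltas = {db: delta_percent(*trials) for db, trials in ONTOP.items()}
semantika_deltas = {db: delta_percent(*trials) for db, trials in SEMANTIKA.items()}
print(ontop_deltas)
print(semantika_deltas)
```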
Ontop could answer ALL queries
            Q1  Q2  Q3   Q4  Q5  Q7   Q8  Q9  Q10  Q11  Q12
-ontop-     83  80  78  112   9  75   78  83  105   91   83
Semantika   88  81  82   --   9  94  110  --  119   --   --
-ontop- supports almost all features in SPARQL 1.1
Appendix:
Comparison: Mapping
Syntax
-ontop- Mappings
mappingId Reviewer
target <"&bsbm-inst;dataFromRatingSite{$publisher}/Reviewer{$nr}"> a foaf:Person;
foaf:name $name; foaf:mbox_sha1sum $mbox_sha1sum; bsbm:country <"&iso3166;{$country}">;
dc:publisher <"&bsbm-inst;dataFromRatingSite{$publisher}/RatingSite{$publisher}">;
dc:date $publishDate .
source select nr, name, mbox_sha1sum, country, publisher, publishDate from person
mappingId Producer
target <"&bsbm-inst;dataFromProducer{$nr}/Producer{$nr}"> a bsbm:Producer;
rdfs:label $label; rdfs:comment $comment; foaf:homepage $homepage;
bsbm:country <"&iso3166;{$country}">;
dc:publisher <"&bsbm-inst;dataFromProducer{$nr}/Producer{$nr}">; dc:date $publishDate .
source select nr, label, comment, homepage, country, publisher, publishDate from producer
● Uses Turtle syntax.
● Specification: https://babbage.inf.unibz.it/trac/obdapublic/wiki/ObdalibObdaTurtlesyntax
● Supports R2RML syntax
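The target part of a mapping is a template: the {$column} placeholders are filled from each row returned by the source query. A rough sketch of that substitution is shown here. It is not the -ontop- implementation; the function name is hypothetical, and the value of the bsbm-inst prefix is inferred from the generated SQL shown later in this appendix.

```python
import re

# Assumption: prefix value inferred from the IRIs in the generated SQL.
PREFIXES = {
    "bsbm-inst": "http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/",
}


def instantiate(template, row):
    """Expand &prefix; references, then fill {$column} placeholders from a row."""
    expanded = re.sub(r"&([\w-]+);", lambda m: PREFIXES[m.group(1)], template)
    return re.sub(r"\{\$(\w+)\}", lambda m: str(row[m.group(1)]), expanded)


# One row of the producer table, reduced to the column the template needs.
uri = instantiate("&bsbm-inst;dataFromProducer{$nr}/Producer{$nr}", {"nr": 1245})
print(uri)
```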
Semantika Mappings
<mapping tml:id="Reviewer">
<logical-table rr:tableName="person"/>
<subject-map rr:class="foaf:Person" rr:template="Reviewer(publisher,nr)"/>
<predicate-object-map rr:predicate="foaf:name" rr:column="name"/>
<predicate-object-map rr:predicate="foaf:mbox_sha1sum" rr:column="mbox_sha1sum"/>
<predicate-object-map rr:predicate="bsbm:country" rr:template="Country(country)"/>
<predicate-object-map rr:predicate="dc:publisher" rr:template="ReviewerPublisher(publisher,publisher)"/>
<predicate-object-map rr:predicate="dc:date" rr:column="publishDate"/>
</mapping>
<mapping tml:id="Producer">
<logical-table rr:tableName="producer"/>
<subject-map rr:class="bsbm:Producer" rr:template="Producer(nr,nr)"/>
<predicate-object-map rr:predicate="rdfs:label" rr:column="label"/>
<predicate-object-map rr:predicate="rdfs:comment" rr:column="comment"/>
<predicate-object-map rr:predicate="foaf:homepage" rr:column="homepage"/>
<predicate-object-map rr:predicate="bsbm:country" rr:template="Country(country)"/>
<predicate-object-map rr:predicate="dc:publisher" rr:template="ProducerPublisher(nr,nr)"/>
<predicate-object-map rr:predicate="dc:date" rr:column="publishDate"/>
</mapping>
● Uses XML format.
● Specification: https://github.com/obidea/semantika/wiki/2.-Basic-RDB-RDF-Mapping
● Supports R2RML syntax
Appendix:
Comparison: SQL
Creation
Simple SPARQL Query
SELECT ?title ?publishDate
WHERE
{ ?review bsbm:reviewFor bsbm:Producer1245/Product62033> .
?review dc:title ?title .
?review dc:date ?publishDate .
}
Ontop SQL Creation
SELECT
3 AS `titleQuestType`, NULL AS `titleLang`, QVIEW1.`title` AS `title`,
10 AS `publishDateQuestType`, NULL AS `publishDateLang`, CAST
(QVIEW1.`publishDate` AS CHAR(8000) CHARACTER SET utf8) AS
`publishDate`
FROM review QVIEW1
WHERE
(QVIEW1.`product` = '62033') AND
(QVIEW1.`producer` = '1245') AND
QVIEW1.`publisher` IS NOT NULL AND
QVIEW1.`nr` IS NOT NULL AND
QVIEW1.`title` IS NOT NULL AND
QVIEW1.`publishDate` IS NOT NULL
Semantika SQL Creation
SELECT `OBDA_VIEW1`.`title` AS `title`,
`OBDA_VIEW1`.`publishDate` AS `publishDate`
FROM `bsbm100`.`review` AS `OBDA_VIEW1`
WHERE `OBDA_VIEW1`.`publisher` IS NOT NULL AND
`OBDA_VIEW1`.`product` = 62033 AND
`OBDA_VIEW1`.`publishDate` IS NOT NULL AND
`OBDA_VIEW1`.`nr` IS NOT NULL AND
`OBDA_VIEW1`.`title` IS NOT NULL AND
`OBDA_VIEW1`.`producer` = 1245
Let’s add something more...
SELECT ?review ?title ?publishDate ?rating1 ?rating2
WHERE
{ ?review bsbm:reviewFor bsbm:Producer1245/Product62033> .
?review dc:title ?title .
?review dc:date ?publishDate .
?review bsbm:rating1 ?rating1 .
OPTIONAL { ?review bsbm:rating2 ?rating2 . }
}
Ontop SQL Creation
SELECT
1 AS `reviewQuestType`, NULL AS `reviewLang`,
CONCAT('http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/dataFromRatingSite', REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE
(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(CAST(QVIEW1.`publisher` AS CHAR
(8000) CHARACTER SET utf8),' ', '%20'),'!', '%21'),'@', '%40'),'#', '%23'),'$', '%24'),'&', '%26'),'*', '%42'), '(', '%28'), ')', '%29'), '[', '%5B'), ']', '%5D'),
',', '%2C'), ';', '%3B'), ':', '%3A'), '?', '%3F'), '=', '%3D'), '+', '%2B'), '''', '%22'), '/', '%2F'), '/Review', REPLACE(REPLACE(REPLACE(REPLACE(REPLACE
(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE
(CAST(QVIEW1.`nr` AS CHAR(8000) CHARACTER SET utf8),' ', '%20'),'!', '%21'),'@', '%40'),'#', '%23'),'$', '%24'),'&', '%26'),'*', '%42'), '(', '%28'),
')', '%29'), '[', '%5B'), ']', '%5D'), ',', '%2C'), ';', '%3B'), ':', '%3A'), '?', '%3F'), '=', '%3D'), '+', '%2B'), '''', '%22'), '/', '%2F')) AS `review`,
3 AS `titleQuestType`, NULL AS `titleLang`, QVIEW1.`title` AS `title`,
10 AS `publishDateQuestType`, NULL AS `publishDateLang`, CAST(QVIEW1.`publishDate` AS CHAR(8000) CHARACTER SET utf8) AS
`publishDate`,
4 AS `rating1QuestType`, NULL AS `rating1Lang`, CAST(QVIEW1.`rating1` AS CHAR(8000) CHARACTER SET utf8) AS `rating1`,
4 AS `rating2QuestType`, NULL AS `rating2Lang`, CAST(QVIEW2.`rating2` AS CHAR(8000) CHARACTER SET utf8) AS `rating2`
FROM (
review QVIEW1
LEFT OUTER JOIN review QVIEW2
ON (QVIEW1.`nr` = QVIEW2.`nr`) AND
(QVIEW1.`publisher` = QVIEW2.`publisher`) AND
QVIEW2.`rating2` IS NOT NULL AND
QVIEW1.`publisher` IS NOT NULL AND
QVIEW1.`nr` IS NOT NULL
)
WHERE
QVIEW1.`title` IS NOT NULL AND
QVIEW1.`nr` IS NOT NULL AND
QVIEW1.`publishDate` IS NOT NULL AND
(QVIEW1.`product` = '62033') AND
QVIEW1.`publisher` IS NOT NULL AND
QVIEW1.`rating1` IS NOT NULL AND
(QVIEW1.`producer` = '1245')
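The long chain of REPLACE calls appears to percent-encode reserved characters in the publisher and nr values before they are concatenated into the review IRI. The same idea in Python, as a sketch only: the function name is hypothetical, and the standard-library quote function encodes a somewhat different set of characters than the SQL above.

```python
from urllib.parse import quote

BASE = "http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/dataFromRatingSite"


def review_iri(publisher, nr):
    """Build the review IRI, percent-encoding both column values."""
    # safe="" also encodes "/", which the SQL version rewrites to %2F.
    return f"{BASE}{quote(str(publisher), safe='')}/Review{quote(str(nr), safe='')}"


plain = review_iri(1, 62)
escaped = review_iri("a b", "x/y")
print(plain)
print(escaped)
```

Doing this inside the SQL keeps IRI construction in the database, at the cost of a much longer generated query.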
Semantika SQL Creation
SELECT CONCAT('http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/dataFromRatingSite{1}/Review{2}',' : ','"',
`OBDA_VIEW1`.`publisher`,'" "',`OBDA_VIEW1`.`nr`,'"') AS `review`,
`OBDA_VIEW1`.`title` AS `title`,
`OBDA_VIEW1`.`publishDate` AS `publishDate`,
`OBDA_VIEW1`.`rating1` AS `rating1`,
`OBDA_VIEW1`.`rating2` AS `rating2`
FROM `bsbm100_optimized`.`review` AS `OBDA_VIEW1`
WHERE `OBDA_VIEW1`.`publisher` IS NOT NULL AND
`OBDA_VIEW1`.`product` = 62033 AND
`OBDA_VIEW1`.`publishDate` IS NOT NULL AND
`OBDA_VIEW1`.`nr` IS NOT NULL AND
`OBDA_VIEW1`.`title` IS NOT NULL AND
`OBDA_VIEW1`.`rating1` IS NOT NULL AND
`OBDA_VIEW1`.`producer` = 1245

The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
ewymefz
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
ewymefz
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
ewymefz
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
vcaxypu
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
balafet
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
Opendatabay
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
enxupq
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Linda486226
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
ocavb
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
John Andrews
 

Recently uploaded (20)

Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
 

Ontology-based data access: why it is so cool!

  • 1. Ontology-Based Data Access: Why It is So Cool! Josef Hardi josef.hardi@stanford.edu September 4, 2015 Ontology-Based Data Access is a concept developed by Diego Calvanese and Mariano Rodriguez-Muro at the KRDB Research Centre of the Free University of Bozen-Bolzano
  • 2. Outline ● What is Ontology-based Data Access, or OBDA? ○ Motivation ○ System Black Box ○ Process Illustration ● Project -ontop- and Quest ● Experiment ○ Query Answering Performance ○ -ontop- vs Semantika ● Conclusion ● Q&A
  • 3. Acknowledgement Parts of the slides in this presentation are taken from tutorial or lecture slides by: Diego Calvanese, Mariano Rodriguez-Muro, and Martin Rezk
  • 5. Think of a scenario Data Layer Data Service conceptual view Image source: (various sources) What is Ontology-based Data Access?
  • 6. Data Access Bottleneck Image source: Rezk, Martin. Ontologies Ontop Databases http://www.slideshare.net/MartnRezk/slides-swat4-ls What is Ontology-based Data Access?
  • 7. Query Answering
    tbl_patient+2015
      PatientId  Name  Cell_type  cStage
      1          Mary  true       7
      2          John  false      6
      3          Bill  false      4
    Cancer type is:
    ● NSCLC is when Cell_type is false,
    ● SCLC is when Cell_type is true.
    Cancer stage is:
    ● I, II, III, IIIa, IIIb, IV for NSCLC, corr. cStage: 1 - 6,
    ● Limited and Extensive for SCLC, corr. cStage: 7 and 8.
    There is “hidden logic” inside the table that is specifically used by the application. Not for querying the data!
  • 8. Query Answering
    tbl_patient+2015
      PatientId  Name  Cell_type  cStage
      1          Mary  true       7
      2          John  false      6
      3          Bill  false      4
    select Name, cStage
    from tbl_patient+2015
    where Cell_type = false
    and cStage >= 4;
    RESULT
      Name  cStage
      John  6
      Bill  4
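The query on this slide can be tried out with a minimal sketch like the one below. It assumes an in-memory SQLite database as a stand-in (the slides do not say which RDBMS holds the table), stores Cell_type as 0/1, and quotes the table name because of the `+` character.

```python
import sqlite3

# In-memory stand-in for the slide's table; SQLite is an assumption made
# for this sketch, not something stated in the slides.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "tbl_patient+2015" ('
    "PatientId INTEGER PRIMARY KEY, Name TEXT, Cell_type INTEGER, cStage INTEGER)"
)
conn.executemany(
    'INSERT INTO "tbl_patient+2015" VALUES (?, ?, ?, ?)',
    [(1, "Mary", True, 7), (2, "John", False, 6), (3, "Bill", False, 4)],
)

# Cell_type = 0 encodes NSCLC; cStage >= 4 encodes "stage IIIa or later".
rows = conn.execute(
    'SELECT Name, cStage FROM "tbl_patient+2015" '
    "WHERE Cell_type = 0 AND cStage >= 4 ORDER BY PatientId"
).fetchall()
print(rows)  # [('John', 6), ('Bill', 4)]
```

The person writing the query has to know both encodings (false means NSCLC, 4 means IIIa), which is the "hidden logic" problem the slides go on to address.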
  • 9. Can we do it better? Show me the name and stage of every patient who has a large tumor at stage IIIa or later. Query Answering
  • 10. Bridge the semantics
    tbl_patient+2015
      PatientId  Name  Cell_type  cStage
      1          Mary  true       7
      2          John  false      6
      3          Bill  false      4
    Cancer type is:
    ● NSCLC is when Cell_type is false,
    ● SCLC is when Cell_type is true.
    Cancer stage is:
    ● I, II, III, IIIa, IIIb, IV for NSCLC,
    ● Limited and Extensive for SCLC.
    [Diagram: ontology fragment based on SNOMED-CT, with the relations hasStage, name and hasNeoplasm and several ISA links]
    *SCLC = Small Cell Lung Cancer, NSCLC = Non-Small Cell Lung Cancer
    Query Answering
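To make the hidden logic explicit, the sketch below decodes the two columns into named cancer types and stages. It assumes the stage names map to cStage values in the order they are listed (I = 1 up to IV = 6, Limited = 7, Extensive = 8); the function and dictionary names are illustrative only.

```python
# Assumed correspondence between cStage codes and stage names, read off
# the slide in listing order.
NSCLC_STAGES = {1: "I", 2: "II", 3: "III", 4: "IIIa", 5: "IIIb", 6: "IV"}
SCLC_STAGES = {7: "Limited", 8: "Extensive"}


def decode(cell_type: bool, c_stage: int) -> tuple[str, str]:
    """Translate the application's encoding into a named cancer type and stage."""
    if cell_type:
        return "SCLC", SCLC_STAGES[c_stage]
    return "NSCLC", NSCLC_STAGES[c_stage]


patients = [(1, "Mary", True, 7), (2, "John", False, 6), (3, "Bill", False, 4)]
decoded = {name: decode(ct, cs) for _, name, ct, cs in patients}
print(decoded)
```

In an OBDA setting this translation would live in the mappings rather than in application code, so that queries can be phrased in the ontology's vocabulary.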
  • 11. OBDA Answering ● (Data) Sources: represents the external and independent resources. Existing organization assets. ● Ontology: provides a unified common vocabulary. The conceptual view of the underlying data ● Mappings: relates the terms in ontology to a set of SQL views. Image source: Rezk, Martin. Ontologies Ontop Databases http://www.slideshare.net/MartnRezk/slides-swat4-ls Query Answering
  • 12. OBDA Answering Black Box ● Rewriting: Create a new query which is the expanded version of the original query, using all the defined inclusion assertions in the ontology. ● Unfolding: Substitute each part in the expanded query with the corresponding SQL views from the given mappings. ● Evaluation: Execute the complete SQL against the target RDBMS. Image source: Kontchakov, Roman, et al. Ontology-based Data Access: Ontop of Databases. http://www.dcs.bbk.ac.uk/~roman/papers/ISWC13.pdf Query Answering
  • 13. OBDA Answering Illustration Q: Show me all the Person in the hospital? Q’: Show me all the Person UNION all the Nurse UNION all the Doctor UNION all the Patient UNION anyone who has Neoplasm in the hospital? Rewritten
  • 14. Look up where the sources are
    Q’: Show me all the Person UNION all the Nurse UNION all the Doctor UNION all the Patient UNION anyone who has Neoplasm in the hospital?
    Mappings (M) point each part of the query to its source:
    ● Person: (no source)
    ● Nurse: get the list from table Nurse
    ● Doctor: get the list from table Doctor
    ● Patient: get the list from table Patient
    ● Anyone who has Neoplasm: get the list from table Cancer Patient 2015
    OBDA Answering Illustration
  • 15. Substitute with SQL views Q’: Show me all the Person UNION select NurseId from tbl_nurse UNION select doc_id from tbl_doctor UNION select pid from tbl_patient UNION select PatientId from tbl_patient+2015 in the hospital? OBDA Answering Illustration Unfolded
  • 16. Execute the SQL select NurseId from tbl_nurse UNION select doc_id from tbl_doctor UNION select pid from tbl_patient UNION select PatientId from tbl_patient+2015 OBDA Answering Illustration Evaluated
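The three steps illustrated above (rewriting, unfolding, evaluation) can be sketched in a few lines. This is a toy illustration under stated assumptions, not how -ontop- or Semantika are implemented: the ontology is reduced to a flat list of terms subsumed by Person, the mappings are the four SQL views from the slides, SQLite stands in for the RDBMS, and the row values are made up.

```python
import sqlite3

# Ontology: terms whose instances are also Persons (taken from the slides;
# "hasNeoplasm" stands for "anyone who has Neoplasm").
SUBSUMED_BY_PERSON = ["Nurse", "Doctor", "Patient", "hasNeoplasm"]

# Mappings: ontology term -> SQL view. Person itself has no source.
MAPPINGS = {
    "Nurse": "SELECT NurseId FROM tbl_nurse",
    "Doctor": "SELECT doc_id FROM tbl_doctor",
    "Patient": "SELECT pid FROM tbl_patient",
    "hasNeoplasm": 'SELECT PatientId FROM "tbl_patient+2015"',
}


def rewrite(term):
    """Rewriting: expand the query term with everything the ontology subsumes."""
    return [term] + (SUBSUMED_BY_PERSON if term == "Person" else [])


def unfold(terms):
    """Unfolding: replace each mapped term by its SQL view; skip unmapped terms."""
    return " UNION ".join(MAPPINGS[t] for t in terms if t in MAPPINGS)


# Evaluation: run the unfolded SQL on hypothetical data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_nurse (NurseId INTEGER);
    CREATE TABLE tbl_doctor (doc_id INTEGER);
    CREATE TABLE tbl_patient (pid INTEGER);
    CREATE TABLE "tbl_patient+2015" (PatientId INTEGER);
    INSERT INTO tbl_nurse VALUES (101);
    INSERT INTO tbl_doctor VALUES (201);
    INSERT INTO tbl_patient VALUES (1), (2), (3), (4);
    INSERT INTO "tbl_patient+2015" VALUES (1), (2), (3);
""")

sql = unfold(rewrite("Person"))
ids = sorted(row[0] for row in conn.execute(sql))
print(sql)
print(ids)  # [1, 2, 3, 4, 101, 201]
```

Person contributes nothing to the final SQL because it has no mapping, and UNION removes the patients that appear in both patient tables.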
  • 17. 42! (Computational) Price to Pay Query answering in OBDA setting: ● PTIME in the size of ontology (efficiently tractable) ● AC0 in the size of the data (very efficiently tractable) ● NP-Complete in the size of query (exponential) *Tractable problem: there exists an algorithm that will eventually terminate in a reasonable amount of time and return you the result. OBDA Answering Illustration
  • 19. -ontop- Project ● A platform to query relational databases using SPARQL language, ● The implementation started in 2010, ● Supports several database systems, like: MySQL, PostgreSQL, H2, SQL Server, Oracle, IBM DB2. ● Distributed under open-source license. ● It is currently being developed within the context of EU Optique project. ● Fantastic add-ons: Efficient rewriting, Query optimization, Transitive query, Rules entailment, Cross-linked datasets. -ontop-
  • 23. Berlin SPARQL Benchmark (BSBM) ● A benchmark suite built around e-commerce domain. ○ A set of products is offered by different vendors and customers are posting product reviews. ● Consists of 12 different queries, emulating the search and navigation pattern of a consumer looking for a product. ● A Query-Mix consists of 25 querying actions that simulate a product search scenario. ● No inference. Experiment
  • 24. BSBM-100
    ● Dataset of 100 million triples,
    ● Transformed into relational db schema:
      offer                  > 5.7 million rows
      person                 > 147 thousand rows
      producer               > 5 thousand rows
      product                > 288 thousand rows
      productfeature         > 47 thousand rows
      productfeatureproduct  > 5.5 million rows
      producttype            > 2 thousand rows
      producttypeproduct     > 1.4 million rows
      review                 > 2.8 million rows
      vendor                 > 2 thousand rows
    Experiment
  • 25. Test Databases ● MySQL - v5.6 ○ Vanilla ○ Optimized ■ CREATE INDEX ■ OPTIMIZE TABLE - ANALYZE ● PostgreSQL - v9.4.4 ○ Vanilla ○ Optimized ■ CREATE INDEX ■ VACUUM TABLE - ANALYZE Experiment
  • 26. Test Machine ● MacBook Pro ○ OS X Yosemite 64-bit ○ Java 8 (build 1.8.0_51-b16) ○ Intel Core i7 3 GHz ○ Memory 16 GB ○ Flash storage ○ Direct connection - no network cost Experiment
  • 27. Benchmark Flow
    for each obda-endpoint do:
      for each dbms do:
        for each dbms-variant do:
          start endpoint;
          start dbms;
          loop 2:
            run ‘benchmark -runs 100 -w 10’;
          stop dbms;
          stop endpoint;
    Experiment
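The benchmark flow can be written out as a dry-run plan, which makes the number of benchmark invocations easy to check. This sketch only generates step descriptions and does not start any process. The endpoint, DBMS and variant names are taken from the other slides, and only the `benchmark -runs 100 -w 10` command comes from this one.

```python
from itertools import product

ENDPOINTS = ["ontop", "semantika"]   # OBDA endpoints compared in the slides
DBMS = ["mysql", "postgresql"]
VARIANTS = ["vanilla", "optimized"]
TRIALS = 2                           # "loop 2" in the slide


def benchmark_plan():
    """Yield the steps of the benchmark flow in execution order (dry run only)."""
    for endpoint, dbms, variant in product(ENDPOINTS, DBMS, VARIANTS):
        yield f"start endpoint {endpoint}"
        yield f"start dbms {dbms} ({variant})"
        for _ in range(TRIALS):
            yield "run benchmark -runs 100 -w 10"
        yield f"stop dbms {dbms} ({variant})"
        yield f"stop endpoint {endpoint}"


plan = list(benchmark_plan())
runs = [step for step in plan if step.startswith("run ")]
print(len(plan), len(runs))  # 48 16
```

With two endpoints, two database systems, two variants and two trials each, the benchmark driver is invoked 16 times.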
  • 29. Conclusion ● OBDA offers a non-invasive solution to existing (legacy) database system for better data access service. ● A lot of interesting topics can be harvested from OBDA use case scenarios. ○ Health and clinical domain perhaps? ● OBDA performance relies heavily on the efficiency of the underlying data infrastructure (both HW and SW).
  • 32. Query Answering over Database Image source: Calvanese, Diego. Ontology-Based Data Access and Integration. https://www.essi.upc.edu/docs/slides-obda-2010-02-08
  • 34. Query Answering over Ontology Image source: Calvanese, Diego. Ontology-Based Data Access and Integration. https://www.essi.upc.edu/docs/slides-obda-2010-02-08
  • 36. Query Answering via Rewriting Image source: Calvanese, Diego. Ontology-Based Data Access and Integration. https://www.essi.upc.edu/docs/slides-obda-2010-02-08
  • 39. -ontop- Black Box Image source: Kontchakov, Roman, et al. Ontology-based Data Access: Ontop of Databases. http://www.dcs.bbk.ac.uk/~roman/papers/ISWC13.pdf ● Tree witness rewriting technique ● T-mapping optimization ● Semantic Query Optimization (SQO)
  • 40. Rule Entailment Image source: Xiao, Guohui, et al. Rules and Ontology-based Data Access. https://www.inf.unibz.it/~calvanese/papers/xiao-rezk-rodr-calv-RR-2014.pdf ● SWRL Rules to relational algebra, expressed in SQL’99 Common Table Expressions (CTEs) ● T-Mapping extension
  • 42. Query Mixes per Hour
                            -ontop-  Semantika  Native
      MySQL                 807      831        436
      MySQL optimized       1,471    1,630      2,371
      PostgreSQL            2,198    2,286      418
      PostgreSQL optimized  7,576    9,204      15,500
  • 43. Query per Second - MySQL
    Vanilla
                 Q1   Q2   Q3   Q4   Q5   Q7   Q8   Q9   Q10  Q11  Q12
      -ontop-    1    95   1    --   1    88   100  --   75   --   --
      Semantika  1    101  1    --   1    77   112  --   95   --   --
    Optimized
                 Q1   Q2   Q3   Q4   Q5   Q7   Q8   Q9   Q10  Q11  Q12
      -ontop-    30   73   26   --   1    48   63   --   49   --   --
      Semantika  58   99   46   --   1    95   108  --   102  --   --
  • 44. Query per Second - PostgreSQL
    Vanilla
                 Q1   Q2   Q3   Q4   Q5   Q7   Q8   Q9   Q10  Q11  Q12
      -ontop-    4    89   4    --   2    73   77   --   100  --   --
      Semantika  4    90   4    --   2    96   110  --   123  --   --
    Optimized
                 Q1   Q2   Q3   Q4   Q5   Q7   Q8   Q9   Q10  Q11  Q12
      -ontop-    75   77   79   --   9    47   60   --   76   --   --
      Semantika  88   81   82   --   9    94   110  --   119  --   --
  • 45. Semantika does cache better
                            -ontop-                    Semantika
                            Trial 1  Trial 2  Delta%   Trial 1  Trial 2  Delta%
      MySQL                 790      807      +2%      638      831      +30%
      MySQL optimized       1424     1471     +3%      983      1630     +66%
      PostgreSQL            1803     2198     +22%     1254     2286     +82%
      PostgreSQL optimized  5678     7576     +33%     2028     9204     +354%
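The Delta% column is consistent with the relative change from Trial 1 to Trial 2, rounded to the nearest percent. The short check below recomputes it from the trial figures on the slide. The formula is inferred from the numbers, since the slides do not state it.

```python
# (endpoint, database) -> (trial 1, trial 2), copied from the slide.
TRIAL_RESULTS = {
    ("ontop", "MySQL"): (790, 807),
    ("ontop", "MySQL optimized"): (1424, 1471),
    ("ontop", "PostgreSQL"): (1803, 2198),
    ("ontop", "PostgreSQL optimized"): (5678, 7576),
    ("semantika", "MySQL"): (638, 831),
    ("semantika", "MySQL optimized"): (983, 1630),
    ("semantika", "PostgreSQL"): (1254, 2286),
    ("semantika", "PostgreSQL optimized"): (2028, 9204),
}


def delta_percent(trial1: int, trial2: int) -> int:
    """Relative change from trial 1 to trial 2, in whole percent."""
    return round((trial2 - trial1) / trial1 * 100)


deltas = {key: delta_percent(*pair) for key, pair in TRIAL_RESULTS.items()}
for (endpoint, db), delta in deltas.items():
    print(f"{endpoint:10s} {db:22s} {delta:+d}%")
```

The recomputed values match the slide in every row, from +2% for -ontop- on MySQL to +354% for Semantika on optimized PostgreSQL.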
  • 46. Ontop could answer ALL queries
                 Q1   Q2   Q3   Q4   Q5   Q7   Q8   Q9   Q10  Q11  Q12
      -ontop-    83   80   78   112  9    75   78   83   105  91   83
      Semantika  88   81   82   --   9    94   110  --   119  --   --
    -ontop- supports almost all features in SPARQL 1.1
  • 48. -ontop- Mappings
    mappingId  Reviewer
    target     <"&bsbm-inst;dataFromRatingSite{$publisher}/Reviewer{$nr}"> a foaf:Person;
                 foaf:name $name;
                 foaf:mbox_sha1sum $mbox_sha1sum;
                 bsbm:country <"&iso3166;{$country}">;
                 dc:publisher <"&bsbm-inst;dataFromRatingSite{$publisher}/RatingSite{$publisher}">;
                 dc:date $publishDate .
    source     select nr, name, mbox_sha1sum, country, publisher, publishDate from person
    mappingId  Producer
    target     <"&bsbm-inst;dataFromProducer{$nr}/Producer{$nr}"> a bsbm:Producer;
                 rdfs:label $label;
                 rdfs:comment $comment;
                 foaf:homepage $homepage;
                 bsbm:country <"&iso3166;{$country}">;
                 dc:publisher <"&bsbm-inst;dataFromProducer{$nr}/Producer{$nr}">;
                 dc:date $publishDate .
    source     select nr, label, comment, homepage, country, publisher, publishDate from producer
    ● Uses Turtle syntax.
    ● Specification: https://babbage.inf.unibz.it/trac/obdapublic/wiki/ObdalibObdaTurtlesyntax
    ● Supports R2RML syntax
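A mapping target is essentially an IRI template whose `{$column}` placeholders are filled from each row returned by the source query. The sketch below shows that substitution for the Reviewer template. It assumes that `&bsbm-inst;` expands to the BSBM instance namespace that appears in the generated SQL on the later slides. The `expand` helper and the sample row are hypothetical and are not part of -ontop-.

```python
import re
from urllib.parse import quote

# Assumed expansion of the &bsbm-inst; entity.
BSBM_INST = "http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/"


def expand(template: str, row: dict) -> str:
    """Fill each {$column} placeholder with the percent-encoded column value."""
    return re.sub(
        r"\{\$(\w+)\}",
        lambda match: quote(str(row[match.group(1)]), safe=""),
        template,
    )


row = {"nr": 7, "publisher": 3, "name": "Ann"}  # hypothetical row of table person
iri = expand(BSBM_INST + "dataFromRatingSite{$publisher}/Reviewer{$nr}", row)
print(iri)
```

Percent-encoding the column values keeps the resulting IRI valid when a value contains spaces or reserved characters, which is the job of the long REPLACE chains in the SQL that -ontop- generates.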
  • 49. Semantika Mappings
    <mapping tml:id="Reviewer">
      <logical-table rr:tableName="person"/>
      <subject-map rr:class="foaf:Person" rr:template="Reviewer(publisher,nr)"/>
      <predicate-object-map rr:predicate="foaf:name" rr:column="name"/>
      <predicate-object-map rr:predicate="foaf:mbox_sha1sum" rr:column="mbox_sha1sum"/>
      <predicate-object-map rr:predicate="bsbm:country" rr:template="Country(country)"/>
      <predicate-object-map rr:predicate="dc:publisher" rr:template="ReviewerPublisher(publisher,publisher)"/>
      <predicate-object-map rr:predicate="dc:date" rr:column="publishDate"/>
    </mapping>
    <mapping tml:id="Producer">
      <logical-table rr:tableName="producer"/>
      <subject-map rr:class="bsbm:Producer" rr:template="Producer(nr,nr)"/>
      <predicate-object-map rr:predicate="rdfs:label" rr:column="label"/>
      <predicate-object-map rr:predicate="rdfs:comment" rr:column="comment"/>
      <predicate-object-map rr:predicate="foaf:homepage" rr:column="homepage"/>
      <predicate-object-map rr:predicate="bsbm:country" rr:template="Country(country)"/>
      <predicate-object-map rr:predicate="dc:publisher" rr:template="ProducerPublisher(nr,nr)"/>
      <predicate-object-map rr:predicate="dc:date" rr:column="publishDate"/>
    </mapping>
    ● Uses XML format.
    ● Specification: https://github.com/obidea/semantika/wiki/2.-Basic-RDB-RDF-Mapping
    ● Supports R2RML syntax
  • 51. Simple SPARQL Query
    SELECT ?title ?publishDate
    WHERE {
      ?review bsbm:reviewFor bsbm:Producer1245/Product62033> .
      ?review dc:title ?title .
      ?review dc:date ?publishDate .
    }
  • 52. Ontop SQL Creation
    SELECT
      3 AS `titleQuestType`, NULL AS `titleLang`, QVIEW1.`title` AS `title`,
      10 AS `publishDateQuestType`, NULL AS `publishDateLang`,
      CAST(QVIEW1.`publishDate` AS CHAR(8000) CHARACTER SET utf8) AS `publishDate`
    FROM review QVIEW1
    WHERE (QVIEW1.`product` = '62033')
      AND (QVIEW1.`producer` = '1245')
      AND QVIEW1.`publisher` IS NOT NULL
      AND QVIEW1.`nr` IS NOT NULL
      AND QVIEW1.`title` IS NOT NULL
      AND QVIEW1.`publishDate` IS NOT NULL
  • 53. Semantika SQL Creation
    SELECT
      `OBDA_VIEW1`.`title` AS `title`,
      `OBDA_VIEW1`.`publishDate` AS `publishDate`
    FROM `bsbm100`.`review` AS `OBDA_VIEW1`
    WHERE `OBDA_VIEW1`.`publisher` IS NOT NULL
      AND `OBDA_VIEW1`.`product` = 62033
      AND `OBDA_VIEW1`.`publishDate` IS NOT NULL
      AND `OBDA_VIEW1`.`nr` IS NOT NULL
      AND `OBDA_VIEW1`.`title` IS NOT NULL
      AND `OBDA_VIEW1`.`producer` = 1245
  • 54. Let’s add something more...
    SELECT ?review ?title ?publishDate ?rating1 ?rating2
    WHERE {
      ?review bsbm:reviewFor bsbm:Producer1245/Product62033> .
      ?review dc:title ?title .
      ?review dc:date ?publishDate .
      ?review bsbm:rating1 ?rating1 .
      OPTIONAL { ?review bsbm:rating2 ?rating2 . }
    }
  • 55. Ontop SQL Creation
    SELECT
      1 AS `reviewQuestType`, NULL AS `reviewLang`,
      CONCAT('http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/dataFromRatingSite',
        REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
        REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
        REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
        REPLACE(REPLACE(REPLACE(REPLACE(
          CAST(QVIEW1.`publisher` AS CHAR(8000) CHARACTER SET utf8),
          ' ', '%20'), '!', '%21'), '@', '%40'), '#', '%23'), '$', '%24'),
          '&', '%26'), '*', '%42'), '(', '%28'), ')', '%29'), '[', '%5B'),
          ']', '%5D'), ',', '%2C'), ';', '%3B'), ':', '%3A'), '?', '%3F'),
          '=', '%3D'), '+', '%2B'), '''', '%22'), '/', '%2F'),
        '/Review',
        REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
        REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
        REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
        REPLACE(REPLACE(REPLACE(REPLACE(
          CAST(QVIEW1.`nr` AS CHAR(8000) CHARACTER SET utf8),
          ' ', '%20'), '!', '%21'), '@', '%40'), '#', '%23'), '$', '%24'),
          '&', '%26'), '*', '%42'), '(', '%28'), ')', '%29'), '[', '%5B'),
          ']', '%5D'), ',', '%2C'), ';', '%3B'), ':', '%3A'), '?', '%3F'),
          '=', '%3D'), '+', '%2B'), '''', '%22'), '/', '%2F')) AS `review`,
      3 AS `titleQuestType`, NULL AS `titleLang`, QVIEW1.`title` AS `title`,
      10 AS `publishDateQuestType`, NULL AS `publishDateLang`,
      CAST(QVIEW1.`publishDate` AS CHAR(8000) CHARACTER SET utf8) AS `publishDate`,
      4 AS `rating1QuestType`, NULL AS `rating1Lang`,
      CAST(QVIEW1.`rating1` AS CHAR(8000) CHARACTER SET utf8) AS `rating1`,
      4 AS `rating2QuestType`, NULL AS `rating2Lang`,
      CAST(QVIEW2.`rating2` AS CHAR(8000) CHARACTER SET utf8) AS `rating2`
    FROM (
      review QVIEW1 LEFT OUTER JOIN review QVIEW2
        ON (QVIEW1.`nr` = QVIEW2.`nr`)
       AND (QVIEW1.`publisher` = QVIEW2.`publisher`)
       AND QVIEW2.`rating2` IS NOT NULL
       AND QVIEW1.`publisher` IS NOT NULL
       AND QVIEW1.`nr` IS NOT NULL
    )
    WHERE QVIEW1.`title` IS NOT NULL
      AND QVIEW1.`nr` IS NOT NULL
      AND QVIEW1.`publishDate` IS NOT NULL
      AND (QVIEW1.`product` = '62033')
      AND QVIEW1.`publisher` IS NOT NULL
      AND QVIEW1.`rating1` IS NOT NULL
      AND (QVIEW1.`producer` = '1245')
  • 56. Semantika SQL Creation
    SELECT
      CONCAT('http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/dataFromRatingSite{1}/Review{2}',
        ' : ', '"', `OBDA_VIEW1`.`publisher`, '" "', `OBDA_VIEW1`.`nr`, '"') AS `review`,
      `OBDA_VIEW1`.`title` AS `title`,
      `OBDA_VIEW1`.`publishDate` AS `publishDate`,
      `OBDA_VIEW1`.`rating1` AS `rating1`,
      `OBDA_VIEW1`.`rating2` AS `rating2`
    FROM `bsbm100_optimized`.`review` AS `OBDA_VIEW1`
    WHERE `OBDA_VIEW1`.`publisher` IS NOT NULL
      AND `OBDA_VIEW1`.`product` = 62033
      AND `OBDA_VIEW1`.`publishDate` IS NOT NULL
      AND `OBDA_VIEW1`.`nr` IS NOT NULL
      AND `OBDA_VIEW1`.`title` IS NOT NULL
      AND `OBDA_VIEW1`.`rating1` IS NOT NULL
      AND `OBDA_VIEW1`.`producer` = 1245
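The two generated queries for the OPTIONAL pattern differ in shape: the -ontop- SQL joins `review` to itself with a LEFT OUTER JOIN, while the Semantika SQL reads `rating2` from the same row. The sketch below checks on a small SQLite table that the two shapes return the same rows when (`nr`, `publisher`) identifies a review uniquely. The data is made up, the queries are simplified versions of the ones on the slides, and SQLite stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE review (
        nr INTEGER, publisher INTEGER, product INTEGER, producer INTEGER,
        title TEXT, publishDate TEXT, rating1 INTEGER, rating2 INTEGER,
        PRIMARY KEY (nr, publisher)
    );
    INSERT INTO review VALUES
        (1, 1, 62033, 1245, 'Great', '2008-01-01', 9, 7),
        (2, 1, 62033, 1245, 'Okay',  '2008-02-01', 5, NULL),
        (3, 1, 11111, 1245, 'Other', '2008-03-01', 8, 6);
""")

# Shape of the -ontop- query: OPTIONAL becomes a self left outer join.
SELF_JOIN = """
    SELECT v1.nr, v1.title, v1.rating1, v2.rating2
    FROM review v1 LEFT OUTER JOIN review v2
      ON v1.nr = v2.nr AND v1.publisher = v2.publisher
     AND v2.rating2 IS NOT NULL
    WHERE v1.product = 62033 AND v1.producer = 1245
      AND v1.title IS NOT NULL AND v1.rating1 IS NOT NULL
    ORDER BY v1.nr
"""

# Shape of the Semantika query: rating2 is read from the same row.
SINGLE_SCAN = """
    SELECT nr, title, rating1, rating2
    FROM review
    WHERE product = 62033 AND producer = 1245
      AND title IS NOT NULL AND rating1 IS NOT NULL
    ORDER BY nr
"""

joined = conn.execute(SELF_JOIN).fetchall()
scanned = conn.execute(SINGLE_SCAN).fetchall()
print(joined == scanned, joined)
```

Both shapes keep the review without a `rating2` value and return NULL for it, which is what OPTIONAL requires. The single scan avoids the join, which may contribute to the throughput differences reported in the benchmark slides.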