This document discusses approaches to making ontology-based data access work effectively in practice. It addresses two main challenges: dealing with redundancy due to completeness of data, and efficiently completing virtual ABoxes. The author proposes a two-level approach involving characterizing completeness through ABox dependencies to handle redundancy, and using efficient techniques like query rewriting to virtually complete ABoxes during query answering.
First talk in which I introduce the semantic index technique for query answering with inferences, the T-mappings technique (a mapping transformation and optimisation technique that avoids exponential blow-ups during query rewriting), and the role of dependencies in query answering by query rewriting.
Introduction to query rewriting optimisation with dependencies
1. Dependencies
Making Ontology Based Data Access Work in Practice
Mariano Rodriguez-Muro and Diego Calvanese
{rodriguez,calvanese}@inf.unibz.it
KRDB Research Centre
Free University of Bozen Bolzano
July, 2011
Rodriguez-Muro and Calvanese (UNIBZ) APEX-Shanghai, 2011 July, 2011 1 / 33
4. DL Ontologies
Description Logics:
• Formalisms for knowledge representation.
• Decidable fragments of FOL
• Base of OWL
• World is described by means of Concepts and Roles
Ontologies
• Intensional knowledge: TBox T.
• Extensional knowledge: ABox A.
9. OBDA with DL-Lite
A family of light-weight ontology languages
• DL-Lite_F concepts
B := A | ∃R
• DL-Lite_F roles
R := P | P⁻
• DL-Lite_F TBoxes
B ⊑ B | B ⊑ ¬B | (funct R)
• DL-Lite_F ABoxes
A(a) | R(a, b)
13. Query Answering
TBox:
Man ⊑ Person, Woman ⊑ Person, Person ⊑ ∃hasFather,
∃hasFather⁻ ⊑ Person
ABox:
Man(mariano)
Queries:
q(x) ← Person(x), hasFather(x, y), Person(y)
Problem: Compute the certain answers of Q, denoted cert(Q, O).
The promise
We can do this as efficiently as answering DB queries, also in the virtual
setting.
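To make the notion of certain answers concrete, here is a minimal sketch that evaluates the example query over a bounded chase of the example ABox. It is not the rewriting-based technique the talk advocates; the tuple encoding of concepts, the `_:n` naming of labelled nulls and the round bound are all assumptions chosen for illustration.

```python
from itertools import count

# Assumed encoding: ("A", name) is an atomic concept, ("E", role) is ∃role,
# and ("Einv", role) is ∃role⁻. Facts are (concept, x) or (role, x, y) tuples.
TBOX = [
    (("A", "Man"), ("A", "Person")),
    (("A", "Woman"), ("A", "Person")),
    (("A", "Person"), ("E", "hasFather")),
    (("Einv", "hasFather"), ("A", "Person")),
]
ABOX = {("Man", "mariano")}


def holds(concept, x, facts):
    """Check whether individual x belongs to a basic concept in the given facts."""
    kind, name = concept
    if kind == "A":
        return (name, x) in facts
    if kind == "E":
        return any(len(f) == 3 and f[0] == name and f[1] == x for f in facts)
    return any(len(f) == 3 and f[0] == name and f[2] == x for f in facts)


def chase(abox, tbox, rounds=4):
    """Apply the inclusions for a bounded number of rounds, inventing labelled nulls."""
    facts = set(abox)
    fresh = count()
    for _ in range(rounds):
        new = set()
        individuals = {t for f in facts for t in f[1:]}
        for lhs, rhs in tbox:
            for x in individuals:
                if holds(lhs, x, facts) and not holds(rhs, x, facts | new):
                    kind, name = rhs
                    if kind == "A":
                        new.add((name, x))
                    else:
                        null = f"_:n{next(fresh)}"
                        new.add((name, x, null) if kind == "E" else (name, null, x))
        if not new:
            break
        facts |= new
    return facts


def answer_query(facts):
    """q(x) ← Person(x), hasFather(x, y), Person(y), keeping named individuals only."""
    answers = set()
    for f in facts:
        if len(f) == 3 and f[0] == "hasFather":
            _, x, y = f
            if ("Person", x) in facts and ("Person", y) in facts and not x.startswith("_:"):
                answers.add(x)
    return answers


print(answer_query(chase(ABOX, TBOX)))
```

The full chase is infinite for this TBox, which is one reason practical systems prefer rewriting the query over expanding the data; the bound of four rounds is just enough for this particular query.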
14. Query Answering with PerfectRef (2005)
23. Alternatives
• Improved version of PerfectRef (2007-2011)
• RQR (Urbina et al., 2007)
Too many unions; cannot execute!
• PRESTO (Rosati et al., 2010)
Better, but eventually it breaks.
• Combined Approach (Kontchakov et al., 2010)
Fast, but too much data and too much time.
24. What can we do?
?
25. Query Answering
It is not only about existential constants
Query:
q(x, y) ← Person(x), hasFather(x, y), Person(y)
27. The full picture: Ontology Based Data Access
[Figure: OBDA architecture. Labels: User, Queries, Ontology, Mappings, Source]
To deal with OBDA we need to consider:
• If in the backend we have RDBMSs, we cannot go beyond their capabilities.
• All systems are composed of T, D = ⟨R, I⟩, and M.
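The following sketch illustrates how the components might fit together, reading R as a relational schema, I as its instance and M as a set of mappings (an interpretation of the slide's notation). The `employee` table, its columns and the two mapping queries are hypothetical; only the Manager and Employee concept names come from the talk.

```python
import sqlite3

# R and I: a hypothetical relational schema and its instance.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id TEXT PRIMARY KEY, is_manager INTEGER);
    INSERT INTO employee VALUES ('ann', 1), ('bob', 0);
""")

# M: hypothetical mappings, each pairing an ontology predicate with the SQL
# query over the source whose result rows populate it.
MAPPINGS = [
    ("Employee", "SELECT id FROM employee"),
    ("Manager", "SELECT id FROM employee WHERE is_manager = 1"),
]


def virtual_abox(conn, mappings):
    """Yield ABox assertions on demand; nothing is written back to the source."""
    for predicate, sql in mappings:
        for row in conn.execute(sql):
            yield (predicate, *row)


abox = set(virtual_abox(conn, MAPPINGS))
print(sorted(abox))
```

Because the assertions are produced by queries, whoever designs the mappings decides what the ABox contains, and the database engine does the work of producing it.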
35. First Observation
Is my data complete?
Completeness of A
The TBox says: Manager ⊑ Employee
In the ABox: all Managers are already employees.
In any realistic scenario:
• We don’t use arbitrary sources;
• Intersection of semantics is reflected in completeness (e.g., no need to
chase, expand or rewrite)
• This happens a lot!
Keyword
Redundancy
37. Second Observation
There are no ABoxes
THERE ARE NO ABOXES!
Any ontology-based query answering system today:
• Uses relational DBs to store the ABox data;
• In such D, both R and I can be manipulated;
• Implementors may choose any M for their system.
Opportunity
To complete an ABox we can do more than expansion.
41. How to approach the problem
Two level approach
How to approach OBDA in practice?
• Efficient ways to deal with redundancy due to completeness.
• Efficient ways to complete (virtual) ABoxes.
44. Characterizing completeness
ABox Dependencies
Definition
An assertion B ⊑_A B that restricts valid ABoxes.
Syntax: B1 ⊑_A B2
Semantics: A ⊨ Manager ⊑_A Employee if Manager(x) ∈ A implies Employee(x) ∈ A.
ABox dependencies are fundamentally different from TBox assertions.
Think open world
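The semantics above can be checked directly on a set of assertions. This is a minimal sketch restricted to atomic concepts, with an ABox encoded as a set of (concept, individual) pairs; the encoding and the function name `satisfies` are assumptions for illustration.

```python
def satisfies(abox, lhs, rhs):
    """True if every individual asserted to be in lhs is also asserted to be in rhs."""
    return all((rhs, x) in abox for (concept, x) in abox if concept == lhs)


complete = {("Manager", "ann"), ("Employee", "ann"), ("Employee", "bob")}
incomplete = {("Manager", "ann")}

print(satisfies(complete, "Manager", "Employee"))
print(satisfies(incomplete, "Manager", "Employee"))
```

Both ABoxes are consistent with the TBox inclusion between Manager and Employee under the open-world reading, but only the first satisfies the dependency, since the dependency talks about what is explicitly present in the ABox.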
48. Where to deal with redundancy?
Given a TBox T , an ABox A, a set of dependencies Σ and a query Q,
what do we do?
Available Options:
• Optimize the query reformulation algorithm to deal with Σ.
• Optimize the TBox T with respect to Σ.
52. When is an assertion redundant?
Direct Redundancy: Case 1
Let T imply the following hierarchy:
[Diagram: hierarchy over ∃hasFather, Person, Human]
Redundant if Σ is:
[Diagram: dependencies over ∃hasFather, Person, Human]
Σ says hasFather(mariano, ramon) ∈ A → Human(mariano) ∈ A.
57. When is an assertion redundant?
Direct Redundancy: Case 2
Let T be the following TBox:
[Diagram: TBox over Person, ∃hasFather−, ∃hasFather, Man]
Redundant if Σ is:
[Diagram: dependencies over Person, ∃hasFather−, ∃hasFather, Man]
Σ says: Man(ramon) ∈ A → there is an a such that hasFather(ramon, a) ∈ A and Person(a) ∈ A.
61. When is an assertion redundant?
Indirect Redundancy
Let T be the following TBox:
[Diagram: TBox over Animal, Man, Human]
Redundant if Σ is:
[Diagram: dependencies over Animal, Man, Human]
Σ says: Man(mariano) ∈ A → Animal(mariano) ∈ A.
62. Formalization: Redundancy
Given a TBox T and a set of dependencies Σ over T, the optimized version
of T w.r.t. Σ, denoted optim(T, Σ), is the set of inclusion assertions
{α ∈ sat(T) | α is not redundant in sat(T) w.r.t. sat(Σ)}
We can compute optim(T, Σ) in linear time.
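As a rough illustration of this definition, the Python sketch below computes an optimized TBox for a heavily restricted case: only atomic concepts, an acyclic hierarchy, and no existential concepts. It is a simplification under those assumptions, not the procedure from the talk. It skips the chain condition of the full definition and makes no attempt at linear running time. The function names `saturate` and `optim`, and the example T and Σ, are hypothetical.

```python
from itertools import product


def saturate(inclusions):
    """Transitive closure of a set of (sub, sup) pairs over atomic concepts."""
    closure = set(inclusions)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and a != d and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def optim(tbox, sigma):
    """Keep the inclusions of sat(T) that are not redundant w.r.t. sat(Sigma).

    Simplified reading (an assumption of this sketch): B <= C is directly
    redundant when sat(Sigma) contains the dependency B <=_A C.
    Assumes an acyclic hierarchy, otherwise the recursion may not terminate.
    """
    sat_t, sat_s = saturate(tbox), saturate(sigma)
    concepts = {x for pair in sat_t for x in pair}

    def directly_redundant(b, c):
        return (b, c) in sat_t and (b, c) in sat_s

    def redundant(b, c):
        if directly_redundant(b, c):
            return True
        # Indirect case: some B' with T |= B' <= C not redundant,
        # and B <= B' directly redundant.
        return any(
            mid not in (b, c)
            and (mid, c) in sat_t
            and directly_redundant(b, mid)
            and not redundant(mid, c)
            for mid in concepts
        )

    return {(b, c) for (b, c) in sat_t if not redundant(b, c)}


# Illustrative input: T = {Man <= Human, Human <= Animal}, Sigma = {Man <=_A Human}
T = {("Man", "Human"), ("Human", "Animal")}
SIGMA = {("Man", "Human")}
OPTIMIZED = optim(T, SIGMA)
```

With this input, only the inclusion from Human to Animal survives: the data already guarantees that every Man is stored as a Human, so a rewriting of a query over Animal would not need a separate branch for Man.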
67. General considerations
OBDA systems have no ABoxes; instead they have virtual ABoxes V = ⟨D, M⟩ with
D = ⟨R, I⟩.
If we want V |= A ⊑_A B, we make sure that the mappings for B include
all the data coming from the mappings of A.
Trade-off:
• Degree of completeness (# of dependencies),
• Cost of the procedure,
• Performance of query answering.
We can complete virtual ABoxes up to assertions of the form B ⊑ ∃R without
the need for new data.
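One way to picture "the mappings for B include all the data coming from the mappings of A" is the minimal Python sketch below. It assumes mappings are kept as a dictionary from concept names to lists of SQL queries, and that the inclusions to enforce are given as (sub, super) pairs over atomic concepts only. The function name `complete_mappings`, the table names, and the example queries are hypothetical and only serve the illustration.

```python
def complete_mappings(mappings, inclusions):
    """Copy the source queries of each sub-concept into its super-concepts.

    mappings:   dict concept -> list of SQL strings that populate the concept
    inclusions: iterable of (sub, sup) pairs to be satisfied by the virtual ABox
    Runs to a fixpoint so that chains of inclusions are handled as well.
    """
    completed = {concept: list(queries) for concept, queries in mappings.items()}
    changed = True
    while changed:
        changed = False
        for sub, sup in inclusions:
            target = completed.setdefault(sup, [])
            for query in completed.get(sub, []):
                if query not in target:
                    target.append(query)
                    changed = True
    return completed


# Hypothetical mappings and inclusions, for illustration only.
MAPPINGS = {
    "Manager": ["SELECT id FROM employee WHERE salary > 1000"],
    "Employee": ["SELECT id FROM staff"],
}
INCLUSIONS = [("Manager", "Employee"), ("Employee", "Person")]
COMPLETED = complete_mappings(MAPPINGS, INCLUSIONS)
```

The trade-off from the slide shows up directly: each enforced inclusion adds queries to the mappings of the super-concept, which costs time to compute and may make the resulting mappings larger, in exchange for a simpler query rewriting.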
72. Semantic Index for OBDA
General Idea
• To encode the semantics of T in numeric indexes and ranges for
concept names and roles.
• Store the ABox in the database using those indexes and ranges.
• Make mappings for the system that take the ranges into account.
We can do this by using the implied hierarchy of T to generate the index
and ranges!
76. Semantic Index Example
T = {B ⊑ A, C ⊑ A, C ⊑ D}
Indexes assigned over the hierarchy: A = 1, B = 2, C = 3, D = 4
We create a table TC with columns constant and idx, and use the indexes to
insert the data. E.g., if B(mariano) ∈ A, then we put (mariano, 2) in TC.
78. Semantic Index Example
T = {B ⊑ A, C ⊑ A, C ⊑ D}
Index and ranges per concept:
A: 1, {(1, 3)}
B: 2, {(2, 2)}
C: 3, {(3, 3)}
D: 4, {(3, 4)}
We create the mappings using the ranges, e.g.,
SELECT constant FROM TC WHERE idx >= 1 AND idx <= 3 ⇝ A(constant)
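The example can be replayed with the short Python sketch below. It assumes an acyclic concept hierarchy, numbers the concepts by a depth-first traversal from the roots, and merges the indexes of each concept's descendants into contiguous ranges. The traversal order is an assumption chosen so that the numbers match the slide, not necessarily the procedure used in the actual system. The function names and the individuals other than mariano are made up for the illustration.

```python
import sqlite3


def build_semantic_index(concepts, inclusions):
    """Return (index, ranges) for an acyclic hierarchy given as (sub, sup) pairs."""
    children = {c: [] for c in concepts}
    has_parent = set()
    for sub, sup in inclusions:
        children[sup].append(sub)
        has_parent.add(sub)

    index = {}

    def visit(concept):
        if concept in index:
            return
        index[concept] = len(index) + 1
        for child in children[concept]:
            visit(child)

    for concept in concepts:  # start from the roots, in the given order
        if concept not in has_parent:
            visit(concept)

    def descendants(concept):
        seen, stack = {concept}, [concept]
        while stack:
            for child in children[stack.pop()]:
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen

    ranges = {}
    for concept in concepts:
        merged = []
        for i in sorted(index[d] for d in descendants(concept)):
            if merged and i == merged[-1][1] + 1:
                merged[-1] = (merged[-1][0], i)  # extend the current interval
            else:
                merged.append((i, i))
        ranges[concept] = merged
    return index, ranges


def query_concept(conn, ranges, concept):
    """Retrieve all constants of a concept with one range query over TC."""
    condition = " OR ".join("(idx >= ? AND idx <= ?)" for _ in ranges[concept])
    params = [bound for interval in ranges[concept] for bound in interval]
    rows = conn.execute(f"SELECT DISTINCT constant FROM TC WHERE {condition}", params)
    return sorted(row[0] for row in rows)


INDEX, RANGES = build_semantic_index(
    ["A", "B", "C", "D"], [("B", "A"), ("C", "A"), ("C", "D")]
)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TC (constant TEXT, idx INTEGER)")
conn.executemany(
    "INSERT INTO TC VALUES (?, ?)",
    [("mariano", INDEX["B"]), ("ramon", INDEX["C"]), ("diego", INDEX["D"])],
)
```

Here `query_concept(conn, RANGES, "A")` returns mariano and ramon although neither was stored as an A: the inference is carried by the range (1, 3) rather than by extra rows.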
79. Experimentation I
The Resource Index features:
• Search over 22 document collections
• Semantics given by the hierarchies of 200 ontologies (e.g., SNOMED, GO)
Implementation in a nutshell:
(i) Understand documents with natural language processing and
annotate them, e.g.,
Cervical Cancer(doc224)
(ii) Expand the ABox
(iii) Pose queries that retrieve documents, of the form
q(x) ← A1(x) ∧ · · · ∧ An(x)
80. Experimentation II
The challenge:
• ≈ 3 million concepts and ≈ 2.5 million is-a assertions
• Split second responses
• 150 GB of data
• Expansion data: 1.5 TB
The experimentation data:
• ClinicalTrials.gov (CT)
• 181 million assertions (≈ 14 GB of data, ≈ 140 GB when expanded)
82. Results
The query:
q(x) ← DNA Repair Gene(x) ∧ Antigen Gene(x) ∧ Cancer Gene(x)
Results:
• Traditional reformulation: union of 467874 SQL SPJ queries.
• Semantic Index: 1 SQL query; execution 3.582 s (0.082 s if warm); time
to compute the semantic index: 1 min; size of data: + ≈ 4 GB.
• ABox expansion: 1 SQL query; execution 3 s (0.6 s if warm); expansion
time ≈ 7 days; size of data: + ≈ 126 GB.
83. The Query
The query:
q(x) ← DNA Repair Gene(x) ∧ Antigen Gene(x) ∧ Cancer Gene(x)
SELECT DISTINCT r0.element_id as element_id
FROM
RESOURCE_INDEX.CT_ANN r0 JOIN RESOURCE_INDEX.CT_ANN r1
ON r0.element_id = r1.element_id
JOIN RESOURCE_INDEX.CT_ANN r2
ON r1.element_id = r2.element_id
WHERE
((r0.idx >= 1783559 AND r0.idx <= 1783657)) AND
((r1.idx >= 1782996 AND r1.idx <= 1783029)) AND
((r2.idx >= 1783115 AND r2.idx <= 1783253));
Standard SQL query efficient in ANY DBMS.
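A query of this shape can be produced mechanically from the ranges of each atom. The Python sketch below is one possible generator, assuming every atom of the conjunctive query is a concept over the same variable and that each concept comes with a list of (low, high) index ranges. The function name `conjunctive_query_sql` is hypothetical, and the column names element_id and idx are taken from the query above. The ranges are interpolated as integers, so this is only suitable for trusted numeric input.

```python
import sqlite3


def conjunctive_query_sql(table, ranges_per_atom):
    """Build one self-join query for q(x) <- A1(x) AND ... AND An(x).

    ranges_per_atom: one list of (low, high) index ranges per atom.
    """
    if not ranges_per_atom:
        raise ValueError("at least one atom is required")
    aliases = [f"r{i}" for i in range(len(ranges_per_atom))]
    lines = ["SELECT DISTINCT r0.element_id AS element_id", f"FROM {table} r0"]
    for prev, cur in zip(aliases, aliases[1:]):
        lines.append(f"JOIN {table} {cur} ON {prev}.element_id = {cur}.element_id")
    conditions = []
    for alias, ranges in zip(aliases, ranges_per_atom):
        disjunction = " OR ".join(
            f"({alias}.idx >= {int(lo)} AND {alias}.idx <= {int(hi)})"
            for lo, hi in ranges
        )
        conditions.append(f"({disjunction})")
    lines.append("WHERE " + " AND ".join(conditions))
    return "\n".join(lines) + ";"


# Small made-up data set to check that the generated SQL runs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CT_ANN (element_id TEXT, idx INTEGER)")
conn.executemany(
    "INSERT INTO CT_ANN VALUES (?, ?)", [("d1", 2), ("d1", 4), ("d2", 2)]
)
SQL = conjunctive_query_sql("CT_ANN", [[(1, 3)], [(3, 4)]])
ANSWERS = [row[0] for row in conn.execute(SQL)]
```

Calling the function with the three ranges from the slide and the table RESOURCE_INDEX.CT_ANN yields a query with the same joins and range conditions as the one shown above.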
85. Conclusions
Contributions
• We indicated that efficient OBDA requires taking into account more
than only T, A and Q.
• We provided means to deal with redundancy at the level of the TBox.
• We showed that expansion is not necessary, and that we can complete
ABoxes.
• We presented two efficient ways to complete ABoxes, one for the
general OBDA setting and one for the virtual setting.
Future work
• Exploring more expressive languages.
• Exploring the RDFS/SPARQL setting.
• Handling updates of T and A.
89. First Observation (cont.)
Mappings will introduce dependencies over ABoxes
Let R be a DB schema with the relation schema employee with attributes
id, dept, and salary. Let M be the following mappings:
SELECT id, dept FROM employee
⇝ q(id, dept) ← Employee(id) ∧ WORKS-FOR(id, dept)
SELECT id, dept FROM employee WHERE salary > 1000
⇝ q(id, dept) ← Manager(id) ∧ MANAGES(id, dept)
Then for any instance I, if Manager(John) ∈ A, we have that
Employee(John) ∈ A.
This is an indicator of completeness of all ABoxes A for M and R, e.g., A
is complete w.r.t. Manager ⊑_A Employee.
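The observation can be checked on a toy instance with the Python sketch below, which uses an in-memory SQLite database. The rows are made up, and the mappings are projected to the concept part only (Employee and Manager), which is a simplification of the mappings above. The helper `virtual_extension` is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("John", "sales", 1500), ("Mary", "it", 900)],  # made-up rows
)

# Source queries of the two mappings, restricted to the concept assertions.
MAPPINGS = {
    "Employee": "SELECT id FROM employee",
    "Manager": "SELECT id FROM employee WHERE salary > 1000",
}


def virtual_extension(conn, concept):
    """Individuals asserted for a concept in the virtual ABox."""
    return {row[0] for row in conn.execute(MAPPINGS[concept])}


MANAGERS = virtual_extension(conn, "Manager")
EMPLOYEES = virtual_extension(conn, "Employee")
# The Manager query only adds a filter to the Employee query, so this
# containment holds for every instance, not just for these rows.
DEPENDENCY_HOLDS = MANAGERS <= EMPLOYEES
```

Because the containment follows from the shape of the two SQL queries, it can be recorded as a dependency once and reused for all instances of the database.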
90. Formalization: Chains
Let T be a TBox, B, C basic concepts, and Σ a set of dependencies over
T. A T-chain from B to C in T (resp., a Σ-chain from B to C in Σ) is a
sequence of concept inclusion assertions (B_i ⊑ B'_i), 0 ≤ i ≤ n, in T (resp., a
sequence of inclusion dependencies (B_i ⊑_A B'_i), 0 ≤ i ≤ n, in Σ), for some n ≥ 0,
such that:
1. B_0 = B, B'_n = C, and
2. for 1 ≤ i ≤ n, we have that B'_{i−1} and B_i are basic concepts s.t. either
(i) B'_{i−1} = B_i, or
(ii) B'_{i−1} = ∃R and B_i = ∃R−, for some basic role R.
91. Formalization: Redundancy
Let T be a TBox, B, C basic concepts, and Σ a set of dependencies. The
concept inclusion assertion B ⊑ C is directly redundant in T w.r.t. Σ if
(i) Σ |= B ⊑_A C and
(ii) for every T-chain (B_i ⊑ B'_i), 0 ≤ i ≤ n, with B'_n = B in T, there is a Σ-chain
(B_i ⊑_A B'_i), 0 ≤ i ≤ n.
Then, B ⊑ C is redundant in T w.r.t. Σ if
(a) it is directly redundant, or
(b) there exists B' ≠ B s.t.
(i) T |= B' ⊑ C,
(ii) B' ⊑ C is not redundant in T w.r.t. Σ, and
(iii) B ⊑ B' is directly redundant in T w.r.t. Σ.