
Dagstuhl 2013 - Montali - On the Relationship between OBDA and Relational Mapping

Presentation "On the Relationship between OBDA and Relational Mapping" at the Dagstuhl 2013 Seminar on Automated Reasoning on Conceptual Schemas.

  1. 1. ON THE RELATIONSHIP BETWEEN OBDA AND RELATIONAL MAPPING Diego Calvanese, Marco Montali, Mariano Rodríguez-Muro KRDB Research Group Free University of Bozen-Bolzano Thanks to A. Artale, E. Franconi, D. Lembo, M. Lenzerini, A. Mosca Dagstuhl, May 2013
  2. 2. Traditional Approach [diagram labels: Query, Result, Reasoning, Conceptual Schema / Ontology, Logical Schema, Data Store]
  3. 3. Ontology-Based Data Access [diagram labels: Query, Result, Reasoning (twice), Conceptual Schema / Ontology, Logical Schema, Data Store]
  4. 4. The usual workflow [diagram labels: Source, Application Code, Triples, Ontology, Inputs, Reasoner, Application, Communication]
  5. 5. Problems •  Software complexity •  Duplication •  Data refreshing •  Data structure is lost (primary keys, foreign keys, information about the import procedure) OBDA: Architecture, Techniques and Systems
  6. 6. OBDA as an Architecture [diagram labels: Reasoner, Source, Application, Direct Communication, Ontology, OBDA Model, Inputs]
  7. 7. OBDA Models: Sources and Mappings “A formal specification of the relationship between data in a data source and the vocabulary of the ontology” [diagram labels: OBDA Model, Source, Source Declaration, A set of mappings]
  8. 8. Mapping “A tuple of 2 queries, one over the source and one over the ontology, with the same signature. Intuitively, a mapping associates the data specified by q_s with the answers for q_o.”
     Notation: q_s ⊆ q_o
     Example: SELECT patient_id AS id FROM condition WHERE c_id = 3333 ⊆ CardiacArrestPatient(?id) → q(?id)
     A source tuple id = (23) yields the assertion <23> rdf:type CardiacArrestPatient
  9. 9. Example OBDA model
     Mappings:
       SELECT patient_id AS id FROM condition WHERE c_id = 3333 ⊆ CardiacArrestPatient(?id) → q(?id)
       SELECT id, name, age, ssn FROM patient ⊆ Patient(?id) ^ name(?id,?name) ^ age(?id,?age) ^ ssn(?id,?ssn) → q(?id,?name,?age,?ssn)
     Table: patient
       id [PKEY] | name | age | ssn
       12345     | John | 37  | xxx-999
       …         | …    | …   | …
     Table: condition
       patient_id [FKEY] | c_id [FKEY]
       12345             | 3333
       …                 | …
  10. 10. Example OBDA model
     Table: patient
       id [PKEY] | name | age | ssn
       12345     | John | 37  | xxx-999
       …         | …    | …   | …
     Table: condition
       patient_id [FKEY] | c_id [FKEY]
       12345             | 3333
       …                 | …
     Resulting triples:
       <12345> rdf:type :Patient.
       <12345> :name “John”.
       <12345> :age “37”.
       <12345> :ssn “xxx-999”.
       <12345> rdf:type :CardiacArrestPatient.
       …
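The materialization shown on this slide can be sketched with an in-memory SQLite database. This is a minimal illustration, not the API of any OBDA system: the `MAPPINGS` list, the `materialize` helper, and the encoding of triples as Python tuples are assumptions made here. The `condition` table exposes the patient key as `patient_id`, so the first source query aliases it to `id` so that it runs against the schema shown.

```python
import sqlite3

# Example source database from the slides (only the rows shown there).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, ssn TEXT);
    CREATE TABLE condition (patient_id INTEGER REFERENCES patient(id), c_id INTEGER);
    INSERT INTO patient VALUES (12345, 'John', 37, 'xxx-999');
    INSERT INTO condition VALUES (12345, 3333);
""")

# Hypothetical encoding: each mapping pairs a source SQL query with triple
# templates (subject_column, predicate, object). For rdf:type the object is a
# constant class name; otherwise it names a column of the source query.
MAPPINGS = [
    ("SELECT patient_id AS id FROM condition WHERE c_id = 3333",
     [("id", "rdf:type", ":CardiacArrestPatient")]),
    ("SELECT id, name, age, ssn FROM patient",
     [("id", "rdf:type", ":Patient"),
      ("id", ":name", "name"),
      ("id", ":age", "age"),
      ("id", ":ssn", "ssn")]),
]

def materialize(conn, mappings):
    """Return mat(D, M): the set of triples produced by all mappings."""
    conn.row_factory = sqlite3.Row  # allow column access by name
    triples = set()
    for sql, templates in mappings:
        for row in conn.execute(sql):
            for subj_col, pred, obj in templates:
                value = obj if pred == "rdf:type" else str(row[obj])
                triples.add((f"<{row[subj_col]}>", pred, value))
    return triples

for triple in sorted(materialize(conn, MAPPINGS)):
    print(triple)
```

In the OBDA setting this materialization is only a semantic reference point: query rewriting is meant to avoid computing it.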
  11. 11. The Pay-off •  At least •  The source is documented •  Data handling can be done automatically (by the reasoner) •  Reduced cost of application development and maintenance •  The reasoner can analyze source and mappings to minimize the cost of inference •  The sweet spot •  On-the-fly data access •  Reasoning by query rewriting: queries posed directly over the conceptual schema, taking into account the constraints of the conceptual schema itself •  Exploitation of efficient engines to provide the answer to the query
  12. 12. Answering Queries: Rewriting • Given a query Q, a TBox T, and an OBDA model <D, M>, compute a query Q’ such that: answer(Q,T,mat(D,M)) = answer(Q’,D) where mat(D,M) is the collection of assertions resulting from “materializing” the mappings into ABox assertions (assertional triples)
  13. 13. Example OBDA model
     Mappings:
       SELECT patient_id AS id FROM condition WHERE c_id = 3333 ⤳ CardiacArrestPatient(?id) → q(?id)
       SELECT id, name, age, ssn FROM patient ⤳ Patient(?id) ^ name(?id,?name) ^ age(?id,?age) ^ ssn(?id,?ssn) → q(?id,?name,?age,?ssn)
     Table: patient
       id [PKEY] | name | age | ssn
       12345     | John | 37  | xxx-999
       …         | …    | …   | …
     Table: condition
       patient_id [FKEY] | c_id [FKEY]
       12345             | 3333
       …                 | …
  14. 14. Query Rewriting: An Example
     Ontology (TBox):
       SubClassOf(:CardiacArrest :HeartCondition)
       SubClassOf(:CardiacArrestPatient :Patient)
       SubClassOf(:CardiacArrestPatient ObjectSomeValuesFrom(:affectedBy :CardiacArrest))
     Query (SPARQL):
       SELECT ?p ?name ?ssn WHERE {
         ?p a :Patient; :name ?name; :ssn ?ssn; :age ?age;
            :affectedBy [ a :HeartCondition ].
         FILTER (?age >= 21 && ?age <= 50)
       }
  15. 15. Query Rewriting: An Example
     Rewritten query:
       SELECT ?p ?name ?ssn WHERE {
         { ?p a :Patient; :name ?name; :ssn ?ssn; :age ?age;
              :affectedBy [ a :HeartCondition ].
           FILTER (?age >= 21 && ?age <= 50) }
         UNION
         { ?p a :Patient; :name ?name; :ssn ?ssn; :age ?age;
              :affectedBy [ a :CardiacArrest ].
           FILTER (?age >= 21 && ?age <= 50) }
         UNION
         { ?p a :Patient; :name ?name; :ssn ?ssn; :age ?age;
              a :CardiacArrestPatient.
           FILTER (?age >= 21 && ?age <= 50) }
         UNION …
       }
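How the three UNION branches arise from the TBox can be sketched as follows. This is a deliberately simplified illustration, not a complete first-order rewriting algorithm: it handles only one pattern of the form `?p role [ a filler ]`, and it assumes the blank node is not used anywhere else in the query. The tuple encodings `SUBCLASS` and `EXISTENTIAL` and both function names are hypothetical.

```python
# TBox from the slides, encoded as plain tuples (an illustrative encoding).
SUBCLASS = [("CardiacArrest", "HeartCondition"),
            ("CardiacArrestPatient", "Patient")]
# (C, r, D) stands for: C SubClassOf ObjectSomeValuesFrom(r, D).
EXISTENTIAL = [("CardiacArrestPatient", "affectedBy", "CardiacArrest")]

def subclasses_of(cls):
    """All classes entailed to be subclasses of cls, including cls itself."""
    found, frontier = {cls}, [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in SUBCLASS:
            if sup == current and sub not in found:
                found.add(sub)
                frontier.append(sub)
    return found

def rewrite_some_values(role, filler):
    """Rewrite the pattern  ?p role [ a filler ]  into a union of alternatives.

    Each alternative is a list of atoms; the blank node is written "_:x".
    """
    alternatives = []
    for cls in sorted(subclasses_of(filler)):
        # Keep the role atom and specialise the filler class.
        alternatives.append([(role, "?p", "_:x"), ("a", "_:x", cls)])
        # If C implies a role-successor in cls, membership in C (or in any
        # subclass of C) is enough on its own.
        for c, r, d in EXISTENTIAL:
            if r == role and d == cls:
                for sub in sorted(subclasses_of(c)):
                    alternatives.append([("a", "?p", sub)])
    return alternatives

for alternative in rewrite_some_values("affectedBy", "HeartCondition"):
    print(alternative)
```

For the example TBox this yields three alternatives, corresponding to the `:HeartCondition`, `:CardiacArrest`, and `:CardiacArrestPatient` branches of the rewritten query.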
  16. 16. Query Rewriting: An Example
     SQL query:
       SELECT tp.id AS p, tp.name AS name, tp.ssn AS ssn
       FROM patient tp JOIN condition tc ON tp.id = tc.patient_id
       WHERE tc.c_id = 3333 AND tp.age >= 21 AND tp.age <= 50
     Answer:
       ?p    | ?name | ?ssn
       12345 | John  | xxx-999
     “Fast execution even in the presence of millions of assertions”
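The final SQL query can be checked against the example data with SQLite. The sketch below assumes the join alias `tc` is used throughout and that `ssn` is projected, so that the result has the columns of the answer table. The second patient is a hypothetical row added only to show the age filter at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, ssn TEXT);
    CREATE TABLE condition (patient_id INTEGER REFERENCES patient(id), c_id INTEGER);
    INSERT INTO patient VALUES (12345, 'John', 37, 'xxx-999');
    INSERT INTO patient VALUES (67890, 'Mary', 62, 'yyy-111');   -- hypothetical row
    INSERT INTO condition VALUES (12345, 3333);
    INSERT INTO condition VALUES (67890, 3333);                  -- hypothetical row
""")

# The query runs directly over the source tables; no triples are materialized.
UNFOLDED_SQL = """
    SELECT tp.id AS p, tp.name AS name, tp.ssn AS ssn
    FROM patient tp
    JOIN condition tc ON tp.id = tc.patient_id
    WHERE tc.c_id = 3333 AND tp.age >= 21 AND tp.age <= 50
"""

rows = conn.execute(UNFOLDED_SQL).fetchall()
print(rows)  # [(12345, 'John', 'xxx-999')]
```

Only the branch of the rewriting that has a mapping behind it (`:CardiacArrestPatient`) contributes SQL here; the example OBDA model maps nothing to `:affectedBy`, so the other branches contribute no answers.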
  17. 17. OBDA: Key Points • Mappings to tackle the abstraction gap between the conceptual model and the underlying data storage • Main application: query answering (read-only modality) •  Taking into account the constraints present in the conceptual schema •  Considering incomplete data at this level •  Implementation of query answering facilities: two steps •  Incorporation of the conceptual schema directly into the query •  Use of mappings to rephrase the query in terms of another query, this time directly applicable to the underlying data •  N.B.: this magic can happen only under suitable restrictions on the expressible constraints (cf. first-order rewritability)
  18. 18. Software Engineering
  19. 19. Redundancies in SW Development [diagram labels: Global conceptual schema, Application layer, Persistence layer, Presentation logic, Application logic, Data logic (transient), object-oriented logical schema, Data logic (persistent), relational logical schema, Object-relational mapping, mapping meta-data] •  No real abstraction gap between the persistent and transient data logic (“lossless” connection) •  However, they look at data from different perspectives •  Transient data logic: OO programming •  Persistent data logic: Relational DBMS •  In the OO paradigm, we have •  Hierarchies •  Different ways for navigating the associations •  Object identifiers, no explicit keys
  20. 20. Relational Mapping • Methodology to obtain a database schema (in third normal form) from a conceptual schema • Requires suitable annotation of the conceptual schema • To specify how to handle hierarchies • To specify how to handle 1-1 associations • To bridge the gap between the OO and relational worlds •  OO: internal object identifiers (references) •  Relational DB: primary keys •  Mandatory vs optional attributes
  21. 21. Relational Mapping: an example [UML class diagram]
     StaffMember: code: String {P}, email: String, phoneNumber: String
       Subclasses {complete, disjoint}; ContractType is used as a discriminator for the subclasses
         PermanentMember: divisionNr: int
         FixedTermMember: endTerm: Date (endTerm NOT NULL)
     Student: id: long {P}, name: String, surname: String (surname NOT NULL)
     Project: projectNr: long {P}
     Associations (multiplicities as drawn): supervisor (∗, 0..1), manager (1, ∗), participant (∗, ∗)
  22. 22. Relational Mapping: an example [UML class diagram, annotated “Table per subclass!”]
     StaffMember: code: String {P}, email: String, phoneNumber: String
       Subclasses {complete, disjoint}; ContractType is used as a discriminator for the subclasses
         PermanentMember: divisionNr: int
         FixedTermMember: endTerm: Date (endTerm NOT NULL)
     Student: id: long {P}, name: String, surname: String (surname NOT NULL)
     Project: projectNr: long {P}
     Associations (multiplicities as drawn): supervisor (∗, 0..1), manager (1, ∗), participant (∗, ∗)
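One possible relational schema for the StaffMember hierarchy and Student is sketched below, with one table per subclass and the inherited attributes repeated in each. Treat it as an assumption-laden illustration: the table names `FTMember` and `PMember` and the column names `memberCode` and `personId` are borrowed from the Hibernate slides that follow, the SQL types are guesses, and Project is left out because the diagram text does not make clear which classes its associations connect.

```python
import sqlite3

DDL = """
    -- One table per subclass; StaffMember gets no table of its own because
    -- the hierarchy is complete (every staff member is in one subclass).
    CREATE TABLE PMember (
        memberCode  TEXT PRIMARY KEY,    -- StaffMember.code {P}
        email       TEXT,
        phoneNumber TEXT,
        divisionNr  INTEGER
    );
    CREATE TABLE FTMember (
        memberCode  TEXT PRIMARY KEY,    -- StaffMember.code {P}
        email       TEXT,
        phoneNumber TEXT,
        endTerm     TEXT NOT NULL        -- endTerm NOT NULL in the diagram
    );
    CREATE TABLE Student (
        personId    INTEGER PRIMARY KEY, -- Student.id {P}
        name        TEXT,
        surname     TEXT NOT NULL,       -- surname NOT NULL in the diagram
        memberCode  TEXT                 -- supervisor (0..1), hence nullable
    );
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
conn.execute("INSERT INTO FTMember VALUES ('m1', 'a@example.org', NULL, '2014-12-31')")

# The mandatory attribute is enforced by the database itself.
try:
    conn.execute("INSERT INTO FTMember VALUES ('m2', NULL, NULL, NULL)")
    constraint_enforced = False
except sqlite3.IntegrityError:
    constraint_enforced = True
print(constraint_enforced)
```

With this layout the supervisor column cannot be declared as a single foreign key, since staff members are spread over two tables, and disjointness of the subclasses is not enforced by the DDL alone.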
  23. 23. Object-Relational Mapping (ORM!) [diagram labels: Global conceptual schema, Application layer, Persistence layer, Presentation logic, Application logic, Data logic (transient), object-oriented logical schema, Data logic (persistent), relational logical schema, Object-relational mapping, mapping meta-data] •  A lot of programming effort to code the synchronization between the transient and persistent data logic •  Object-relational mapping •  Annotate the transient data logic with information about relational mapping •  Exploit these annotations to •  Create the underlying database schema (if needed) •  Automatically synchronize the two layers
  24. 24. Example: Hibernate Mappings
     <class name="Student" table="Student" abstract="true">
       <id name="id" column="personId">
         <generator class="assigned"/>
       </id>
       <property name="name" not-null="true"/>
       <property name="surname"/>
       <many-to-one name="supervisor" class="StaffMember" column="memberCode"/>
     </class>
     <class name="StaffMember" abstract="true">
       <id name="code" column="memberCode"><generator class="assigned"/></id>
       <property name="email"/>
       <property name="phoneNumber"/>
       <set name="supervisedStudents" lazy="true" inverse="true">
         <key column="supervisor" not-null="true"/>
         <one-to-many class="Student"/>
       </set>
       <union-subclass name="FixedTermMember" table="FTMember">
         <property name="endTerm" not-null="true"/>
       </union-subclass>
       <union-subclass name="PermanentMember" table="PMember">
         <property name="divisionNr"/>
       </union-subclass>
     </class>
     …
  25. 25. Update
     Student s = new Student();
     s.setId(1543);
     s.setName("John");
     …
     //SAVE s
     …
     s.setSupervisor(m);
     …
     //SAVE s
     Hibernate: insert into Student (name, surname, memberCode) values (?, ?, ?)
     Hibernate: update Student set name=?, surname=?, memberCode=? where personId=?
  26. 26. Queries (Over the Java Model)
     "from StaffMember"
     Hibernate:
       select staffmembe0_.memberCode as memberCode5_, staffmembe0_.email as email5_,
              staffmembe0_.phoneNumber as phoneNum3_5_, staffmembe0_.endTerm as endTerm6_,
              staffmembe0_.divisionNr as divisionNr7_, staffmembe0_.clazz_ as clazz_
       from (
         select memberCode, email, phoneNumber, endTerm, null as divisionNr, 1 as clazz_ from FTMember
         union
         select memberCode, email, phoneNumber, null as endTerm, divisionNr, 2 as clazz_ from PMember
       ) staffmembe0_
     → Java objects
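The shape of the generated polymorphic query can be reproduced with SQLite. The sketch below uses the inner union from the slide with a simplified outer projection; the inserted rows, the alias `staffmember`, and the `CLASS_BY_DISCRIMINATOR` lookup are illustrative assumptions, and the code does not involve Hibernate itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE FTMember (memberCode TEXT PRIMARY KEY, email TEXT,
                           phoneNumber TEXT, endTerm TEXT NOT NULL);
    CREATE TABLE PMember  (memberCode TEXT PRIMARY KEY, email TEXT,
                           phoneNumber TEXT, divisionNr INTEGER);
    INSERT INTO FTMember VALUES ('ft1', NULL, NULL, '2014-12-31');  -- hypothetical row
    INSERT INTO PMember  VALUES ('pm1', NULL, NULL, 7);             -- hypothetical row
""")

# Each branch pads the columns it lacks with NULL and tags its rows with a
# discriminator (clazz_) so the caller can tell which subclass a row belongs to.
POLYMORPHIC_SQL = """
    SELECT memberCode, endTerm, divisionNr, clazz_
    FROM (
        SELECT memberCode, email, phoneNumber, endTerm, NULL AS divisionNr, 1 AS clazz_
        FROM FTMember
        UNION
        SELECT memberCode, email, phoneNumber, NULL AS endTerm, divisionNr, 2 AS clazz_
        FROM PMember
    ) staffmember
    ORDER BY memberCode
"""
CLASS_BY_DISCRIMINATOR = {1: "FixedTermMember", 2: "PermanentMember"}

members = [(code, CLASS_BY_DISCRIMINATOR[clazz])
           for code, _end_term, _division, clazz in conn.execute(POLYMORPHIC_SQL)]
print(members)  # [('ft1', 'FixedTermMember'), ('pm1', 'PermanentMember')]
```

The discriminator column plays the role that the object's runtime class plays on the Java side: it tells the framework which concrete subclass to instantiate for each row.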
  27. 27. Queries (Over the Java Model)
     "select m.supervisedStudents from FixedTermMember m"
     Hibernate:
       select supervised1_.personId as personId4_, supervised1_.name as name4_,
              supervised1_.surname as surname4_, supervised1_.memberCode as memberCode4_
       from FTMember fixedtermm0_
       inner join Student supervised1_ on fixedtermm0_.memberCode=supervised1_.supervisor
     → Java objects
  28. 28. Object Relational Mapping: Key Points •  Automated synch between the transient data logic (OO) and the persistent data logic (relational) •  Assumption of complete data in both layers •  The transient data logic is a lossless representation of the underlying persistent data logic •  Bidirectional mappings implicitly obtained from the annotations •  Support of many useful architectural patterns (e.g., lazy navigation of relationships) •  Constraints enforced by the framework, to maintain the alignment between the two layers according to the annotations •  Semantics? Proof of correctness?
  29. 29. Cross-Fertilization! •  Correctness for object-relational mapping techniques à la Hibernate to be studied •  Complexity investigation •  “It turns out that Hibernate is very fast if used properly” [JBOSS community] •  Query language: understanding, assessment, extension •  Exploitation of the object-relational mapping methodologies •  Principled generation of “well-balanced” relational schemas from a conceptual model/ontology (no universal tables!) •  Implicit generation of mappings from annotations •  New architectural frameworks for mappings: two-layered conceptual model •  First layer: lossless representation of the persistence storage •  Second layer: high-level ontology •  Mappings between ontologies
  30. 30. THANKS Queries?
