er2004.heqi.ppt
Transcript

  • 1. Resolving Schematic Discrepancy in the Integration of Entity-Relationship Schemas. Qi He, Tok Wang Ling. Dept. of Computer Science, School of Computing, National Univ. of Singapore
  • 2. Outline <ul><li>Schema integration – background </li></ul><ul><li>Schematic discrepancy </li></ul><ul><li>Representation of meta information in ER schemas </li></ul><ul><li>Resolution of schematic discrepancy in schema integration </li></ul><ul><li>Related work </li></ul><ul><li>Conclusion </li></ul>
  • 3. Schema Integration <ul><li>In DB integration, produce an integrated view that provides unified access to heterogeneous data in the source schemas. </li></ul><ul><li>In DB design, produce a global schema of a proposed DB by integrating user views. </li></ul>
  • 4. Challenges in schema integration <ul><li>Many types of conflicts among different source schemas need to be resolved in schema integration: </li></ul><ul><ul><li>Naming conflicts </li></ul></ul><ul><ul><li>Domain mismatch </li></ul></ul><ul><ul><li>Structural conflicts </li></ul></ul><ul><ul><li>Cardinality conflicts </li></ul></ul><ul><ul><li>Local constraints vs global constraints </li></ul></ul><ul><ul><li>(e.g. local vs global functional dependencies) </li></ul></ul><ul><ul><li>Schematic discrepancy </li></ul></ul><ul><ul><li>… </li></ul></ul>
  • 5. Schematic Discrepancy <ul><li>Schematic discrepancy occurs when metadata in one database correspond to attribute values in another. </li></ul><ul><li>An example (next slide) </li></ul><ul><ul><li>Months and supplier numbers (i.e., S1, …, Sn) are modeled differently, as attribute values or as schema labels (in general, metadata, which will be introduced later), in databases DB1, DB2, and DB3. </li></ul></ul>
  • 6. Motivation Example (figure annotations: price is an attribute of the ternary relationship type PMS; PM is a relationship type between product and month)
  • 7. Contexts of schema constructs <ul><li>Conceptual modeling is always done within a particular context, which is explicitly represented as a set of meta-attributes with values (called metadata). </li></ul><ul><li>Meta-attributes with values specify the conditions satisfied by the instances of a schema construct (i.e., entity type, relationship type, or attribute). </li></ul>
  • 8. Ontology <ul><li>A representational vocabulary for a shared domain of discourse, which includes the definitions of entity types, relationship types, and attributes. </li></ul><ul><li>We use an ontology to describe the meta information of the ER schemas of the supply example: </li></ul><ul><ul><li>Entity types: product, supplier, month </li></ul></ul><ul><ul><li>Attributes of entity types: p#, pname, s#, month </li></ul></ul><ul><ul><li>Relationship types: </li></ul></ul><ul><ul><li>PMS (a ternary supply relationship type among product, month and supplier) </li></ul></ul><ul><ul><li>PM (a binary relationship type between product and month); PM is a projection of PMS. </li></ul></ul><ul><ul><li>Attributes of relationship types: price (an attribute of PMS) </li></ul></ul>
  • 9. Example of Context <ul><li>In DB2, the entity type JAN_PROD is represented as: </li></ul><ul><ul><li>JAN_PROD = PM [ month = ‘jan’ ] </li></ul></ul><ul><ul><li>where PM and month are, respectively, a relationship type and an entity type from the ontology. </li></ul></ul><ul><ul><li>It means that JAN_PROD is derived from the product-month binary relationship type (i.e., PM) when the month value is ‘jan’. </li></ul></ul><ul><ul><li>month is a meta-attribute and jan is the metadata of JAN_PROD. </li></ul></ul>
  • 10. Inheritance of Context <ul><li>Context can be specified at 4 levels: </li></ul><ul><ul><li>Databases </li></ul></ul><ul><ul><li>Entity types </li></ul></ul><ul><ul><li>Relationship types </li></ul></ul><ul><ul><li>Attributes </li></ul></ul><ul><li>The context of a higher-level schema construct can be inherited by a lower-level schema construct. The inheritance hierarchy of contexts is: </li></ul><ul><li>database → entity type → relationship type → attribute of relationship type </li></ul><ul><li>entity type → attribute of entity type </li></ul>
  • 11. Example of context inheritance <ul><li>In DB2, the attribute S1_PRICE of the entity type JAN_PROD is represented as: </li></ul><ul><ul><li>S1_PRICE = price [ supplier = ‘s1’, inherit ALL ] </li></ul></ul><ul><ul><li>S1_PRICE inherits ‘all’, i.e., the context month = ‘jan’, from the entity type JAN_PROD. </li></ul></ul><ul><ul><li>The representation means that each value of S1_PRICE of the entity type JAN_PROD is a price of a product supplied by supplier s1 in the month of jan. </li></ul></ul>
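The context notation on slides 9 to 11 can be mimicked with a small data structure. The following is only an illustrative sketch: the class `Construct`, its fields, and the method `full_context` are hypothetical names introduced here, not part of the original presentation. It shows how S1_PRICE ends up with both its own context and the one inherited from JAN_PROD.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Construct:
    """A schema construct with an ontology name and a declared context (hypothetical model)."""
    name: str                # local schema name, e.g. "JAN_PROD"
    ontology_name: str       # name in the ontology, e.g. "PM"
    context: Dict[str, str] = field(default_factory=dict)  # meta-attribute -> metadata
    parent: Optional["Construct"] = None  # higher-level construct in the hierarchy
    inherit_all: bool = False             # models the "inherit ALL" clause

    def full_context(self) -> Dict[str, str]:
        """Own context merged with the inherited one, if inheritance is declared."""
        inherited: Dict[str, str] = {}
        if self.inherit_all and self.parent is not None:
            inherited = self.parent.full_context()
        return {**inherited, **self.context}


# JAN_PROD = PM [ month = 'jan' ]
jan_prod = Construct("JAN_PROD", "PM", {"month": "jan"})
# S1_PRICE = price [ supplier = 's1', inherit ALL ]
s1_price = Construct("S1_PRICE", "price", {"supplier": "s1"},
                     parent=jan_prod, inherit_all=True)

print(s1_price.full_context())
```

Under these assumptions, the full context of S1_PRICE contains both `month` and `supplier`, which matches the reading on slide 11: a price of a product supplied by s1 in the month of jan.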
  • 12. Resolution of schematic discrepancy in the integration of ER schemas
  • 13. <ul><li>Basic Idea: Remove the contexts of schema constructs by transforming meta-attributes into entity types. </li></ul><ul><li>Only meta-attributes causing schematic discrepancy need to be transformed. </li></ul><ul><li>Schema transformation should keep the semantics (information and constraints) of source schemas. </li></ul>
  • 14. <ul><li>Resolve schematic discrepancy for entity types, relationship types, attributes of entity types and attributes of relationship types in order (the order conforms to the hierarchical order of context inheritance). </li></ul><ul><li>The context at database level is handled in the entity types which inherit it. </li></ul>
  • 15. An example <ul><li>Transforming DB2 into DB1 in 2 steps </li></ul><ul><ul><li>Step 1: Resolve discrepancies for the entity types JAN_PROD, …, DEC_PROD </li></ul></ul><ul><ul><ul><li>Step 1.1: Transform meta-attributes into entity types </li></ul></ul></ul><ul><ul><ul><li>Step 1.2: Merge equivalent entity types, relationship types and attributes </li></ul></ul></ul><ul><ul><li>Step 2: Resolve discrepancies for the attributes S1_PRICE, …, SN_PRICE </li></ul></ul>
  • 16. <ul><li>Step 1.1: Transform the meta-attribute month of the entity type JAN_PROD (the other entity types are similar): </li></ul><ul><ul><li>Construct an entity type MONTH to model the meta information </li></ul></ul><ul><ul><li>JAN_PROD becomes PROD after removing the context </li></ul></ul><ul><ul><li>Construct a relationship type PM to relate PROD and MONTH </li></ul></ul><ul><ul><li>Attributes S1_PRICE, …, SN_PRICE are moved to PM, as they inherit the context (i.e., the month) of the entity type JAN_PROD. </li></ul></ul> (Figure annotation: PM is a relationship type between product and month.)
  • 17. Step 1.2: Merge the equivalent entity types, relationship types and attributes which refer to the same ontology names. Note the domains of the MONTH attributes are united.
  • 18. An example (cont.) <ul><li>Transforming DB2 into DB1 in 2 steps </li></ul><ul><ul><li>Step 1: Resolve discrepancies for the entity types JAN_PROD, …, DEC_PROD </li></ul></ul><ul><ul><li>Step 2: Resolve discrepancies for the attributes S1_PRICE, …, SN_PRICE </li></ul></ul><ul><ul><ul><li>Step 2.1: Transform meta-attributes into entity types. </li></ul></ul></ul><ul><ul><ul><li>Step 2.2: Merge equivalent entity types, relationship types and attributes. </li></ul></ul></ul><ul><ul><ul><li>Step 2.3: Remove redundant relationship types. </li></ul></ul></ul>
  • 19. <ul><li>Step 2.1: Transform the meta-attribute supplier of the attribute S1_PRICE (the other attributes are similar): </li></ul><ul><ul><li>Construct an entity type SUPPLIER to model the meta information. </li></ul></ul><ul><ul><li>Construct a relationship type PMS to relate PROD, MONTH and SUPPLIER. </li></ul></ul><ul><ul><li>S1_PRICE becomes PRICE after removing the context, and is moved to PMS. </li></ul></ul> (Figure annotation: price is an attribute of the relationship type PMS.)
  • 20. Step 2.2: Merge the equivalent entity types, relationship types and attributes. The domains of the S# attributes are united.
  • 21. Step 2.3: Remove the redundant relationship type PM that is a projection of PMS.
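At the instance level, the net effect of Steps 1 and 2 is to turn the month and supplier metadata of DB2 into ordinary values of the PMS relationship type in DB1. The sketch below illustrates this under stated assumptions: the dictionary layouts, the function name `db2_to_db1`, and the sample rows are hypothetical and chosen only for illustration. The contexts are passed in explicitly, mirroring the declarations on slides 9 and 11, rather than being parsed from schema names.

```python
def db2_to_db1(db2, entity_ctx, attr_ctx):
    """Convert DB2 instances into rows of PMS(p#, month, s#, price).

    db2:        entity type name -> list of rows (dicts keyed by attribute name)
    entity_ctx: entity type name -> declared context, e.g. {"month": "jan"}
    attr_ctx:   attribute name   -> declared context, e.g. {"supplier": "s1"}
    """
    pms = []
    for entity_type, rows in db2.items():
        # Step 1: the meta-attribute month of the entity type becomes a value.
        month = entity_ctx[entity_type]["month"]
        for row in rows:
            for attr, ctx in attr_ctx.items():
                if attr in row:
                    # Step 2: the meta-attribute supplier of the attribute becomes a value.
                    pms.append({
                        "p#": row["P#"],
                        "month": month,
                        "s#": ctx["supplier"],
                        "price": row[attr],
                    })
    return pms


# Hypothetical sample data for two of the twelve month entity types.
db2 = {
    "JAN_PROD": [{"P#": "p1", "S1_PRICE": 10, "S2_PRICE": 12}],
    "FEB_PROD": [{"P#": "p1", "S1_PRICE": 11}],
}
entity_ctx = {"JAN_PROD": {"month": "jan"}, "FEB_PROD": {"month": "feb"}}
attr_ctx = {"S1_PRICE": {"supplier": "s1"}, "S2_PRICE": {"supplier": "s2"}}

pms = db2_to_db1(db2, entity_ctx, attr_ctx)
for r in pms:
    print(r)
```

Because PM is a projection of PMS, the sketch produces only PMS rows; the product-month pairs of PM can be recovered from them, which is why Step 2.3 can drop PM as redundant.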
  • 22. Semantic preservation <ul><li>Our solution to schematic discrepancy preserves the semantics of source schemas in schema transformation: </li></ul><ul><ul><li>Information preservation. An instance of a schema can be losslessly converted into an instance of the other schema, and conversely. </li></ul></ul><ul><ul><li>Constraint preservation. Cardinality constraints of ER schemas can be preserved in schema transformation, but in different forms in the source and transformed schemas (an example is given on the next slide). </li></ul></ul>
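Information preservation means the conversion also works in the converse direction. As a rough illustration, the sketch below rebuilds DB2-style entity types from PMS rows. The function name `db1_to_db2`, the row layout, and the naming rule used to regenerate labels such as JAN_PROD and S1_PRICE are assumptions made for this example only.

```python
from collections import defaultdict


def db1_to_db2(pms_rows):
    """Rebuild DB2-style entity types from rows of PMS(p#, month, s#, price)."""
    by_entity = defaultdict(dict)  # entity type name -> {p#: row}
    for r in pms_rows:
        # The month value goes back into the entity type label.
        entity_type = f"{r['month'].upper()}_PROD"
        row = by_entity[entity_type].setdefault(r["p#"], {"P#": r["p#"]})
        # The supplier value goes back into the attribute label.
        row[f"{r['s#'].upper()}_PRICE"] = r["price"]
    return {name: list(rows.values()) for name, rows in by_entity.items()}


# Hypothetical PMS instance.
pms_rows = [
    {"p#": "p1", "month": "jan", "s#": "s1", "price": 10},
    {"p#": "p1", "month": "jan", "s#": "s2", "price": 12},
    {"p#": "p1", "month": "feb", "s#": "s1", "price": 11},
]

rebuilt = db1_to_db2(pms_rows)
print(rebuilt)
```

Every PMS row contributes exactly one price value to exactly one DB2 row, so no information is dropped in this direction either.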
  • 23. Constraint Preservation (E.g.) <ul><li>Functional dependency (FD) is preserved in the transformation from DB2 to DB1. </li></ul><ul><li>Suppose in each entity type JAN_PROD, …, DEC_PROD of DB2, the FD holds: </li></ul><ul><li>P# → {S1_PRICE, …, SN_PRICE} </li></ul><ul><li>In DB1, the FD is preserved, but in a different form: </li></ul><ul><li>{P#, S#, MONTH} → PRICE </li></ul><ul><li>In [3], we gave inference rules to derive FDs in schema transformation. </li></ul>[3] Qi He and Tok Wang Ling: Extending and inferring functional dependency in schema transformation. CIKM, 2004.
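Whether the transformed FD {P#, S#, MONTH} → PRICE holds on a given DB1 instance can be checked directly. The helper below is a generic FD check written for illustration; the name `holds_fd` and the sample rows are hypothetical, and it does not implement the inference rules of [3].

```python
def holds_fd(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds on rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val the first time a key is seen and
        # returns the stored value afterwards.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical PMS instance in DB1.
pms = [
    {"p#": "p1", "month": "jan", "s#": "s1", "price": 10},
    {"p#": "p1", "month": "jan", "s#": "s2", "price": 12},
    {"p#": "p1", "month": "feb", "s#": "s1", "price": 11},
]

print(holds_fd(pms, ["p#", "s#", "month"], ["price"]))  # FD in its DB1 form
print(holds_fd(pms, ["p#"], ["price"]))                 # P# alone is not enough in DB1
```

In DB2 the month and supplier are fixed by the context of each entity type and attribute, so P# alone determines each price attribute. In DB1 they are ordinary values, so they must appear on the left-hand side of the FD.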
  • 24. Related work <ul><li>The definition of context as a set of meta-attributes with values was originally adopted in [2, 9]. </li></ul><ul><li>They defined context at the attribute level only. </li></ul><ul><li>We consider contexts at the levels of database, entity types and attributes, as well as the inheritance of context. </li></ul>[2] C. H. Goh, S. Bressan, S. Madnick, and M. Siegel: Context interchange: new features and formalisms for the intelligent integration of information. TOIS, 1999 [9] E. Sciore, M. Siegel, A. Rosenthal: Using semantic values to facilitate interoperability among heterogeneous information systems, TODS, 1994
  • 25. Related work <ul><li>Existing work in schema integration focused on the resolution of structural conflicts [1, 7] and constraint conflicts [6, 8]. </li></ul><ul><li>Our solution to schematic discrepancy complements those works. </li></ul><ul><li>The resolution of schematic discrepancy is followed by the resolution of other conflicts. </li></ul>[1] C. Batini, M. Lenzerini: A methodology for data schema integration in the Entity-Relationship model. IEEE Trans. on Software Engineering, 10(6), 1984 [6] Mong Li Lee, Tok Wang Ling: Resolving constraint conflicts in the integration of entity-relationship schemas. ER, 1997 [7] Mong Li Lee, Tok Wang Ling: A methodology for structural conflicts resolution in the integration of entity-relationship schemas. Knowledge and Information Sys., 5, 2003 [8] M. P. Reddy, B.E.Prasad, Amar Gupta: Formulating global integrity constraints during derivation of global schema. Data & Knowledge Engineering, 16, 1995
  • 26. Related work <ul><li>Schematic discrepancy in the relational model is solved in some multidatabase languages [4, 5]. </li></ul><ul><li>They solved a special case of schematic discrepancy: they transform relation names or attribute names into attribute values, or vice versa. </li></ul><ul><li>They did not consider the constraint issue in schema transformation. </li></ul><ul><li>Our work solves a more general problem, and preserves cardinality constraints of ER schemas in the schema transformation. </li></ul>[4] R. Krishnamurthy, W. Litwin, W. Kent: Language features for interoperability of databases with schematic discrepancies. SIGMOD, 1991 [5] L. V. S. Lakshmanan, F. Sadri, S. N. Subramanian: SchemaSQL—an extension to SQL for multidatabase interoperability. TODS, 2001
  • 27. Conclusion <ul><li>The ER model supports cardinality constraints, which facilitates the derivation of constraints in schema transformation and integration. </li></ul><ul><li>Context is used to explicitly represent meta information of entity types, relationship types and attributes in ER schemas. </li></ul><ul><li>Schematic discrepancy is resolved by removing context. </li></ul><ul><li>The solution to schematic discrepancy preserves information and constraints. </li></ul>
  • 28. Reference <ul><li>[1] C. Batini, M. Lenzerini: A methodology for data schema integration in the Entity-Relationship model. IEEE Trans. on Software Engineering, 10(6), 1984 </li></ul><ul><li>[2] C. H. Goh, S. Bressan, S. Madnick, and M. Siegel: Context interchange: new features and formalisms for the intelligent integration of information. ACM Transactions on Information Systems, 17(3), 1999, pp 270-293 </li></ul><ul><li>[3] Qi He and Tok Wang Ling: Extending and inferring functional dependency in schema transformation. CIKM, 2004. </li></ul><ul><li>[4] R. Krishnamurthy, W. Litwin, W. Kent: Language features for interoperability of databases with schematic discrepancies. SIGMOD, 1991, pp 40-49 </li></ul><ul><li>[5] L. V. S. Lakshmanan, F. Sadri, S. N. Subramanian: SchemaSQL—an extension to SQL for multidatabase interoperability. TODS, 2001, pp 476-519 </li></ul><ul><li>[6] Mong Li Lee, Tok Wang Ling: Resolving constraint conflicts in the integration of entity-relationship schemas. ER, 1997, pp 394-407 </li></ul><ul><li>[7] Mong Li Lee, Tok Wang Ling: A methodology for structural conflicts resolution in the integration of entity-relationship schemas. Knowledge and Information Sys., 5, 2003, pp 225-247 </li></ul><ul><li>[8] M. P. Reddy, B.E.Prasad, Amar Gupta: Formulating global integrity constraints during derivation of global schema. Data & Knowledge Engineering, 16, 1995, pp 241-268 </li></ul><ul><li>[9] E. Sciore, M. Siegel, A. Rosenthal: Using semantic values to facilitate interoperability among heterogeneous information systems, TODS, 19(2), 1994, pp 254-290 </li></ul>
