Data Warehousing: Challenge or Fad?


  • 16 August 2010: The user has a perception of the real world centered on their application. Description of the data for their needs... From there, recognition and structuring...
  • 16 August 2010: CLASS = set of objects perceived as having similar characteristics. They will have the same type. Lisa, Zoe, Dylan … are grouped in the class Person. They form the population of the class. Properties: family name, first name, age, height, eye colour. Links: Zoe MARRIED to Dylan; Zoe WIFE of Dylan, and the inverse...
  • 16 August 2010: Entity type Employee: 2 simple, monovalued, mandatory attributes (employee number n°E and name); 1 simple, multivalued, mandatory attribute (first names); 2 complex, multivalued attributes (CV and positions), where CV is optional and positions is mandatory. NB: defining an attribute (or a role) as mandatory induces a constraint on the creation of the corresponding instances. The creation of an instance can be accepted only if all mandatory attributes (roles) receive a value at creation time.

    1. From Reality to Databases: a One-to-Many Relationship
       Stefano Spaccapietra, Database Laboratory, Swiss Federal Institute of Technology Lausanne (EPFL)
       Joint work with Christine PARENT & Christelle VANGENOT
       http://lbd.epfl.ch
    2. Outline
       • Database design essentials
       • Multiple representation
       • Design alternatives
    3. Database Terminology
       • Database design (data modeling) is the activity of elaborating a formal representation of the relevant information about the subset of the real world that is of interest to the users (applications) of the data.
       • The outcome of the database design process is the schema of the database.
       • The formalism used to express the schema is a data model.
       (Section: Database design essentials)
    4. Data Model
       • A data model is a set of concepts and rules.
       • Relational data model: table/relation, attribute/column, tuple/row, primary key, foreign key, …
       • Entity-Relationship data model: entity, entity type, relationship, relationship type, attribute, role, cardinality, identifier, …
       • Object-oriented data model: object, class, attribute, reference attribute, is-a hierarchy, inheritance, …
    5. Evolution of Data Models
       [Figure: data models plotted against expressive power. Models shown: Codasyl, Relational, Object Oriented, ER, Extended ER, UML, ODMG, Spatio-Temporal, Multi-representation]
    6. Database Design: the Analysis Phase
       A database is a representation of that part of reality we are interested in.
       [Figure: Real World, with the steps perception, recognition, structuring]
    7. Database Design: the Definition Phase
       Description: "Jean is a young man. He is married to Arlette, and owns a green Honda CRV."
    8. Fundamental Abstraction: Classification
       From reality to representation: abstracting from details to think in more generic terms, e.g. in terms of object classes rather than individual objects.
       • Individual objects: Lisa, Fred, …, Dylan, Anne, ..., Zoë
       • Object class: Person
       • Properties: family name, first name, age, ...
    9. The Database Schema
       • A schema is a collection of types.
       • The database will store instances of these types.
       • An instance is a set of values taken by the properties attached to the type.
       [Figure: Person, Car, Owns, Married-to]
    10. Schema and Instances
       [Figure: object types Person and House linked by the relationship type Owns, with cardinalities 0:n and 1:1, shown together with their instances]
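The split between a schema (types) and its instances, as described in slides 9 and 10, can be sketched in Python. This is only an illustration and not part of the original slides: the `Role` class, the sample data, and the helper `check_cardinalities` are hypothetical, and the assignment of the cardinalities (a Person owns 0:n houses, a House has exactly one owner) is an assumed reading of the figure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Role:
    """One end of a relationship type; max_card=None stands for 'n'."""
    object_type: str
    min_card: int
    max_card: Optional[int]


# Schema level: the relationship type Owns (role assignment is an assumption).
OWNS = (Role("Person", 0, None), Role("House", 1, 1))

# Instance level: values stored for the types (hypothetical data).
persons = {"Jean", "Arlette"}
houses = {"h1", "h2"}
owns = {("Jean", "h1"), ("Jean", "h2")}  # (person, house) pairs


def check_cardinalities(objects, links, position, role):
    """Return the objects whose number of links violates the role's cardinality."""
    violations = []
    for obj in sorted(objects):
        count = sum(1 for link in links if link[position] == obj)
        too_few = count < role.min_card
        too_many = role.max_card is not None and count > role.max_card
        if too_few or too_many:
            violations.append(obj)
    return violations


person_violations = check_cardinalities(persons, owns, 0, OWNS[0])
house_violations = check_cardinalities(houses, owns, 1, OWNS[1])
```

Under these assumptions, Arlette owning no house is acceptable (minimum 0), while a house without an owner or with two owners would be reported.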
    11. Attributes of an Object Type
       Attributes may be atomic or complex, mandatory or optional, monovalued or multivalued.
       [Figure: object type Employee with attributes Emp#, Ename, telephones, academic-achievements, positions; component attributes shown: degree, year, title, start-date, end-date, salaries, date, amount, year, month]
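The attribute kinds of slide 11 can be sketched with Python dataclasses. This is a hedged illustration: the attribute names follow the slide, but the nesting of the component attributes and the choice of which attributes are mandatory or optional are assumptions, since the figure itself is not recoverable from the transcript. The check in `__post_init__` mirrors the idea that an instance is accepted only if every mandatory attribute has a value at creation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Achievement:
    """Complex attribute value: made of component attributes (assumed nesting)."""
    degree: str
    year: int


@dataclass
class Position:
    """Complex attribute value (assumed nesting); end_date is optional."""
    title: str
    start_date: str
    end_date: Optional[str] = None


@dataclass
class Employee:
    emp_no: int                 # atomic, mandatory, monovalued
    ename: str                  # atomic, mandatory, monovalued
    positions: List[Position]   # complex, mandatory, multivalued (assumed)
    telephones: List[str] = field(default_factory=list)  # atomic, optional, multivalued (assumed)
    academic_achievements: List[Achievement] = field(default_factory=list)  # complex, optional, multivalued (assumed)

    def __post_init__(self):
        # A mandatory attribute must receive a value when the instance is created.
        if not self.ename:
            raise ValueError("ename is mandatory")
        if not self.positions:
            raise ValueError("positions is mandatory: at least one value is required")


jean = Employee(1, "Jean", [Position("engineer", "2010-08-16")])
```

Creating `Employee(2, "Arlette", [])` would raise `ValueError`, because a mandatory multivalued attribute needs at least one value.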
    12. Example of an ER schema
       [Figure: entity types (E) Department, Item, Employee, Supplier; relationship types (R) Boss-of (roles boss, subord.), Assigned-to, Sells, Delivery; attributes shown: Dname, floor, quantity, Iname, type, name, salary, Sname, address, quantity]
    13. Non-determinism in Database Design
       • A database design is about choosing a representation.
       • The outcome is a partial, subjective, unfaithful description.
       • How do we introduce flexibility to support different ways of abstracting a representation from reality?
    14. Multiple Classification
       [Figure: the same Car classified in several ways: Vintage Car, Collectible, Transport Mean, Vehicle, Land Vehicle, Ford, Imported Good, Movie Accessory]
       (Section: Multiple representation)
    15. Multiple Viewpoints
       [Figure: a ROAD as seen from the cartographer viewpoint, the construction engineer viewpoint, and the traffic manager viewpoint]
    16. Multiple Spatial Resolution
       [Figure: the same area at 1:25'000 scale and at 1:50'000 scale]
    17. Multidimensional Representation Space
       Dimensions: Classification, Space granularity, Viewpoint, Time, Time granularity, …
       [Figure: two representations of the same object in the same viewpoint at two different resolution levels]
    18. A Snapshot Database
       [Figure: position in the representation space along the dimensions Classification, Time, Viewpoint]
    19. A Map
       [Figure: position in the representation space along the dimensions Classification, Space granularity, Viewpoint. Example: Road Network, 1:100'000 resolution]
    20. Classification Dimension
       • Current status: refinement hierarchies
       [Figure: populations persons, students, faculties, technicians, secretaries; is-a hierarchy over Person, Faculty, Technician, Secretary, Student, Employee]
    21. Limitation: Roles
       [Figure: populations car-owners, companies, persons; classes Person, Car-owner, Company; intersection classes Person-with-car, Company-with-car]
       Partition constraint:
       • Car-owner = Person-with-car ∪ Company-with-car
       • Person-with-car ∩ Company-with-car = Ø
    22. A More Direct Representation
       Intersection link
       [Figure: Car-owner linked to Person and to Company by MAY-BE-A links, labelled "OR IS-A", plus the partition constraint]
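The partition constraint on Car-owner (slides 21 and 22) says that the intersection classes are pairwise disjoint and that together they cover the whole class. A minimal Python sketch of that check, with hypothetical populations and a hypothetical helper name `is_partition`:

```python
def is_partition(whole, parts):
    """True if the parts are pairwise disjoint and their union equals whole."""
    union = set()
    for part in parts:
        if union & part:  # some element already belongs to an earlier part
            return False
        union |= part
    return union == whole


# Hypothetical populations of the intersection classes.
person_with_car = {"Jean", "Arlette"}
company_with_car = {"ACME"}
car_owner = {"Jean", "Arlette", "ACME"}

holds = is_partition(car_owner, [person_with_car, company_with_car])
```

The check fails as soon as one owner appears in both intersection classes, or when a car owner belongs to neither.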
    23. Viewpoint Dimension
       • Relational DBMS support views (mostly non-updatable), but their semantics is poor.
       • Object-oriented DBMS have rich semantics but poor view mechanisms.
       • Object-relational DBMS: ?
       • Object-oriented expressiveness augmented with intersection links, roles and revised inheritance rules will provide the best solution.
    24. Space Granularity: Multi-resolution
       • Cartographic generalization is costly:
         -> store the result for reuse
       • How do we express the links between different representations?
         -> update propagation
    25. Multiple Geometries for the Same Object
       • One possible solution: stamping spatial attributes with the spatial resolution
       • Spatial integrity constraints:
         - Sinuosity (River.geometry[2]) = Sinuosity (River.geometry[1])
         - Length (River.geometry[2]) = Length (River.geometry[1])
       [Figure: River with a multi-representation geometry at Resolution Level 1 and Resolution Level 2; the river is described as an area or as a line]
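Stamping a spatial attribute with its resolution level, and checking the two integrity constraints of slide 25, could look like the sketch below. Several points are assumptions rather than content of the slides: geometries are simplified to polylines (the slide also allows an area), sinuosity is taken as path length divided by the straight-line distance between the end points, the coordinates are invented, and the `=` of the slide is checked up to a relative tolerance `rel_tol`, because two generalization levels rarely yield exactly equal floating-point measures.

```python
import math


def length(line):
    """Length of a polyline given as a list of (x, y) points."""
    return sum(math.dist(p, q) for p, q in zip(line, line[1:]))


def sinuosity(line):
    """Path length divided by the straight-line distance between the end points."""
    return length(line) / math.dist(line[0], line[-1])


# River.geometry stamped with the resolution level (hypothetical coordinates).
river_geometry = {
    1: [(0, 0), (3, 4), (6, 0)],  # more detailed line at resolution level 1
    2: [(0, 0), (6, 0)],          # simplified line at resolution level 2
}


def constraint_holds(measure, geometry, rel_tol):
    """Check measure(geometry[2]) = measure(geometry[1]) up to rel_tol (assumed tolerance)."""
    return math.isclose(measure(geometry[2]), measure(geometry[1]), rel_tol=rel_tol)


loose = constraint_holds(length, river_geometry, rel_tol=0.5)
strict = constraint_holds(length, river_geometry, rel_tol=0.1)
```

With these sample coordinates the constraints hold under a loose tolerance and fail under a strict one, which is the kind of signal an update-propagation mechanism could use to flag inconsistent representations.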
    26. Multiple Abstraction Levels: Reformulation
       • Replacing a group of objects with a new object
         - Example: a set of buildings close to each other is replaced with a built-up area
    27. Aggregation
       • Grouping of objects according to semantic and spatial relationships
         - e.g., a set of buildings and adjacent fields belonging to the same farmer grouped into a single object Farm
       • Derivation rules:
         - Farm.geometry = Spatial Union (Field.geometry, Building.geometry)
       • Aggregation constraint:
         - the fields and the buildings composing the same farm must belong to the same farmer, and the fields must be adjacent.
       [Figure: Farm linked to Field and to Building by two Composed relationships, each with cardinality 1,n]
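The derivation rule and the aggregation constraint of slide 27 can be illustrated with a small Python sketch. The representation is an assumption made for brevity: a geometry is a set of grid cells, the spatial union is a set union, and two fields count as adjacent when one has a cell that is a 4-neighbour of a cell of the other. The function names are hypothetical.

```python
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def adjacent(a, b):
    """Two cell sets are adjacent if some cell of a has a 4-neighbour in b."""
    return any((x + dx, y + dy) in b for (x, y) in a for dx, dy in NEIGHBOURS)


def all_adjacent(geometries):
    """True if the geometries form one connected group under `adjacent`."""
    if not geometries:
        return True
    reached, frontier = {0}, [0]
    while frontier:
        i = frontier.pop()
        for j, other in enumerate(geometries):
            if j not in reached and adjacent(geometries[i], other):
                reached.add(j)
                frontier.append(j)
    return len(reached) == len(geometries)


def farm_geometry(fields, buildings):
    """Derive Farm.geometry; fields and buildings are lists of (owner, cells) pairs."""
    owners = {owner for owner, _ in fields + buildings}
    if len(owners) > 1:
        raise ValueError("fields and buildings must belong to the same farmer")
    if not all_adjacent([cells for _, cells in fields]):
        raise ValueError("the fields of a farm must be adjacent")
    # Spatial union of all component geometries.
    return set().union(*(cells for _, cells in fields + buildings))


fields = [("Ann", {(0, 0), (1, 0)}), ("Ann", {(2, 0)})]
buildings = [("Ann", {(0, 1)})]
farm = farm_geometry(fields, buildings)
```

A real GIS would use polygon geometries and a true spatial union; the cell sets only keep the example self-contained.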
    28. Cartographic Approximation
       • No 1-1 or n-1 mapping between ground and cartographic buildings
       • N-m relationship
       Example: 5 ground buildings (1,2,3,4,5) are represented by 3 cartographic buildings (a,b,c): t = ( {1,2,3,4,5} , {a,b,c} )
       A ground building can participate in 0 or 1 "typify" relationship.
       [Figure: Ground Building linked to Cartographic Building by the typify relationship]
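The typify relationship of slide 28 links a set of ground buildings to a set of cartographic buildings. The sketch below stores each relationship instance as a pair of sets and checks the stated rule that a ground building takes part in at most one instance. The function name and the extra instance used in the usage note are hypothetical.

```python
from collections import Counter

# One typify instance, as on the slide: t = ({1,2,3,4,5}, {a,b,c}).
typify = [({1, 2, 3, 4, 5}, {"a", "b", "c"})]


def ground_buildings_in_several_instances(typify_instances):
    """Return ground buildings that participate in more than one typify instance."""
    counts = Counter(g for ground, _ in typify_instances for g in ground)
    return sorted(g for g, n in counts.items() if n > 1)


violations = ground_buildings_in_several_instances(typify)
```

Adding a second instance such as `({5, 6}, {"d"})` would report ground building 5, since it would then participate in two typify instances.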
    29. Topological Relationships
       At resolution level 1, the road is adjacent to the embankment. At resolution level 2, the embankment is no longer represented, and the road is seen as adjacent to the building.
       [Figure: Level 1 and Level 2; Embankment, Road, and the relationship Near, each marked M]
    30. Hierarchical value domains
       • Describe the same property at different abstraction levels
         - Hierarchical value domains for attributes
         - (similar to classification hierarchies for objects)
       [Figure: value hierarchy with the values cultivated area; flower, cereal, oleaginous; rose, iris, carnation, corn, barley, rape, sunflower]
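A hierarchical value domain can be sketched as a parent map over attribute values, so that the same property can be reported at a coarser or finer abstraction level. The tree below is an assumed reading of the slide's figure (cultivated area at the root; flower, cereal, oleaginous below it; the individual crops as leaves), and the function names are hypothetical.

```python
# Parent of each value in the hierarchy (assumed reading of the figure).
PARENT = {
    "flower": "cultivated area",
    "cereal": "cultivated area",
    "oleaginous": "cultivated area",
    "rose": "flower",
    "iris": "flower",
    "carnation": "flower",
    "corn": "cereal",
    "barley": "cereal",
    "rape": "oleaginous",
    "sunflower": "oleaginous",
}


def path_from_root(value):
    """Values from the root of the hierarchy down to the given value."""
    path = [value]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path[::-1]


def generalize(value, level):
    """Express value at the given abstraction level (0 = root); never goes below value itself."""
    path = path_from_root(value)
    return path[min(level, len(path) - 1)]


coarse = generalize("corn", 1)
```

Here a parcel whose land use is recorded as corn can be shown as cereal at level 1 or as cultivated area at level 0.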
    31. Multidimensional Representation Space
       Dimensions: Classification, Space granularity, Viewpoint, Time granularity
       How is the representation space
       • presented to users?
       • implemented in databases?
       (Section: Design alternatives)
    32. Possible Design Architectures
       • One single schema
       • One schema per (combination of) coordinate(s) on dimension(s)
       • One schema per …… with an intrinsic schema
    33. A Single Schema
       [Figure: one schema covering all representations. Labels shown: Parcel, Parcel/use, Parcel/owner, agr/use, agr/owner, owner, landuse, Building (M), Cartographic building, Plot, Castle, Road (M), Bridge; links: composed, Typify, along, near, on/under]
    34. A multi-resolution schema per viewpoint
       [Figure: two schemas, Viewpoint 1 and Viewpoint 2. Labels shown: Building (M), Cartographic building, Parcel, Parcel/use, Parcel/owner, agr/use, agr/owner, owner, landuse, Plot, Castle, Road (M), Bridge; links: composed, Typify, along, near, on/under]
    35. A Schema Per Viewpoint and Resolution
       [Figure: four schemas (Viewpoint A Resolution 1, Viewpoint A Resolution 2, Viewpoint B Resolution 1, Viewpoint B Resolution 2) linked by inter-schema correspondences]
    36. A Schema Per Resolution and Viewpoint
       [Figure: separate schemas per resolution and viewpoint. Labels shown: Cartographic building, Building, Road, Castle, Plot, Bridge, Parcel, Parcel/use, Parcel/owner, agr/use, agr/owner; links: on, on/under, near, composed]
    37. An Intrinsic Schema
       • Intrinsic schema: description of real world entities independently of any viewpoint
       [Figure: intrinsic schema linked to schema A and schema B]
    38. Murmur IST Project (2000-2002)
       • A conceptual data model supporting space, time, and multirepresentation (extension of MADS)
       • A corresponding query language (multirepresentation algebra)
       • Two application cases (cartographic, risk assessment)
       • A schema editor for visual data definition (DDL)
       • A query editor for visual data manipulation (DML), including intelligent zooming and temporal travelling
       • Implementation on a commercial GIS