Le Data Warehousing: challenge ou mode ?

  • 16 August 2010: The user has a perception of the real world that is centred on their own application. The data are described for that user's needs... From there: recognition and structuring...
  • 16 August 2010: CLASS = the set of objects perceived as having similar characteristics. They will have the same type. Lisa, Zoe, Dylan ... are grouped in the class Person. They form the population of the class. Properties: name, first name, age, height, eye colour. Links: Zoe MARRIED to Dylan, Zoe WIFE of Dylan, and the inverse...
  • 16 August 2010: Entity type (TE) Employee: 2 simple, monovalued, mandatory attributes (n°E and name); 1 simple, multivalued, mandatory attribute (first names); 2 complex, multivalued attributes (CV and positions), of which CV is optional and positions is mandatory. NB: defining an attribute (or a role) as mandatory induces a constraint on the creation of the corresponding instances. The creation of an instance can be accepted only if all mandatory attributes (roles) receive a value at creation time.

Presentation Transcript

  • From Reality to Databases: a One-to-Many Relationship
    • Stefano Spaccapietra, Database Laboratory, Swiss Federal Institute of Technology Lausanne (EPFL)
    • Joint work with Christine PARENT & Christelle VANGENOT
      • http://lbd.epfl.ch
  • Outline
    • Database design essentials
    • Multiple representation
    • Design alternatives
  • Database Terminology
    • Database design (data modeling) is the activity of elaborating a formal representation of the relevant information about the subset of the real world that is of interest to the users (applications) of the data.
    • The outcome of the database design process is the schema of the database.
    • The formalism used to express the schema is a data model.
    Database design essentials
  • Data Model
    • A data model is a set of concepts and rules.
    • Relational data model: table/relation, attribute/column, tuple/row, primary key, foreign key, …
    • Entity-Relationship data model: entity, entity type, relationship, relationship type, attribute, role, cardinality, identifier, …
    • Object-oriented data model: object, class, attribute, reference attribute, is-a hierarchy, inheritance, …
  • Evolution of Data Models
    • [Figure: data models plotted against expressive power; labels: Codasyl, Relational, Object Oriented, ER, Extended ER, UML, ODMG, Spatio-Temporal, Multi-representation]
  • Database Design: the Analysis Phase
    • A database is a representation of that part of reality we are interested in.
    • [Figure: Real World, with the steps perception, recognition, structuring]
  • Database Design: the Definition Phase
    • [Figure: the description step]
    • Example: Jean is a young man. He is married to Arlette, and owns a green Honda CRV.
  • Fundamental Abstraction: Classification
    • From reality to representation: abstracting from details to think in more generic terms, e.g. in terms of object classes rather than individual objects.
    • Individual objects: Lisa, Fred, Dylan, Anne, Zoë, ...
    • Object class: Person
      • properties: family name, first name, age, ...
  • The Database Schema
    • A schema is a collection of types.
    • The database will store instances of these types.
    • An instance is a set of values taken by the properties attached to the type.
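    • [Figure: schema with the object types Person and Car and the relationship types Owns and Married-to]
  • Schema and Instances
    • [Figure: object types Person and House linked by the relationship type Owns, with cardinalities 0:n and 1:1]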
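The split between schema (types) and instances, and the effect of the cardinalities on Owns, can be sketched in Python. This is only an illustration, not the formalism used in the slides: the attribute names, the `check_cardinality` helper, the sample data, and the reading that a Person owns 0:n houses while a House has exactly one owner (1:1) are all assumptions.

```python
from collections import Counter
from dataclasses import dataclass


# Schema level: two object types (attribute names are assumed).
@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class House:
    address: str


def check_cardinality(houses, owns):
    """Return the houses violating the assumed 1:1 cardinality on the House role.

    `owns` is a list of (Person, House) pairs, i.e. instances of the
    relationship type Owns. The Person role is assumed to be 0:n, so any
    number of links per person is accepted and persons are not checked.
    """
    links_per_house = Counter(house for _, house in owns)
    return [h for h in houses if links_per_house[h] != 1]


# Instance level: values stored for the types above (hypothetical data).
jean, arlette = Person("Jean"), Person("Arlette")
h1, h2 = House("1 Main St"), House("2 Main St")
owns = [(jean, h1), (jean, h2)]  # Arlette owns no house: allowed by 0:n

violations = check_cardinality([h1, h2], owns)
print(violations)
```

With this data every house has exactly one owner, so no violation is reported; removing one of the pairs would leave a house without an owner and flag it.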
  • Attributes of an Object Type
    • Attribute kinds shown: atomic, mandatory, monovalued; complex, optional, multivalued
    • [Figure: object type Employee with the attribute labels Emp#, Ename, telephones, academic-achievements, positions, degree, year, title, start-date, end-date, salaries, date, amount, year, month]
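A rough Python sketch of these attribute kinds follows. The nesting of the Employee attributes and which of them are optional are not recoverable from the transcript, so the structure below is an assumption: `Position` is treated as a complex attribute with components title, start-date and end-date, `telephones` as atomic and multivalued, and `positions` as mandatory (at least one value).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Position:
    """Complex attribute: it is itself made of component attributes."""
    title: str
    start_date: str
    end_date: Optional[str] = None  # assumed optional


@dataclass
class Employee:
    emp_no: int   # atomic, mandatory, monovalued
    ename: str    # atomic, mandatory, monovalued
    telephones: List[str] = field(default_factory=list)      # atomic, multivalued
    positions: List[Position] = field(default_factory=list)  # complex, multivalued

    def __post_init__(self):
        # A mandatory attribute must receive a value when the instance is created.
        if self.emp_no is None or not self.ename:
            raise ValueError("Emp# and Ename are mandatory")
        if not self.positions:
            raise ValueError("positions is assumed mandatory: give at least one value")


ok = Employee(1, "Ann", ["555-0100"], [Position("Engineer", "2001-01-01")])
print(ok.positions[0].title)
```

Creating `Employee(2, "Bob")` without any position raises `ValueError`, which mirrors the idea that an instance is rejected unless all mandatory attributes are valued at creation.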
  • Example of an ER schema
    • [Figure: ER schema with entity types (E) Department, Item, Employee, Supplier and relationship types (R) Boss-of (roles boss, subord.), Assigned-to, Sells, Delivery; attribute labels: Dname, floor, quantity, Iname, type, name, salary, Sname, address, quantity]
  • Non-determinism in Database Design
    • A database design is about choosing a representation
    • The outcome is a partial, subjective, unfaithful description
    • How do we introduce flexibility to support different ways of abstracting a representation from reality?
  • Multiple Classification
    • [Figure: the same Car classified as Vintage Car, Collectible, Transport Means, Vehicle, Land Vehicle, Ford, Imported Good, Movie Accessory]
    Multiple representation
  • Multiple Viewpoints
    • [Figure: a ROAD seen from the cartographer viewpoint, the construction engineer viewpoint and the traffic manager viewpoint]
  • Multiple Spatial Resolution
    • [Figure: maps at 1:25'000 scale and at 1:50'000 scale]
  • Multidimensional Representation Space
    • Dimensions: Classification, Space granularity, Viewpoint, Time, Time granularity, ...
    • Example: two representations of the same object in the same viewpoint at two different resolution levels
  • A Snapshot Database
    • [Figure: representation space with the dimensions Classification, Time, Viewpoint]
  • A Map
    • [Figure: representation space with the dimensions Classification, Space granularity, Viewpoint; example: Road Network at 1:100'000 resolution]
  • Classification Dimension
    • [Figure: the sets persons, students, faculties, technicians, secretaries]
    • Current Status: refinement hierarchies
    • [Figure: Is-a hierarchy over Person, Student, Employee, Faculty, Technician, Secretary]
  • Limitation: Roles
    • [Figure: the sets persons, companies and car-owners; classes Person, Company and Car-owner, with the intersection classes Person-with-car and Company-with-car]
    • Partition constraint:
      • Car-owner = Person-with-car ∪ Company-with-car
      • Person-with-car ∩ Company-with-car = Ø
  • A More Direct Representation
    • Intersection link
    • [Figure: Car-owner related to Person and Company, labelled OR IS-A on one side and with two MAY-BE-A links + partition constraint on the other]
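The partition constraint on the intersection classes can be checked mechanically. The sketch below reads it as "Car-owner is the union of Person-with-car and Company-with-car, and those two classes are disjoint"; that reading, the function name and the sample populations are assumptions made for illustration.

```python
def check_car_owner_partition(car_owners, persons_with_car, companies_with_car):
    """True if the two intersection classes partition Car-owner.

    Covering:  Car-owner equals the union of the two classes.
    Disjoint:  no object is both a Person-with-car and a Company-with-car.
    """
    covering = car_owners == (persons_with_car | companies_with_car)
    disjoint = persons_with_car.isdisjoint(companies_with_car)
    return covering and disjoint


# Hypothetical populations, given as sets of object identifiers.
persons_with_car = {"Lisa", "Dylan"}
companies_with_car = {"ACME"}
car_owners = {"Lisa", "Dylan", "ACME"}

print(check_car_owner_partition(car_owners, persons_with_car, companies_with_car))
```

A MAY-BE-A link can be thought of in the same terms: an instance of Person may, but need not, also be an instance of Car-owner.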
  • Viewpoint Dimension
    • Relational DBMS support (mostly non-updatable) views, but semantics is poor
    • Object-oriented DBMS have rich semantics but poor view mechanisms
    • Object-relational DBMS: ?
    • Object-oriented expressiveness augmented with intersection links, roles and revised inheritance rules will provide the best solution
  • Space Granularity: Multi-resolution
    • Cartographic Generalization is costly:
      • -> store the result for reuse
      • How do we express the links between different representations?
      • -> update propagation
  • Multiple Geometries for the Same Object
    • One possible solution: stamping spatial attributes with the spatial resolution
    • Spatial integrity constraints:
      • Sinuosity(River.geometry[2]) = Sinuosity(River.geometry[1])
      • Length(River.geometry[2]) = Length(River.geometry[1])
    • [Figure: River described as an area or as a line, at Resolution Level 1 and Resolution Level 2; object type River with a multi-representation (mr) attribute geo, marked M]
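As a minimal sketch of resolution-stamped geometries and the two integrity constraints, the Python below stores one polyline per resolution level and compares length and sinuosity. Several things are assumptions: geometries are plain lists of (x, y) points (area geometries are not handled), sinuosity is taken as path length divided by the straight-line distance between the end points, and equality is tested with a small relative tolerance.

```python
import math


def length(polyline):
    """Sum of the segment lengths of a polyline given as (x, y) points."""
    return sum(math.dist(p, q) for p, q in zip(polyline, polyline[1:]))


def sinuosity(polyline):
    """Path length divided by end-to-end distance (undefined for closed lines)."""
    return length(polyline) / math.dist(polyline[0], polyline[-1])


def satisfies_constraints(geometry, rel_tol=1e-9):
    """Check the two constraints between resolution levels 1 and 2."""
    g1, g2 = geometry[1], geometry[2]
    return (math.isclose(sinuosity(g2), sinuosity(g1), rel_tol=rel_tol)
            and math.isclose(length(g2), length(g1), rel_tol=rel_tol))


# River.geometry stamped with the resolution level (hypothetical coordinates).
river_geometry = {
    1: [(0, 0), (3, 4), (6, 8)],  # detailed line
    2: [(0, 0), (6, 8)],          # generalized line, collinear vertex dropped
}
bent_river_geometry = {
    1: [(0, 0), (3, 4), (6, 0)],  # a real bend
    2: [(0, 0), (6, 0)],          # bend removed by generalization
}

print(satisfies_constraints(river_geometry))
print(satisfies_constraints(bent_river_geometry))
```

The first pair passes because dropping a collinear vertex changes neither measure; the second fails because removing the bend shortens the line.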
  • Multiple Abstraction Levels: Reformulation
    • Replacing a group of objects with a new object
      • Example: a set of buildings close to each other is replaced with a built-up area
  • Aggregation
    • Grouping of objects according to semantic and spatial relationships
      • e.g., a set of buildings and adjacent fields belonging to the same farmer grouped into a single object Farm
    • Derivation rules:
      • Farm.geometry = SpatialUnion(Field.geometry, Building.geometry)
    • Aggregation constraint:
      • the fields and the buildings composing the same farm must belong to the same farmer and the fields must be adjacent.
    • [Figure: Farm linked to Field and to Building by two Composed relationships, each with cardinality 1,n]
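The derivation rule and the aggregation constraint can be sketched as follows. To stay self-contained, a geometry is modelled here as a set of unit grid cells, so the spatial union is a plain set union and two fields are adjacent when they share a cell edge. That model, the `LandObject` type, the helper names, and the reading of "the fields must be adjacent" as "the fields form one connected group" are assumptions, not part of the slides.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Cell = Tuple[int, int]


@dataclass(frozen=True)
class LandObject:
    """Hypothetical stand-in for a Field or a Building."""
    owner: str
    geometry: FrozenSet[Cell]


def adjacent(a, b):
    """True if some cell of a shares an edge with some cell of b."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return any((x + dx, y + dy) in b for (x, y) in a for dx, dy in steps)


def fields_connected(fields):
    """True if every field can be reached from the first one via adjacency."""
    reached, frontier = {0}, [0]
    while frontier:
        i = frontier.pop()
        for j in range(len(fields)):
            if j not in reached and adjacent(fields[i].geometry, fields[j].geometry):
                reached.add(j)
                frontier.append(j)
    return len(reached) == len(fields)


def farm_geometry(farmer, fields, buildings):
    """Derive Farm.geometry after checking the aggregation constraint."""
    if not fields or not buildings:
        raise ValueError("cardinality 1,n: at least one field and one building")
    if any(part.owner != farmer for part in fields + buildings):
        raise ValueError("all parts must belong to the same farmer")
    if not fields_connected(fields):
        raise ValueError("the fields must be adjacent")
    return frozenset().union(*(part.geometry for part in fields + buildings))


f1 = LandObject("Fred", frozenset({(0, 0), (1, 0)}))
f2 = LandObject("Fred", frozenset({(2, 0)}))
barn = LandObject("Fred", frozenset({(0, 1)}))
farm = farm_geometry("Fred", [f1, f2], [barn])
print(sorted(farm))
```

With a real GIS library the set union would be replaced by a geometric union and the edge test by a topological "touches" predicate.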
  • Cartographic Approximation
    • No 1-1 or n-1 mapping between ground and cartographic buildings
    • N-m relationship
    • Example: 5 ground buildings (1,2,3,4,5) represented by 3 cartographic buildings (a,b,c)
      • t = ( {1,2,3,4,5} , {a,b,c} )
    • A ground building can participate in 0 or 1 typify relationship
    • [Figure: Ground Building and Cartographic Building linked by the relationship typify]
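An n-m typify relationship links a set of ground buildings to a set of cartographic buildings. The sketch below represents each relationship instance as a pair of sets and checks the participation rule (each ground building in 0 or 1 typify relationship). The function name and the second, conflicting instance are hypothetical.

```python
def check_typify(relationships):
    """True if no ground building takes part in more than one typify instance.

    Each instance is a pair (ground_buildings, cartographic_buildings).
    """
    seen = set()
    for ground, _cartographic in relationships:
        if seen & ground:      # some ground building was already typified
            return False
        seen |= ground
    return True


# The instance from the slide: 5 ground buildings shown as 3 cartographic ones.
t = (frozenset({1, 2, 3, 4, 5}), frozenset({"a", "b", "c"}))

# A hypothetical second instance that reuses ground building 5.
t_conflict = (frozenset({5, 6}), frozenset({"d"}))

print(check_typify([t]))
print(check_typify([t, t_conflict]))
```

Ground buildings that appear in no instance are simply not represented on the map, which matches the lower bound of 0.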
  • Topological Relationships
    • At resolution level 1, the road is adjacent to the embankment.
    • At resolution level 2, the embankment is no longer represented. The road is seen as adjacent to the building.
    • [Figure: Level 1 and Level 2 views; Embankment and Road linked by Near, each marked M]
  • Hierarchical value domains
    • Describe the same property at different abstraction levels
      • Hierarchical value domains for attributes
      • (similar to classification hierarchies for objects)
    • [Figure: value hierarchy rooted at cultivated area: flower (rose, iris, carnation), cereal (corn, barley), oleaginous (rape, sunflower)]
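A hierarchical value domain can be sketched as a child-to-parent mapping, so that the same attribute value can be reported at a coarser abstraction level. The grouping of the values under flower, cereal and oleaginous is inferred from the labels of the figure, and the `PARENT` table and both helper functions are illustrative names, not part of the slides.

```python
# Child -> parent links of the assumed land-use value hierarchy.
PARENT = {
    "rose": "flower", "iris": "flower", "carnation": "flower",
    "corn": "cereal", "barley": "cereal",
    "rape": "oleaginous", "sunflower": "oleaginous",
    "flower": "cultivated area",
    "cereal": "cultivated area",
    "oleaginous": "cultivated area",
}


def generalize(value, levels=1):
    """Move a value up the hierarchy by `levels` steps, stopping at the root."""
    for _ in range(levels):
        value = PARENT.get(value, value)
    return value


def is_a(value, ancestor):
    """True if `ancestor` is `value` itself or lies above it in the hierarchy."""
    while value != ancestor:
        if value not in PARENT:
            return False
        value = PARENT[value]
    return True


print(generalize("corn"))
print(generalize("corn", levels=2))
print(is_a("rose", "flower"), is_a("rose", "cereal"))
```

A coarse-resolution representation could then store `generalize(landuse)` while the detailed one keeps the leaf value.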
  • Multidimensional Representation Space
    • Dimensions: Classification, Space granularity, Viewpoint, Time granularity
    • How is the representation space
      • presented to users?
      • implemented in databases?
    Design alternatives
  • Possible Design Architectures
    • One single schema
    • One schema per (combination of) coordinate(s) on dimension(s)
    • One schema per …… with an intrinsic schema
  • A Single Schema
    • [Figure: one schema holding all representations; labels: Parcel, Parcel/use, Parcel/owner, agr/use, agr/owner, owner, landuse, Plot, Building (M), Cartographic building, Castle, Road (M), Bridge, composed, Typify, along, near, on/under]
  • A Multi-resolution Schema Per Viewpoint
    • [Figure: two schemas, Viewpoint 1 and Viewpoint 2; labels: Parcel, Parcel/use, Parcel/owner, agr/use, agr/owner, owner, landuse, Plot, Building (M), Cartographic building, Castle, Road (M), Bridge, composed, Typify, along, near, on/under]
  • A Schema Per Viewpoint and Resolution
    • [Figure: four schemas (Viewpoint A Resolution 1, Viewpoint A Resolution 2, Viewpoint B Resolution 1, Viewpoint B Resolution 2) linked by inter-schema correspondences]
  • A Schema Per Resolution and Viewpoint
    • [Figure: separate schemas; labels: Parcel, Parcel/use, Parcel/owner, agr/use, agr/owner, Plot, Building, Cartographic building, Castle, Road, Bridge, composed, near, on, On / under]
  • An Intrinsic Schema
    • Intrinsic schema: description of real world entities independently of any viewpoint
    • [Figure: Intrinsic schema, schema A, schema B]
  • Murmur IST Project (2000-2002)
    • A conceptual data model supporting space, time, and multirepresentation (extension of MADS)
    • A corresponding query language (multirepresentation algebra)
    • Two application cases (cartographic, risk assessment)
    • A schema editor for visual data definition (DDL)
    • A query editor for visual data manipulation (DML), including intelligent zooming and temporal travelling
    • Implementation on a commercial GIS