A Comprehensive Method for Data Warehouse Design

Published in:
5th International Workshop on Design and Management of Data Warehouses (DMDW'03), p. 1.1-1.14, Berlin (Germany), September 8 2003.

Download:
http://gplsi.dlsi.ua.es/almacenes/ver.php?pdf=48

  1. A Comprehensive Method for Data Warehouse Design
     Sergio Luján-Mora, Juan Trujillo (sergio.lujan@ua.es / @sergiolujanmora)
     Department of Software and Computing Systems
     Published in: 5th International Workshop on Design and Management of Data Warehouses (DMDW'03), p. 1.1-1.14, Berlin (Germany), September 8 2003.
     Download: http://gplsi.dlsi.ua.es/almacenes/ver.php?pdf=48
  2. Title slide: A Comprehensive Method for Data Warehouse Design. Sergio Luján-Mora, Juan Trujillo. Department of Software and Computing Systems. DMDW 2003
  3. Contents
     • Motivation
     • UML extension mechanisms
     • DW modeling schemas
     • Applying modeling schemas
     • Conclusions
     • Future Work
  4. Motivation
     • Data warehouses are complex information systems
     • They support:
       – OLAP
       – Data mining
       – Decision Support Systems
       – …
     • Building a DW is time-consuming, expensive, and prone to failure
  5. Motivation
     • Partial approaches:
       – ETL processes
       – Logical and conceptual design of the DW based on the multidimensional paradigm
       – Deriving the DW schema from the ER schemas of the data sources
       – …
     • DW methods exist, but no general model covers the different phases
  6. Motivation
     • Goal: a comprehensive method for data warehouse design
     • Principles that drive our approach:
       – Standard modeling notation → UML
       – Comprehensive → Includes the main phases of DW design
       – Powerful but easy to understand → Different levels of detail for different users (technical and final users)
       – Method → A starting point, not a rigid template
  8. UML extension mechanisms
     • UML is a general-purpose visual modeling language for systems
     • Extension mechanisms allow the user to tailor it to specific domains
     • Mechanisms:
       – Stereotypes → New building elements
       – Tagged values → New properties
       – Constraints → New semantics
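The three mechanisms above can be pictured with a small Python sketch. This is only an illustration of the idea, not part of the authors' profile: the `Stereotype` class, the `documentation` tag, and the "needs at least one dimension" rule are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Stereotype:
    """A new building element derived from an existing UML metaclass."""
    name: str
    base: str  # UML metaclass being extended, e.g. "Class" or "Package"
    tagged_values: dict = field(default_factory=dict)  # new properties
    constraints: list = field(default_factory=list)    # new semantics

    def violations(self, element: dict) -> list:
        """Return the description of every constraint the element breaks."""
        return [desc for desc, check in self.constraints if not check(element)]


# Hypothetical constraint: a fact must reference at least one dimension.
fact = Stereotype(
    name="Fact",
    base="Class",
    tagged_values={"documentation": ""},
    constraints=[
        ("needs at least one dimension", lambda e: len(e["dimensions"]) > 0),
    ],
)

ok = fact.violations({"name": "Sales", "dimensions": ["Times", "Products"]})
bad = fact.violations({"name": "Orphan", "dimensions": []})
```

Here the stereotype supplies the element kind, the tagged values carry extra properties, and the constraints are checked against each model element that uses the stereotype.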
  9. UML extension mechanisms
     [Figure: four ways of displaying a stereotyped element: Icon, Decoration, Label, None]
  11. DW modeling schemas
     [Figure: overview of the model and its schemas. Labeled elements: Operational Data Schema (ODS), ETL Process, Data Warehouse Conceptual Schema (DWCS), Exportation Process, Data Warehouse Storage Schema (DWSS), Business Model (BM), Analyze. Diagrams are windows or views into the model.]
  12. General diagram (level 0)
     • Package stereotypes: <<ODS>>, <<DWCS>>, <<DWSS>>, <<BM>>, <<ETL>>, <<Exportation>>
     • Packages in the example:
       – <<ODS>> Sales data
       – <<ODS>> Production data
       – <<ODS>> Syndicated data
       – <<ETL>> Transformations
       – <<DWCS>> Data warehouse
       – <<Exportation>> Mappings
       – <<DWSS>> Informix Metacube
       – <<DWSS>> Cognos PowerPlay
       – <<BM>> Manager
       – <<BM>> Accounting
  14. ODS
     • Operational Data Schema
     • Represents:
       – Transaction processing systems (OLTP)
       – External sources (census data, economic data, competitors' data, etc.)
     • No single UML extension exists for modeling the different types of data sources
  15. ODS
     • RDBMS → Rational's UML Profile for Database Design: <<Database>>, <<Schema>>, <<Table>>, …
     • ORDBMS → Marcos et al. UML Profile for Object-Relational Database Design: <<array>>, <<row>>, <<ref>>, …
     • XML → Rational's XML-DTD UML Profile: <<DTDElement>>, <<DTDElementEmpty>>, <<DTDEntity>>
     • …
  16. ODS example
     [Figure: class diagram of the <<ODS>> Sales data package, shown with the <<ODS>> Production data and <<ODS>> Syndicated data packages. Classes: Salesmen, Cities, Counties, States, Groups, Families, Categories, Products, Packages, Storage conditions, Discount policies, Invoices, Lines, Customers, Agents. Associations carry multiplicities such as 1, 0..n, and 1..n.]
  18. DWCS
     • Data Warehouse Conceptual Schema
     • UML Profile for Multidimensional Modeling
     • Basic components:
       – Facts: the transactions or values being analyzed
       – Dimensions: descriptive information about the facts
     • Properties:
       – Shared dimensions
       – Heterogeneous dimensions
       – Degenerate facts and dimensions
       – Multiple and alternative path classification hierarchies
       – …
  19. DWCS
     • Level 1: Model definition
     • Level 2: Star schema definition
     • Level 3: Dimension/fact definition
  20. DWCS
     • Package stereotypes:
       – StarPackage (Level 1)
       – FactPackage (Level 2)
       – DimensionPackage (Level 2)
     • Class stereotypes:
       – Fact (Level 3)
       – Dimension (Level 3)
       – Base (Level 3)
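The stereotype-to-level table above lends itself to a simple consistency check. The sketch below assumes that the design level equals the nesting depth of an element (star packages at the top, fact and dimension packages inside them, classes inside those); the model is a hypothetical nested dictionary, and the element names `Sales` and `Stray fact` are invented for the example.

```python
# Stereotype -> design level, as listed on the slide.
LEVEL = {
    "StarPackage": 1,
    "FactPackage": 2,
    "DimensionPackage": 2,
    "Fact": 3,
    "Dimension": 3,
    "Base": 3,
}


def misplaced(model, level=1):
    """Yield names of elements whose stereotype does not match their depth."""
    for element in model:
        if LEVEL[element["stereotype"]] != level:
            yield element["name"]
        yield from misplaced(element.get("contents", []), level + 1)


model = [{
    "name": "Sales schema", "stereotype": "StarPackage",
    "contents": [
        {"name": "Sales fact", "stereotype": "FactPackage",
         "contents": [{"name": "Sales", "stereotype": "Fact"}]},
        {"name": "Customers dimension", "stereotype": "DimensionPackage",
         "contents": [{"name": "Customers dim", "stereotype": "Dimension"},
                      {"name": "Cities", "stereotype": "Base"}]},
        # A class placed directly inside a star package (level 2) is an error.
        {"name": "Stray fact", "stereotype": "Fact"},
    ],
}]

errors = list(misplaced(model))
```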
  21. Model definition (level 1)
     [Figure: three <<StarPackage>> packages: Production schema, Sales schema, Salesmen schema]
  22. Star schema definition (level 2)
     [Figure: <<FactPackage>> and <<DimensionPackage>> packages of a star schema: Sales fact, Stores dimension, Times dimension, Products dimension, Customers dimension; shown alongside the level-1 packages Production schema, Sales schema, and Salesmen schema]
  23. Dimension/fact definition (level 3)
     [Figure: <<Fact>>, <<Dimension>>, and <<Base>> classes. The Customers dimension contains Customers dim, Customers, ZIPs, Cities, and States, connected by associations with +child and +parent roles and multiplicities 1 and 0..n; shown alongside the level-1 and level-2 packages]
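A classification hierarchy of +child/+parent associations is what makes roll-up possible. The sketch below assumes the hierarchy runs Customers → ZIPs → Cities → States (the exact associations are not recoverable from the transcript), and all member names and sales figures are invented sample data.

```python
from collections import defaultdict

# Assumed order of hierarchy levels, from most to least detailed.
ORDER = ["Customers", "ZIPs", "Cities", "States"]

# Hypothetical members: each level maps a child member to its parent member.
PARENT = {
    "Customers": {"Ann": "62701", "Bob": "62701", "Eve": "60601"},
    "ZIPs": {"62701": "Springfield", "60601": "Chicago"},
    "Cities": {"Springfield": "Illinois", "Chicago": "Illinois"},
}


def roll_up(totals, from_level, to_level):
    """Aggregate totals keyed by from_level members up to to_level members."""
    start, end = ORDER.index(from_level), ORDER.index(to_level)
    for level in ORDER[start:end]:
        merged = defaultdict(float)
        for member, value in totals.items():
            merged[PARENT[level][member]] += value
        totals = dict(merged)
    return totals


sales = {"Ann": 10.0, "Bob": 5.0, "Eve": 7.0}
by_city = roll_up(sales, "Customers", "Cities")
by_state = roll_up(sales, "Customers", "States")
```

Each pass through the loop follows one +parent link, so rolling up by two levels simply applies the child-to-parent mapping twice.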
  25. DWSS
     • Data Warehouse Storage Schema
     • Depends on the implementation (RDBMS, ORDBMS, MD, …) → Similar to the ODS
     • Two possibilities: manual or automatic
  27. BM
     • Business Model
     • Adapts the DW to final users:
       – Easier to understand
       – Security concerns
       – …
     • UML importing mechanism → Different submodels of the DWCS
  28. BM example
     [Figure: the <<DWCS>> Data warehouse package contains Production schema, Sales schema, and Salesmen schema; the <<BM>> Accounting package imports Sales schema (from Data warehouse)]
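The importing idea can be sketched as a business model that references a subset of the conceptual schema rather than copying it. The function name `business_model` and the package contents below are hypothetical; only the package names come from the slides.

```python
# DWCS packages from the example; the listed contents are placeholders.
dwcs = {
    "Production schema": ["Production fact"],
    "Sales schema": ["Sales fact", "Customers dimension"],
    "Salesmen schema": ["Salesmen fact"],
}


def business_model(conceptual_schema, imported):
    """Build a submodel that exposes only the imported star packages."""
    missing = [name for name in imported if name not in conceptual_schema]
    if missing:
        raise KeyError(f"not in the conceptual schema: {missing}")
    # Values are shared with the DWCS, so the submodel is a view, not a copy.
    return {name: conceptual_schema[name] for name in imported}


accounting = business_model(dwcs, ["Sales schema"])
```

Because the submodel shares its contents with the conceptual schema, a change to `Sales schema` in the DWCS is visible to every business model that imports it.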
  30. ETL Process
     • Extraction-Transformation-Loading
     • Mapping between ODS and DWCS
     • UML Profile for Modeling ETL Processes
     • Common mechanisms:
       – Integration of different data sources
       – Transformati…
       – Generation of surrogate keys
       – …
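One of the mechanisms listed above, surrogate key generation, can be sketched in a few lines. This is a generic illustration rather than the authors' design: the `SurrogateKeys` class and the sample natural keys are hypothetical.

```python
from itertools import count


class SurrogateKeys:
    """Assign a stable integer key to each (source, natural key) pair."""

    def __init__(self):
        self._next = count(1)
        self._keys = {}

    def key_for(self, source, natural_key):
        ident = (source, natural_key)
        if ident not in self._keys:
            self._keys[ident] = next(self._next)
        return self._keys[ident]


keys = SurrogateKeys()
first = keys.key_for("Sales data", "P-17")
second = keys.key_for("Production data", "P-17")  # same code, other source
again = keys.key_for("Sales data", "P-17")        # repeated lookup
```

Keying on the source as well as the natural key keeps identical codes from different operational systems apart when the sources are integrated.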
  31. ETL Process
     [Figure: icons of the ETL stereotypes: Aggregation, Conversion, Filter, Incorrect, Join, Loader, Log, Merge, Surrogate, Wrapper]
  32. ETL Process example
     [Figure: ETL diagram with the following elements]
     • Sources (from Sales data):
       – Products: IdProduct, Name, Price, Family, Storage
       – Storage conditions: IdStorage, Name, Description
     • Join into NewClass2: LeftJoin(Storage = IdStorage)
       – Name = Products.Name
       – StName = [Storage conditions].Name
       – StDescription = [Storage conditions].Description
       – NewClass2 attributes: IdProduct, Name, Price, Family, StName, StDescription
     • ProdEuro: Price = DollarToEuro(Price)
     • ProdLoader: loads ProdDescription (from Products dimension), which is associated (1 to 0..n) with Products dim
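The join and conversion steps of this example can be traced in plain Python. The sketch below follows the attribute mappings shown in the figure; the sample rows and the exchange rate are invented, and `RATE` in particular is a placeholder, not a real conversion rate.

```python
products = [
    {"IdProduct": 1, "Name": "Milk", "Price": 2.0, "Family": "Dairy", "Storage": 10},
    {"IdProduct": 2, "Name": "Salt", "Price": 4.0, "Family": "Basics", "Storage": 99},
]
storage_conditions = [
    {"IdStorage": 10, "Name": "Cold", "Description": "Keep refrigerated"},
]


def left_join(products, storage_conditions):
    """LeftJoin(Storage = IdStorage): keep every product, matched or not."""
    by_id = {s["IdStorage"]: s for s in storage_conditions}
    joined = []
    for p in products:
        s = by_id.get(p["Storage"])
        joined.append({
            "IdProduct": p["IdProduct"],
            "Name": p["Name"],                                  # Products.Name
            "Price": p["Price"],
            "Family": p["Family"],
            "StName": s["Name"] if s else None,                 # [Storage conditions].Name
            "StDescription": s["Description"] if s else None,   # [Storage conditions].Description
        })
    return joined


RATE = 0.5  # placeholder exchange rate, chosen only to keep the arithmetic exact


def dollar_to_euro(price, rate=RATE):
    return price * rate


def prod_euro(rows):
    """ProdEuro step: Price = DollarToEuro(Price)."""
    return [{**row, "Price": dollar_to_euro(row["Price"])} for row in rows]


rows = prod_euro(left_join(products, storage_conditions))
```

The second product has no matching storage condition, so the left join keeps it with empty `StName` and `StDescription` instead of dropping it.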
  34. Exportation Process
     • Mapping between DWCS and DWSS
     • Two possibilities: manual or automatic
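As a rough illustration of the automatic option, the sketch below derives a relational table definition from a conceptual dimension. It is not the authors' exportation process: the naming rule, the `_key` column, and the attribute types are assumptions, and SQLite stands in for the storage platform only to check that the generated statement is valid.

```python
import sqlite3


def export_dimension(name, attributes):
    """Generate a CREATE TABLE statement for one conceptual dimension."""
    table = name.lower().replace(" ", "_")
    columns = [f"  {table}_key INTEGER PRIMARY KEY"]  # assumed surrogate key
    columns += [f"  {attr} {sql_type}" for attr, sql_type in attributes.items()]
    return f"CREATE TABLE {table} (\n" + ",\n".join(columns) + "\n);"


ddl = export_dimension(
    "Customers", {"name": "TEXT", "city": "TEXT", "state": "TEXT"}
)

# Run the generated DDL against an in-memory database to validate it.
conn = sqlite3.connect(":memory:")
conn.execute(ddl)
columns = [row[1] for row in conn.execute("PRAGMA table_info(customers)")]
```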
  40. Conclusions
     • Global DW design method
     • Main advantages:
       – Same standard notation (UML) throughout
       – Integration of the different design phases in a single, coherent framework
       – Scales up to handle large and complex DWs
     • CASE tool support with Rational Rose → Add-in
  42. Future work
     • Data mapping at attribute level
     • Diagramming and style guidelines for creating better diagrams
     • More stages of the DW life cycle (e.g., refresh processes)