Dw design 1_dim_facts


Dimensional modeling


  1. DATA WAREHOUSING: Multi-Dimensional Data Modeling. Facts and Dimensions
  3. While an entity-relationship modeling approach from relational database design could be used, the dimensional modeling approach to logical design is more often used for a data warehouse.
  4. End users cannot understand, remember, or navigate an E/R model (not even with a GUI). One reason is that an enterprise-level ERM would be too complex to understand.
  5. Software cannot usefully query an E/R model.
  6. E/R modeling does not meet the purpose of a DW: intuitive, high-performance querying.
  7. [Diagram: star schema. The fact table Sales_Fact (TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey, $ ...) is surrounded by the dimension tables Employee_Dim (EmployeeKey, EmployeeID, ...), Time_Dim (TimeKey, TheDate, ...), Product_Dim (ProductKey, ProductID, ...), Shipper_Dim (ShipperKey, ShipperID, ...), and Customer_Dim (CustomerKey, CustomerID, ...).]
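The star schema on this slide can be sketched as table definitions. The following is a minimal illustration using Python's built-in sqlite3 module, not the deck's own implementation. The table and key names come from the slide; the column types and the name `SalesAmount` for the "$" measure are assumptions, and the elided columns ("...") are left out.

```python
import sqlite3

# Table and key names follow the slide; types and SalesAmount are assumptions.
DDL = """
CREATE TABLE Time_Dim     (TimeKey     INTEGER PRIMARY KEY, TheDate    TEXT);
CREATE TABLE Employee_Dim (EmployeeKey INTEGER PRIMARY KEY, EmployeeID TEXT);
CREATE TABLE Product_Dim  (ProductKey  INTEGER PRIMARY KEY, ProductID  TEXT);
CREATE TABLE Customer_Dim (CustomerKey INTEGER PRIMARY KEY, CustomerID TEXT);
CREATE TABLE Shipper_Dim  (ShipperKey  INTEGER PRIMARY KEY, ShipperID  TEXT);

CREATE TABLE Sales_Fact (
    TimeKey     INTEGER NOT NULL REFERENCES Time_Dim(TimeKey),
    EmployeeKey INTEGER NOT NULL REFERENCES Employee_Dim(EmployeeKey),
    ProductKey  INTEGER NOT NULL REFERENCES Product_Dim(ProductKey),
    CustomerKey INTEGER NOT NULL REFERENCES Customer_Dim(CustomerKey),
    ShipperKey  INTEGER NOT NULL REFERENCES Shipper_Dim(ShipperKey),
    SalesAmount REAL    NOT NULL,  -- the "$" measure (assumed column name)
    PRIMARY KEY (TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey)
);
"""


def build_star_schema():
    """Create the star schema in a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(DDL)
    return conn


conn = build_star_schema()
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

Every dimension table has a single-column key, and the fact table references each of them, which gives the star shape: one central table with one join path to each dimension.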
  8. Several distinct dimensions, combined with facts, enable you to answer business questions. [Diagram: a fact table holding the measures (Units, $) linked to the Geographic, Product, and Time dimension tables.]
  9. Dimensions: They are normally textual descriptions of the business.
  10. Dimensions: Dimension tables contain relatively small amounts of relatively static data.
  11. Dimensions: A dimension table is usually not normalized.
  12. Dimensions: Independent of each other, not hierarchically related.
  13. Dimensional attributes (non-key attributes) help to describe the dimensional value.
  14. Facts: Facts are (usually numerical) measures of the business.
  15. Facts: The fact table is the largest table in the star schema and holds large volumes of data.
  16. Facts: The fact table is (often) normalized.
  17. Facts: The fact table has a composite primary key made up of foreign keys (PK = FKi, that is, the combination of all the foreign keys FKi).
  18. 18. Facts fact table usually contains one or more numerical facts that occur for the combination of keys that define each record measures18
  19. 19. Facts A fact table contains either detail-level facts or facts that have been aggregated (summary tables) Σ19
  20. 20. Facts Facts are:  additive  semi-additive  non-additive20
  21. 21. Facts Non-additive facts cannot be added at all.  An example of this is averages. Semi-additive facts can be aggregated along some of the dimensions and not along others:  current_Balance is a semi-additive fact as it makes sense to add them up for all accounts (whats the total current balance for all accounts in the bank?) but it does not make sense to add them up through time (adding up all current balances for a given account for each day of the month does not give us any useful information The most useful measures are: Numeric, Additive21
  22. The atomic level of data of the business process: a definition of the highest level of detail that is supported in a data warehouse.
  23. A fact table usually contains facts with the same level of aggregation. A proper dimensional design allows only facts of a uniform grain (the same dimensionality) to coexist in a single fact table.
  24. Some perfectly good fact tables represent measurements that have no facts! This kind of measurement is often called an event. The classic example of such a factless fact table is a record representing a student attending a class on a specific day. The dimensions are Day, Student, Professor, Course, and Location, but there are no obvious numeric facts. The tuition paid and grade received are good facts, but not at the grain of the daily attendance.
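A factless fact table is still useful because its rows can be counted. The sketch below models the attendance example with sqlite3; the table name `Attendance_Fact`, the key column names, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Only dimension keys, no measure columns: each row records one attendance event.
CREATE TABLE Attendance_Fact (
    DayKey       INTEGER NOT NULL,
    StudentKey   INTEGER NOT NULL,
    ProfessorKey INTEGER NOT NULL,
    CourseKey    INTEGER NOT NULL,
    LocationKey  INTEGER NOT NULL,
    PRIMARY KEY (DayKey, StudentKey, ProfessorKey, CourseKey, LocationKey)
);

-- Invented sample events.
INSERT INTO Attendance_Fact VALUES
    (1, 100, 7, 500, 9),
    (1, 101, 7, 500, 9),
    (2, 100, 7, 500, 9),
    (1, 100, 8, 501, 9);
""")

# The "measure" is simply the number of events per course.
attendance_per_course = conn.execute(
    "SELECT CourseKey, COUNT(*) FROM Attendance_Fact "
    "GROUP BY CourseKey ORDER BY CourseKey"
).fetchall()
print(attendance_per_course)
```

Counting rows plays the role that summing a numeric measure plays in an ordinary fact table.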
  25. Dimensions without attributes (such as a transaction number or order number): put the attribute value into the fact table, even though it is not an additive fact.
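As a sketch of this idea, the order number below is stored directly in the fact table and no separate order dimension table exists. The table `Order_Line_Fact`, its columns, and the sample order numbers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Order_Line_Fact (
    OrderNumber TEXT    NOT NULL,  -- dimension without attributes: kept in the fact table
    ProductKey  INTEGER NOT NULL,
    Quantity    INTEGER NOT NULL,
    Amount      REAL    NOT NULL,
    PRIMARY KEY (OrderNumber, ProductKey)
);

-- Invented sample order lines.
INSERT INTO Order_Line_Fact VALUES
    ('SO-1001', 1, 2, 20.0),
    ('SO-1001', 2, 1,  5.0),
    ('SO-1002', 1, 1, 10.0);
""")

# The order number is never summed, but it groups the lines of one order.
order_totals = dict(
    conn.execute(
        "SELECT OrderNumber, SUM(Amount) FROM Order_Line_Fact GROUP BY OrderNumber"
    )
)
print(order_totals)
```

A separate table for the order number would contain only its key, so keeping the value in the fact row saves a join without losing information.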
  27. The fact table provides statistics for sales broken down by the product, time, employee, shipper, and customer dimensions. [Diagram: star schema. Sales_Fact holds a multipart key made of the dimensional keys TimeKey, EmployeeKey, ProductKey, CustomerKey, and ShipperKey, plus the measures ($ ...). It is surrounded by Employee_Dim (EmployeeKey, EmployeeID, ...), Time_Dim (TimeKey, TheDate, ...), Product_Dim (ProductKey, ProductID, ...), Shipper_Dim (ShipperKey, ShipperID, ...), and Customer_Dim (CustomerKey, CustomerID, ...).]
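"Broken down by" translates into a join from the fact table to a dimension followed by a GROUP BY on a dimension attribute. The sketch below uses only two of the five dimensions to stay short; `ProductName`, `SalesAmount`, the helper `sales_by`, and all sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Time_Dim    (TimeKey INTEGER PRIMARY KEY, TheDate TEXT);
CREATE TABLE Product_Dim (ProductKey INTEGER PRIMARY KEY, ProductID TEXT, ProductName TEXT);
CREATE TABLE Sales_Fact (
    TimeKey     INTEGER NOT NULL REFERENCES Time_Dim(TimeKey),
    ProductKey  INTEGER NOT NULL REFERENCES Product_Dim(ProductKey),
    SalesAmount REAL    NOT NULL,
    PRIMARY KEY (TimeKey, ProductKey)
);

-- Invented sample rows.
INSERT INTO Time_Dim    VALUES (1, '2024-01-05'), (2, '2024-01-06');
INSERT INTO Product_Dim VALUES (1, 'P-1', 'Widget'), (2, 'P-2', 'Gadget');
INSERT INTO Sales_Fact  VALUES (1, 1, 10.0), (1, 2, 5.0), (2, 1, 7.5);
""")

# One fixed query per breakdown; no user input is interpolated into the SQL.
QUERIES = {
    "product": """
        SELECT p.ProductName, SUM(f.SalesAmount)
        FROM Sales_Fact AS f
        JOIN Product_Dim AS p ON p.ProductKey = f.ProductKey
        GROUP BY p.ProductName ORDER BY p.ProductName
    """,
    "date": """
        SELECT t.TheDate, SUM(f.SalesAmount)
        FROM Sales_Fact AS f
        JOIN Time_Dim AS t ON t.TimeKey = f.TimeKey
        GROUP BY t.TheDate ORDER BY t.TheDate
    """,
}


def sales_by(dimension):
    """Return total sales grouped by the chosen dimension attribute."""
    return conn.execute(QUERIES[dimension]).fetchall()


print(sales_by("product"))
print(sales_by("date"))
```

Both queries have the same shape: the fact table supplies the measure, and the dimension supplies the label to group by.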
  29. Step 1: Choose the data mart for the small group of end users we deal with. Choose a business process to model, e.g., orders, invoices, etc.
  30. Step 2: Determine the fact table granularity (the smallest defined level of data in the table).
  31. Step 3: Select the fact table dimensions. Choose the dimensions that will apply to each fact table record. Add dimensions for "everything you know" about this grain.
  32. Step 4: Determine the facts for the table. In most cases, the granularity is at the transaction level, so the fact is the amount. Choose the measure that will populate each fact table record. Add numeric measured facts true to the grain.
  33. Reference: Ralph Kimball and Margy Ross. The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling. Second Edition.