2. Introduction
• To meet the “Deliver More With Less” challenge, we need to leverage
all resources, especially data
• Data never lies and can serve as a Historical Model that is treated
as a Foundation
• The time-proven method is to use Dimensional Data structures
• Organizations often struggle to develop Dimensional Models that
consistently meet business needs, due to a lack of knowledge and
experience
3. What is Dimension?
• A set of Attributes that describe the same
Structural Thing
• Represents a Business Perspective
• E.g. Date, Customer, Product, Salesman
4. How would we interpret the statement below?
• Sales Report by Store
– A Report stating the Sales Figures measured for
each Store
– Store is the Dimension
– Sales Figures is the Measure (Fact / Metric)
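To make the Dimension / Measure split concrete, here is a minimal Python sketch of a "Sales Report by Store". The store names, amounts, and the function name sales_report_by_store are made up for illustration and are not taken from any real system.

```python
from collections import defaultdict

# Hypothetical fact rows: "store" is the Dimension, "sales_amount" is the Measure.
sales_facts = [
    {"store": "Central", "sales_amount": 1200.0},
    {"store": "Mong Kok", "sales_amount": 800.0},
    {"store": "Central", "sales_amount": 300.0},
]


def sales_report_by_store(facts):
    """Sum the Measure (sales_amount) for each value of the Dimension (store)."""
    totals = defaultdict(float)
    for row in facts:
        totals[row["store"]] += row["sales_amount"]
    return dict(totals)


report = sales_report_by_store(sales_facts)
for store, total in sorted(report.items()):
    print(f"{store}: {total:.2f}")
```

Grouping by a different Dimension (for example Salesman) would only change the key used inside the loop; the Measure stays the same.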
5. Degenerate Dimension
• A dimension key in the Fact Table that does not have its
own Dimension Table, because all the relevant attributes
have already been placed in existing Dimensions such as
Salesman, Customer, Transaction Date
• Contains no attributes of its own and hence does not join to
any Dimension Table
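A small sketch may help here. It assumes an invoice number as the degenerate dimension, which is a common textbook example and is not stated on the slide; all field names and values are hypothetical.

```python
from collections import Counter

# Hypothetical fact rows. customer_key and date_key join to Dimension Tables;
# invoice_no has no Dimension Table of its own (a degenerate dimension).
fact_sales = [
    {"invoice_no": "INV-001", "customer_key": 1, "date_key": 20090122, "amount": 100.0},
    {"invoice_no": "INV-001", "customer_key": 1, "date_key": 20090122, "amount": 50.0},
    {"invoice_no": "INV-002", "customer_key": 2, "date_key": 20090123, "amount": 75.0},
]

# The degenerate dimension is still useful for grouping:
# here it counts the line items on each invoice.
lines_per_invoice = Counter(row["invoice_no"] for row in fact_sales)
```

The key stays in the Fact Table because everything else we might want to know about the invoice (who, when) is already reachable through the other dimension keys.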
6. Role-playing Dimension
• A single Dimension is often reused in several roles
within the same database.
• E.g. the Date Dimension can be used for Invoice Date as well
as Order Date or Date of Birth
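The idea can be sketched in Python as one Date Dimension referenced through two different foreign keys. The YYYYMMDD integer key, the field names, and the sample fact row are assumptions made for illustration only.

```python
from datetime import date

# One physical Date Dimension, keyed by a hypothetical integer date_key (YYYYMMDD).
date_dim = {
    20090122: {"full_date": date(2009, 1, 22), "month": "2009-01"},
    20090205: {"full_date": date(2009, 2, 5), "month": "2009-02"},
}

# A hypothetical fact row that references the same Dimension twice, in two roles.
order_fact = {"order_date_key": 20090122, "invoice_date_key": 20090205, "amount": 500.0}


def resolve_role(fact, role_key):
    """Look up the shared Date Dimension through the given role's foreign key."""
    return date_dim[fact[role_key]]


order_date = resolve_role(order_fact, "order_date_key")
invoice_date = resolve_role(order_fact, "invoice_date_key")
days_to_invoice = (invoice_date["full_date"] - order_date["full_date"]).days
```

Only the foreign key column differs between the roles; the Dimension rows themselves are stored once.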
7. What does Key mean?
• A Key is a unique identifier of a row. A database uses the following
kinds of Key:
– Primary Key. The main unique identifier of the row
– Foreign Key. A field that holds the Primary Key of another table. In
a Fact Table it is sometimes called a Dimension Key; Business Users
may know it as a Mapping Key or Look-up Key
– Composite Key. A unique Key composed of two or more fields
(attributes)
– Natural Key. A unique Key formed from attributes that already
exist in the real world. For example, in Hong Kong each citizen has a
Hong Kong Identity Card Number (HKID no.), which is unique and
also carries a special meaning
– Surrogate Key. A unique Key with no business meaning. It is normally
generated by the database as an auto-incremented Key or, in SQL, as
MAX() + 1
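As a rough illustration of the MAX() + 1 approach, the sketch below maps Natural Keys to integer Surrogate Keys in plain Python. The function name assign_surrogate_keys and the sample identifiers are hypothetical (they are not real HKID numbers).

```python
def assign_surrogate_keys(natural_keys, existing=None):
    """Map each Natural Key to an integer Surrogate Key, MAX() + 1 style."""
    mapping = dict(existing or {})
    # Next key is one more than the largest key issued so far (1 if none).
    next_key = max(mapping.values(), default=0) + 1
    for natural_key in natural_keys:
        if natural_key not in mapping:  # a known Natural Key keeps its key
            mapping[natural_key] = next_key
            next_key += 1
    return mapping


# Hypothetical Natural Keys; "ID-A" appears twice but gets only one Surrogate Key.
keys = assign_surrogate_keys(["ID-A", "ID-B", "ID-A"])
keys = assign_surrogate_keys(["ID-C"], existing=keys)
```

In a real database, MAX() + 1 can hand out the same value to two concurrent inserts, which is one reason a database-generated incrementing key is usually the safer of the two options.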
8. Slowly Changing Dimension (SCD)
• Dimensions whose attributes change over time
• Typically, there are three types of SCD:
– Type 1: Overwrite the entry with the new attribute values
– Type 2: Create a new entry and mark the old record as
outdated
– Type 3: Add an additional column for each tracked attribute, e.g.
NAME, OLD_NAME
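The three types can be compared side by side in a minimal Python sketch. The row layout, the 2099-12-31 "still current" end date, and the name change itself are assumptions for illustration; a real warehouse would normally do this in its ETL layer.

```python
from datetime import date, timedelta

OPEN_END = date(2099, 12, 31)  # assumed marker for "still current"


def scd_type1(row, new_name):
    """Type 1: overwrite the attribute; no history is kept."""
    return {**row, "name": new_name}


def scd_type2(rows, new_name, effective):
    """Type 2: close the current row and append a new current row."""
    result = []
    for row in rows:
        if row["end"] == OPEN_END:
            # The old row ends the day before the new one starts.
            row = {**row, "end": effective - timedelta(days=1)}
        result.append(row)
    result.append({"name": new_name, "start": effective, "end": OPEN_END})
    return result


def scd_type3(row, new_name):
    """Type 3: keep the previous value in a dedicated column (NAME, OLD_NAME)."""
    return {**row, "old_name": row["name"], "name": new_name}


# Hypothetical change: the customer's name is updated on 2009-05-01.
original = {"name": "Peter Chan", "start": date(2009, 1, 22), "end": OPEN_END}
t1 = scd_type1(original, "Peter C. Chan")
t2 = scd_type2([original], "Peter C. Chan", date(2009, 5, 1))
t3 = scd_type3(original, "Peter C. Chan")
```

Type 1 ends with one row and no trace of the old name, Type 2 ends with two dated rows, and Type 3 ends with one row that remembers only the immediately previous value.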
9. Recommendation on Which SCD Type?
• If you need to track changes over time, SCD Type 2 is
recommended
• If you do not care about historical changes to the
dimension in question, SCD Type 1 is suggested
• In general, Data Warehouses seldom implement SCD Type 3
11. Assume a new Customer Profile is created on 2009-01-22
with the TOP30 attribute
                               Customer Attribute
Customer Name   Created Date   TOP30   T300   T4
Peter Chan      2009-01-22     Y       -      -
12. Initial Condition:
                               Customer Attribute
Customer Name   Created Date   TOP30   T300   T4
Peter Chan      2009-01-22     Y       -      -
In Excel, we keep the record as follows:
As of Date: 2009-01-22
                                                  Customer Attribute
Customer Name   Effective Start   Effective End   TOP30   T300   T4
Peter Chan      2009-01-22        2099-12-31      Y       -      -
14. Initial Condition:
As of Date: 2009-01-22
                                                  Customer Attribute
Customer Name   Effective Start   Effective End   TOP30   T300   T4
Peter Chan      2009-01-22        2099-12-31      Y       -      -
New Conditions:
Journal
Customer Name   Effective Date   Action   Attribute
Peter Chan      2009-05-01       Remove   TOP30
Peter Chan      2009-05-01       Add      T300
In Excel, we keep the record as follows:
As of Date: 2009-05-01
                                                  Customer Attribute
Customer Name   Effective Start   Effective End   TOP30   T300   T4
Peter Chan      2009-01-22        2009-04-30      Y       -      -
Peter Chan      2009-05-01        2099-12-31      -       Y      -
15. Initial Condition:
As of Date: 2009-05-01
                                                  Customer Attribute
Customer Name   Effective Start   Effective End   TOP30   T300   T4
Peter Chan      2009-01-22        2009-04-30      Y       -      -
Peter Chan      2009-05-01        2099-12-31      -       Y      -
New Condition:
Journal
Customer Name   Effective Date   Action   Attribute
Peter Chan      2009-06-01       Add      TOP30
In Excel, we keep the record as follows:
As of Date: 2009-06-01
                                                  Customer Attribute
Customer Name   Effective Start   Effective End   TOP30   T300   T4
Peter Chan      2009-01-22        2009-04-30      Y       -      -
Peter Chan      2009-05-01        2009-05-31      -       Y      -
Peter Chan      2009-06-01        2099-12-31      Y       Y      -
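The walkthrough for Peter Chan can be replayed with a short Python sketch. The dates, actions, and attributes come from the journal entries above; the function names apply_journal and attrs_as_of and the row layout are hypothetical, chosen only to mirror the Effective Start / Effective End columns.

```python
from datetime import date, timedelta

OPEN_END = date(2099, 12, 31)  # end date used above for the current record


def apply_journal(history, effective, actions):
    """Close the current row and open a new one with the changed attributes.

    history: list of rows with 'start', 'end' and 'attrs' (a frozenset of flags),
             ordered by date, the last row being the current one.
    actions: list of (action, attribute) pairs; action is 'Add' or 'Remove'.
    """
    current = history[-1]
    attrs = set(current["attrs"])
    for action, attribute in actions:
        if action == "Add":
            attrs.add(attribute)
        elif action == "Remove":
            attrs.discard(attribute)
        else:
            raise ValueError(f"unknown action: {action}")
    # The old row ends the day before the new row becomes effective.
    closed = {**current, "end": effective - timedelta(days=1)}
    opened = {"start": effective, "end": OPEN_END, "attrs": frozenset(attrs)}
    return history[:-1] + [closed, opened]


def attrs_as_of(history, as_of):
    """Return the attribute set valid on the given date, or None if no row covers it."""
    for row in history:
        if row["start"] <= as_of <= row["end"]:
            return row["attrs"]
    return None


# Customer Profile created on 2009-01-22 with TOP30.
history = [{"start": date(2009, 1, 22), "end": OPEN_END, "attrs": frozenset({"TOP30"})}]
# Journal of 2009-05-01: Remove TOP30, Add T300.
history = apply_journal(history, date(2009, 5, 1), [("Remove", "TOP30"), ("Add", "T300")])
# Journal of 2009-06-01: Add TOP30.
history = apply_journal(history, date(2009, 6, 1), [("Add", "TOP30")])

for row in history:
    print(row["start"], row["end"], sorted(row["attrs"]))
```

Because every row carries its own validity period, a question such as "which attributes did the customer have on 2009-05-15?" becomes a simple range lookup with attrs_as_of, without touching the journal again.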