2. OUR SERVICES
Free Training and Educational Services
Training and Education in Bangla:
Bangla.SaLearningSchool.com
Training and Education in English:
www.SaLearningSchool.com
English.SaLearningSchool.com
http://sitestree.com
Ask a question and get answers:
Ask.JustEtc.net
3. DESIGNING DIMENSIONS
Dimension Field/Column Types
When designing dimension tables, you need
to define the following types of columns/fields to
facilitate reporting and analysis
Keys: Used to identify entities
Name columns: Used for human names of entities
Attributes: Used for pivoting in analyses
Member properties: Used for labels in a report
Lineage columns: Used for auditing, and never
exposed to end users
4. DESIGNING DIMENSIONS
You need to design your dimensions with analysis in mind
Reporting needs to be kept in mind as well
For analysis, we use
Pivot Table
Pivot Graph
For Dimensions
The fields used for pivoting are called
Attributes
Not all columns in a dimension are attributes
In OLTP tables, all columns are attributes
Attributes:
The fields on which
analysis is based
In the previous slide
you saw the different types of columns in a dimension table
5. DIMENSION ATTRIBUTES
Attributes
For pivoting
discrete attributes with a small number of distinct
values are the most appropriate
Attribute values should not be continuous
Keys are not good candidates for pivoting and
analysis, and so do not make good attributes
To use a continuous column for pivoting
Convert it to a small set of discrete values
6. ON DIMENSION ATTRIBUTES
SQL Server Analysis Services (SSAS) can
discretize continuous columns to produce
discrete attributes
The automated process is not always a good fit
You need to take the business perspective into account as well
For example, when age is used for pivoting, a 1-year difference in age can be significant at
young ages
though it may not matter when the age is 60 (this depends on the
business perspective as well)
Age and Income are not good candidates for automatic
discretization
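As a minimal sketch of business-driven discretization, the Python snippet below maps an age to one of a few bands that are narrow at young ages and wide later on. The band edges, the labels, and the `age_band` helper are assumptions chosen for illustration; they do not come from SSAS or from the slides.

```python
from bisect import bisect_right

# Hypothetical band edges chosen from a business perspective:
# narrow bands at young ages, wide bands later on.
AGE_EDGES = [18, 21, 25, 30, 40, 60]
AGE_LABELS = ["<18", "18-20", "21-24", "25-29", "30-39", "40-59", "60+"]


def age_band(age: int) -> str:
    """Map a continuous age to one of a small set of discrete labels."""
    if age < 0:
        raise ValueError("age must be non-negative")
    # bisect_right counts how many edges are <= age, which is the band index.
    return AGE_LABELS[bisect_right(AGE_EDGES, age)]


bands = [age_band(a) for a in (17, 18, 24, 25, 59, 60, 83)]
print(bands)
```

The resulting labels form a small discrete attribute that can be used for pivoting in place of the raw age.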
7. NAMING COLUMNS, AND MEMBER PROPERTIES
Naming columns (another dimension column
type) are used to identify the entity
Not good for pivoting or as keys
Such as address, city, or phone
Member Properties
Columns used in reports as labels only, not for
pivoting, are called member properties.
Naming columns and member properties
can include translations
8. LINEAGE AND AUDITING
Lineage and auditing columns
Used for auditing data
Never exposed to the users
9. AUDITING AND LINEAGE
In a data warehouse, you may want some
auditing tables
For every update, you should audit
who made the update,
when it was made,
and how many rows were transferred
to each dimension and
fact table
in your Data Warehouse
10. AUDITING AND LINEAGE
You will need additional fields/columns in
your dimension and fact tables to track
when, by whom, and from where each row
was updated
Your ETL process needs to be updated accordingly
If you use SSIS for the ETL
Modify the SSIS packages so that they record this
information
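The idea can be sketched outside SSIS. The snippet below uses Python with an in-memory SQLite database, not an SSIS package; the table names, column names, and the `load_customers` helper are all hypothetical. It stamps each loaded row with lineage values and writes one audit row per load.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimCustomer (
    CustomerKey INTEGER PRIMARY KEY,
    CustomerName TEXT,
    -- lineage columns (hypothetical names), never exposed to end users
    LoadedAt TEXT, LoadedBy TEXT, SourceSystem TEXT
);
CREATE TABLE AuditLoad (
    TableName TEXT, LoadedAt TEXT, LoadedBy TEXT, RowsTransferred INTEGER
);
""")


def load_customers(rows, loaded_by, source_system):
    """Insert rows with lineage values and record one audit row for the load."""
    loaded_at = datetime.now(timezone.utc).isoformat()
    conn.executemany(
        "INSERT INTO DimCustomer VALUES (?, ?, ?, ?, ?)",
        [(key, name, loaded_at, loaded_by, source_system) for key, name in rows],
    )
    conn.execute(
        "INSERT INTO AuditLoad VALUES (?, ?, ?, ?)",
        ("DimCustomer", loaded_at, loaded_by, len(rows)),
    )
    conn.commit()


load_customers([(1, "Ana"), (2, "Bo")], loaded_by="etl_service", source_system="CRM")
audit = conn.execute(
    "SELECT TableName, LoadedBy, RowsTransferred FROM AuditLoad"
).fetchall()
print(audit)
```

The audit table answers who, when, and how many rows per load, while the lineage columns answer the same questions for each individual row.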
12. POSSIBLE ATTRIBUTES FOR CUSTOMER DIMENSION
Possible Attributes for Customer Dimension
BirthDate (after calculating age and discretizing the age)
MaritalStatus
Gender
YearlyIncome (after discretizing)
TotalChildren
NumberChildrenAtHome
EnglishEducation (other education columns are for
translations)
EnglishOccupation (other occupation columns are for
translations)
HouseOwnerFlag
NumberCarsOwned
CommuteDistance
14. DATE DIMENSION ATTRIBUTES
FullDateAlternateKey (denotes a date in date format)
EnglishMonthName
CalendarQuarter
CalendarSemester
CalendarYear
Drill-down attributes
CalendarYear → CalendarSemester → CalendarQuarter
→ EnglishMonthName → FullDateAlternateKey
Reports usually show the leaf level, which you reach
by drilling down through the attribute hierarchy
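A rough sketch of how these attributes relate to a single date is given below in Python. The column names follow the slide; the `date_attributes` helper and the sample date are assumptions for illustration, and semesters are assumed to be calendar half-years.

```python
from datetime import date

MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)

# Drill-down path from the coarsest level to the leaf level.
HIERARCHY = [
    "CalendarYear", "CalendarSemester", "CalendarQuarter",
    "EnglishMonthName", "FullDateAlternateKey",
]


def date_attributes(d: date) -> dict:
    """Derive the drill-down attributes of one date dimension row."""
    return {
        "FullDateAlternateKey": d.isoformat(),
        "EnglishMonthName": MONTH_NAMES[d.month - 1],
        "CalendarQuarter": (d.month - 1) // 3 + 1,   # 1..4
        "CalendarSemester": (d.month - 1) // 6 + 1,  # 1..2 (assumed half-years)
        "CalendarYear": d.year,
    }


row = date_attributes(date(2014, 8, 15))
path = [row[level] for level in HIERARCHY]
print(path)
```

Each level of the path is a discrete attribute, so a pivot can start at the year and drill down to the individual date.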
15. DRILL DOWN HIERARCHIES
As we already know, dimension columns used in reports
for labels are called member properties.
In a Snowflake schema
lookup tables show you levels of hierarchies
In a Star schema
you need to extract natural hierarchies from the
names and content of columns.
Nevertheless, because drilling down through natural
hierarchies is so useful and welcomed by end users,
you should use them as much as possible.
16. SLOWLY CHANGING DIMENSIONS
Related to auditing: keeping track of historical data
When data changes over time, such as when
Someone moves to a different city
Someone's job title changes
Three approaches can be taken
Type 1
History lost
Type 2
Keeps all history
Type 3
Keeps partial history
You can use a combination
Type 1 for some columns, Type 2 for others
17. TYPE 1
When information changes, you simply update it in place. You lose the previous
information. Example as below:
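A minimal Python sketch of a Type 1 change follows. The customer row, the business key `C001`, and the `apply_type1` helper are hypothetical and only illustrate the overwrite behaviour.

```python
# Dimension rows keyed by a hypothetical business key.
dim_customer = {
    "C001": {"Name": "Ana", "City": "Toronto"},
}


def apply_type1(dim, business_key, column, new_value):
    """Overwrite the value in place; the old value is not kept anywhere."""
    dim[business_key][column] = new_value


apply_type1(dim_customer, "C001", "City", "Ottawa")
print(dim_customer["C001"])
```

After the update there is still exactly one row for the customer, and nothing records that the city used to be different.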
18. TYPE 2 SCD
Here you keep track of all changes. In the example below, to keep track of Occupat
You insert new rows and mark the current version with a current-flag field.
You need to make sure that primary key constraints do not fail
(you can use a second type of key, called a surrogate key)
You can use date-from and date-to columns to keep track of the changes
Within the same dimension, you can use Type 1 for some columns and
Type 2 for others
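The mechanics can be sketched in Python as follows. This is an illustration under assumed names (`CustomerKey` as the surrogate key, `CustomerId` as the business key, a `Current` flag, and the `apply_type2` helper), not the layout used in the slides.

```python
from datetime import date
from itertools import count

_next_key = count(1)  # surrogate key generator
dim_customer = []     # each row is one version of a customer


def apply_type2(dim, customer_id, occupation, change_date):
    """Close the current version (if any) and insert a new current row."""
    for row in dim:
        if row["CustomerId"] == customer_id and row["Current"]:
            if row["Occupation"] == occupation:
                return row  # nothing changed, keep the current version
            row["Current"] = False
            row["DateTo"] = change_date
    new_row = {
        "CustomerKey": next(_next_key),  # surrogate primary key, unique per version
        "CustomerId": customer_id,       # business key, repeats across versions
        "Occupation": occupation,
        "DateFrom": change_date,
        "DateTo": None,
        "Current": True,
    }
    dim.append(new_row)
    return new_row


apply_type2(dim_customer, "C001", "Clerical", date(2012, 1, 1))
apply_type2(dim_customer, "C001", "Manager", date(2014, 6, 1))
for version in dim_customer:
    print(version)
```

Because every version gets its own surrogate key, the business key can repeat without violating the primary key constraint.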
20. TYPE 3
Partial history is kept. In the example, only the previous city information is kept.
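A minimal Python sketch of a Type 3 change is shown below, assuming a hypothetical `PreviousCity` column next to `CurrentCity` and an illustrative `apply_type3` helper.

```python
dim_customer = {
    "C001": {"Name": "Ana", "CurrentCity": "Toronto", "PreviousCity": None},
}


def apply_type3(dim, business_key, new_city):
    """Shift the current city into PreviousCity, then store the new city."""
    row = dim[business_key]
    if row["CurrentCity"] != new_city:
        row["PreviousCity"] = row["CurrentCity"]
        row["CurrentCity"] = new_city


apply_type3(dim_customer, "C001", "Ottawa")
apply_type3(dim_customer, "C001", "Montreal")
print(dim_customer["C001"])
```

After the second move only the most recent previous city survives; the original city is gone, which is why the history is only partial.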
21. THANK YOU FOR BEING WITH US
That’s the end of Dimension Table Design
I may come back with a training video on it
You will see some slides on Fact Table
Design after this slide
I will make another presentation document on
that topic
23. FACT TABLE DESIGN
Fact Table Design Topics
Define fact table column types.
Understand the additivity of a measure.
Handle many-to-many relationships in a Star
schema.
25. FACT TABLE COLUMNS
Measure Column Type
Measure columns hold measurements
useful for a specific business process
Measure columns are usually numeric
and can be aggregated
Measure columns store values that are of
interest to business such as
sales amount, order quantity, and discount amount
26. FACT TABLE COLUMNS
Foreign Key – Column Type
These are the columns that reference
the keys of the dimension tables
27. DESIGNING FACT TABLES
Fact tables include measures, foreign keys,
and possibly an additional primary key and
lineage columns.
Measures can be additive, non-additive, or
semi-additive.
For many-to-many relationships, you can
introduce an additional intermediate
dimension.
28. Surrogate Key
Usually comes from the primary dimension
table for the current fact table
Usually one or two columns in a fact table are
surrogate keys
29. SURROGATE KEYS FOR FACT TABLES
OrderId and LineItemId are the
surrogate keys, coming from the
primary source Order Details table
The OrderId and LineItemId columns help
with quick comparisons against the source data
Surrogate keys are not a must in fact tables;
however, they help
Must read:
http://www.kimballgroup.com/2006/07/design-tip-81-fact-table-surrogate-key/
30. LINEAGE COLUMNS IN FACT TABLES
Lineage columns –
Just as with dimension tables, these are strictly
for auditing purposes.
References:
https://upsearch.com/implementing-a-data-warehouse-fact-tables/
31. ADDITIVITY OF MEASURES
The primary purpose of a data warehouse is reporting
and forecasting (and analysis in some cases)
Many reports are aggregations such as sum or
average
Example: sales by quarter, by region, by product type
Hence, fact tables have columns that support
these aggregations for reporting
These are the measure columns we discussed
before
The measures you include determine which
aggregations and reports are possible
32. TYPES OF ADDITIVITY OF MEASURES
Types of Additivity of Measures
Additive measures
Semi-additive measures
Non-additive measures
33. Additive
If a measure can be summed across all dimensions,
it’s referred to as an additive measure.
Semi-additive
Sometimes, however, we can sum a measure across
all dimensions except for time, such as an account
balance
We can’t sum the account balance across the time
dimension. We would need to do something like take the
average instead, or simply use the last value. Measures
like this are called semi-additive measures.
34. Finally, some measures can’t ever be
summed. These are called non-additive
measures, and include measures like
discount percentages and prices
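The three kinds of additivity can be sketched with a few lines of Python. The fact rows, the account names, and the amounts are made up for illustration, and taking the last month is only one of the options mentioned above for aggregating a balance over time.

```python
from collections import defaultdict

# Hypothetical fact rows: (month, account, deposits, balance_at_month_end)
facts = [
    ("2014-01", "A", 100, 100),
    ("2014-01", "B", 50, 50),
    ("2014-02", "A", 20, 120),
    ("2014-02", "B", 30, 80),
]

# Additive: deposits can be summed across every dimension, time included.
total_deposits = sum(dep for _, _, dep, _ in facts)

# Semi-additive: balances sum across accounts, but over time we take the last month.
balance_by_month = defaultdict(int)
for month, _, _, balance in facts:
    balance_by_month[month] += balance
closing_balance = balance_by_month[max(balance_by_month)]

# Non-additive: a discount percentage is recomputed from additive parts, never summed.
order_lines = [(200.0, 20.0), (100.0, 30.0)]  # (list amount, discount amount)
list_total = sum(amount for amount, _ in order_lines)
discount_total = sum(disc for _, disc in order_lines)
overall_discount_pct = 100 * discount_total / list_total

print(total_deposits, closing_balance, round(overall_discount_pct, 2))
```

Summing the balance column over both months would give 350, which matches no real balance, and adding the two line-level percentages (10% and 30%) would give 40% instead of the true overall rate.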
35. ADDITIVITY OF MEASURES IN SSAS
SSAS has support for semi-additive and non-additive
measures
The SSAS database model is called the Business
Intelligence Semantic Model (BISM). Compared to the
SQL Server database model, BISM includes much
more metadata.
SSAS has two types of storage:
dimensional and tabular.
Tabular storage is quicker to develop, because it works
through tables like a data warehouse does.
The dimensional model more properly represents a cube.
However, the dimensional model includes even more
metadata than the tabular model.
36. In BISM dimensional processing, SSAS
offers semi-additive aggregate functions out
of the box.
For example, SSAS offers the LastNonEmpty
aggregate function, which properly uses the
SUM aggregate function across all
dimensions but time, and defines the last
known value as the aggregate over time.
37. In the BISM tabular model, you use the Data
Analysis Expressions (DAX) language. The
DAX language includes functions that let you
build semi-additive expressions quite quickly
as well.
38. Fact tables
A collection of measurements on a specific
aspect of the business
Measure columns
sales amount, order quantity, and discount
amount.