Dimensional Model | Fact Tables | Types
1. DIMENSION / FACT TABLE &
TYPES DATA WAREHOUSE
INTRODUCTION TO DWH
DIMENSIONAL MODEL
DIMENSION TABLE
TYPES OF DIMENSION TABLES
FACTS & TYPES
FACT TABLE
TYPES OF FACT TABLE
KEYS
UMAIR SAEED
2. INTRODUCTION TO DWH
• A data warehouse (DWH), also called an enterprise data warehouse, is a
system used for reporting and data analysis, and a core component of
business intelligence.
• Data warehouses are central repositories of integrated data from one or more
disparate sources.
• Essentially, they combine information from several sources into one
comprehensive database.
• A data warehouse contains data from many operational sources.
• Storage is structured so that users from many divisions and departments
within an organization can access and analyze the data according to their needs.
4. DIMENSIONAL MODEL
• A dimensional model is a database structure that is optimized for online
queries and data warehousing tools.
• A dimensional model is designed to read, summarize, and analyze numeric
information such as values, balances, counts, and weights in a data warehouse.
• It is composed of “Fact” and “Dimension” tables.
• A “Fact” is a numeric value that a business wishes to count or sum.
• A “Dimension” is essentially an entry point for getting at the facts.
• EXAMPLES: star schema, snowflake schema, etc.
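A star schema can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the table names (fact_sales, dim_product, dim_date), columns, and rows are hypothetical and do not come from the slides.

```python
import sqlite3

# In-memory database holding a tiny, hypothetical star schema:
# one fact table in the middle, two dimension tables around it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
);
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,
    full_date TEXT,
    month     TEXT
);
CREATE TABLE fact_sales (
    product_key  INTEGER REFERENCES dim_product(product_key),
    date_key     INTEGER REFERENCES dim_date(date_key),
    quantity     INTEGER,
    sales_amount REAL
);
INSERT INTO dim_product VALUES
    (1, 'Pen', 'Stationery'), (2, 'Notebook', 'Stationery'), (3, 'Mug', 'Kitchen');
INSERT INTO dim_date VALUES
    (20240101, '2024-01-01', '2024-01'), (20240201, '2024-02-01', '2024-02');
INSERT INTO fact_sales VALUES
    (1, 20240101, 10, 15.0), (2, 20240101, 2, 8.0),
    (3, 20240201, 1, 6.5),  (1, 20240201, 4, 6.0);
""")

# The dimension is the "entry point": group the numeric facts by a dimension attribute.
sales_by_category = conn.execute("""
    SELECT p.category, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

print(sales_by_category)
```

The fact table holds only keys and numbers; every descriptive attribute used for grouping or filtering lives in a dimension table.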
5. DIMENSION TABLE
• A dimension table describes the objects present in the fact table.
• Dimension tables help to describe dimensions, i.e. dimension values,
attributes and keys.
• A dimension table is a collection or group of information related to
a measurable event.
• Dimension tables form the core of dimensional modelling.
• Dimension tables are de-normalized tables.
• They are joined to the fact tables through foreign keys.
6. DIMENSION TABLE
WHY DO WE NEED TO USE?
• It helps to store dimensional information.
• It is easier to understand than normalized tables.
• More columns can be added to the table without affecting the existing
applications that use it.
10. SCD (SLOWLY CHANGING DIMENSION)
• SCDs are attributes of a dimension that undergo changes over time.
• Dimension attributes that change slowly over a period of time, rather than
changing regularly, are grouped as SCDs.
• Attributes like name and address can change, but not too often.
• Consider an example where a person moves from one city to another.
There are three ways to record the change of address.
• Type 1 is to overwrite the old value, Type 2 is to add a new row, and Type 3
is to add a new column.
11. SCD (EXAMPLE)
• A man travels to different countries, so his address must be changed
to match each country. This can be done in three ways:
• Type 1:
Overwrite the previous value. This method is easy to apply and saves space,
which reduces cost. However, the history is lost.
12. SCD (EXAMPLE)
• Type 2:
• Add a new row with the new value.
• In this method the history is saved and can be used whenever necessary, but it takes
more space and therefore increases the cost.
13. SCD (EXAMPLE)
• Type 3:
• Add a new column that holds the previous value. A limited history is kept without
adding rows, but only the most recent prior value survives.
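The three SCD types can be compared side by side in a small sketch. The row layout (customer_key, customer_id, city, start_date, end_date, is_current), the function names, and the sample cities are assumptions made for illustration; real warehouses usually do this in SQL or an ETL tool.

```python
from datetime import date


def scd_type1(row, new_city):
    """Type 1: overwrite the value in place; the old city is lost."""
    return {**row, "city": new_city}


def scd_type2(rows, customer_id, new_city, change_date):
    """Type 2: expire the current row and append a new row with a new surrogate key."""
    updated = []
    for r in rows:
        if r["customer_id"] == customer_id and r["is_current"]:
            # Close the old version instead of deleting it, so history is kept.
            r = {**r, "end_date": change_date, "is_current": False}
        updated.append(r)
    next_key = max(r["customer_key"] for r in rows) + 1
    updated.append({
        "customer_key": next_key,
        "customer_id": customer_id,
        "city": new_city,
        "start_date": change_date,
        "end_date": None,
        "is_current": True,
    })
    return updated


def scd_type3(row, new_city):
    """Type 3: keep the prior value in an extra column; only one old value survives."""
    return {**row, "previous_city": row["city"], "city": new_city}


# Hypothetical customer who moves from Lahore to Karachi.
t1 = scd_type1({"customer_id": 7, "city": "Lahore"}, "Karachi")

history = [{
    "customer_key": 1, "customer_id": 7, "city": "Lahore",
    "start_date": date(2020, 1, 1), "end_date": None, "is_current": True,
}]
t2 = scd_type2(history, 7, "Karachi", date(2024, 3, 1))

t3 = scd_type3({"customer_id": 7, "city": "Lahore"}, "Karachi")
```

After these calls, t1 has no trace of the old city, t2 holds two rows (one expired, one current), and t3 holds a single row with both the current and the previous city.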
14. CONFORMED DIMENSIONS
• A conformed dimension is a dimension that is used in multiple places.
• It is shared among multiple subject areas or data marts.
• The same dimension can be used in different projects without any
modification.
• This is done to maintain consistency.
• Conformed dimensions are either identical to, or a proper subset of,
another dimension.
16. JUNK DIMENSION
• A junk dimension is a group of low-cardinality attributes.
• It contains various attributes that are unrelated to one
another.
• It can be used to implement an RCD (rapidly changing dimension) for values such as
flags, weights, etc.
• It is a single table holding a combination of different, unrelated attributes, which
avoids having a large number of foreign keys in the fact table.
• Junk dimensions are often created to manage the foreign keys produced by rapidly
changing dimensions.
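One common way to build a junk dimension is to enumerate every combination of the low-cardinality attributes and give each combination a single key. The sketch below assumes three hypothetical attributes (payment_type, is_gift, order_channel); none of these names come from the slides.

```python
from itertools import product

# Hypothetical low-cardinality attributes that would otherwise each need
# their own foreign key in the fact table.
payment_types = ["cash", "card"]
gift_flags = [False, True]
order_channels = ["web", "store"]

# One junk dimension row per combination, each with a single surrogate key.
junk_dimension = [
    {"junk_key": key, "payment_type": p, "is_gift": g, "order_channel": c}
    for key, (p, g, c) in enumerate(
        product(payment_types, gift_flags, order_channels), start=1
    )
]

# Lookup from attribute combination to junk key, used while loading facts.
junk_lookup = {
    (r["payment_type"], r["is_gift"], r["order_channel"]): r["junk_key"]
    for r in junk_dimension
}

# The fact row carries one junk_key instead of three separate foreign keys.
fact_row = {
    "order_id": 1001,
    "amount": 25.0,
    "junk_key": junk_lookup[("card", True, "web")],
}
```

With two values per attribute the junk dimension has only 2 x 2 x 2 = 8 rows, which is why the approach suits flags and indicators rather than high-cardinality attributes.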
18. DEGENERATE DIMENSION
• A degenerate dimension is a dimension attribute that is stored as part of the
fact table, and not in a separate dimension table.
• These are essentially dimension keys for which there are no other attributes.
• Because such a key has nothing else to describe it, a separate dimension table
would add nothing, so the attribute stays in the fact table itself.
• Examples: ticket number, invoice number, transaction number, etc.
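A short sketch of a degenerate dimension in use, assuming hypothetical line-item fact rows in which invoice_number sits directly in the fact table and is used only for grouping:

```python
from collections import defaultdict

# Hypothetical line-item fact rows. product_key points at a dimension table,
# while invoice_number is a degenerate dimension: it has no table of its own.
fact_sales = [
    {"invoice_number": "INV-001", "product_key": 1, "amount": 10.0},
    {"invoice_number": "INV-001", "product_key": 2, "amount": 5.0},
    {"invoice_number": "INV-002", "product_key": 1, "amount": 7.5},
]

# The degenerate dimension still works as a grouping key.
invoice_totals = defaultdict(float)
for row in fact_sales:
    invoice_totals[row["invoice_number"]] += row["amount"]

print(dict(invoice_totals))
```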
20. ROLE PLAYING DIMENSION
• Dimensions that have multiple relationships with the fact table are called
role-playing dimensions.
• In other words, the same dimension key, with all its related
attributes, is joined to several foreign keys present in the fact table.
• The dimension can fulfil multiple purposes within the same database.
• For example, a fact table may include foreign keys for both ship date and
delivery date. The same date dimension attributes apply to each foreign
key, so you can join the same dimension table to both foreign keys.
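The ship date and delivery date example can be sketched with sqlite3 by joining one date dimension twice under different aliases. Table and column names (dim_date, fact_shipment, ship_date_key, delivery_date_key) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,
    full_date TEXT,
    day_name  TEXT
);
CREATE TABLE fact_shipment (
    order_id          INTEGER,
    ship_date_key     INTEGER REFERENCES dim_date(date_key),
    delivery_date_key INTEGER REFERENCES dim_date(date_key)
);
INSERT INTO dim_date VALUES
    (20240304, '2024-03-04', 'Monday'),
    (20240306, '2024-03-06', 'Wednesday');
INSERT INTO fact_shipment VALUES (1, 20240304, 20240306);
""")

# The same physical dimension plays two roles, one per alias.
rows = conn.execute("""
    SELECT f.order_id, ship.day_name, delivery.day_name
    FROM fact_shipment AS f
    JOIN dim_date AS ship     ON ship.date_key     = f.ship_date_key
    JOIN dim_date AS delivery ON delivery.date_key = f.delivery_date_key
""").fetchall()

print(rows)
```

Only one date table is stored and maintained; each role exists only as an alias (or, in many warehouses, as a view) at query time.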
22. FACT
• A fact table stores quantitative information for analysis and
is often de-normalized.
• A fact table holds the measures, metrics and other quantifiable
information.
• Facts are also known as measurements or metrics.
23. FACTS TYPES
• There are three types of facts:
1. ADDITIVE FACTS:
Additive facts can be summed across every dimension of the fact table.
Example: quantity, sales amount, etc.
2. SEMI-ADDITIVE FACTS:
Semi-additive facts can be summed across some dimensions but not others.
Example: consider bank account details. Applying SUM() to the bank balance over
time does not give a useful result, but MIN() and MAX() may return useful
information.
3. NON-ADDITIVE FACTS:
Non-additive facts cannot be meaningfully aggregated with functions such as SUM()
across any dimension. Example: ratios and percentages.
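The difference between additive and semi-additive facts shows up as soon as the same rows are summed along different dimensions. The sketch below uses hypothetical daily account rows where deposit is additive and balance is semi-additive.

```python
# Hypothetical daily rows: "deposit" is additive, "balance" is semi-additive.
daily_accounts = [
    {"day": "2024-01-01", "account": "A", "deposit": 100.0, "balance": 100.0},
    {"day": "2024-01-01", "account": "B", "deposit": 50.0,  "balance": 50.0},
    {"day": "2024-01-02", "account": "A", "deposit": 20.0,  "balance": 120.0},
    {"day": "2024-01-02", "account": "B", "deposit": 0.0,   "balance": 50.0},
]

# Additive: summing deposits over every day and account is meaningful.
total_deposits = sum(r["deposit"] for r in daily_accounts)

# Semi-additive: balances can be summed across accounts for a single day...
total_balance_day2 = sum(
    r["balance"] for r in daily_accounts if r["day"] == "2024-01-02"
)

# ...but summing one account's balance across days gives a number that is
# not a balance the account ever held.
misleading_sum = sum(r["balance"] for r in daily_accounts if r["account"] == "A")

# Over time, take the latest (or MIN/MAX) balance instead of the sum.
rows_a = [r for r in daily_accounts if r["account"] == "A"]
closing_balance_a = max(rows_a, key=lambda r: r["day"])["balance"]
max_balance_a = max(r["balance"] for r in rows_a)
```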
24. FACT TABLE
• A fact table is the table that contains all the facts, or
business information, that can be subjected to analysis and reporting
activities.
• These tables hold fields that represent the direct facts.
• They also hold foreign key fields that connect the fact table with the dimension
tables in the data warehouse system.
• A data warehouse system can have one or more fact tables, depending on
the model type used to design the data warehouse.
26. CHARACTERISTICS OF FACT TABLES
• KEYS: A fact table has a primary key made up of the primary keys
of all the dimension tables linked with it. This key is known as a
concatenated (composite) key, and it uniquely identifies
each row.
• FACT TABLE GRAIN: The grain of a table is the level of detail, or
depth of information, contained in that table. A finer grain gives more detailed
and more flexible analysis, at the cost of more rows.
• ADDITIVE MEASURES: The measures stored in a fact table are typically additive, so they can be summed across dimensions.
• SPARSE DATA: Some records have attributes or measures that contain null
values. These entries provide no information.
• SHRUNKEN ROLLUP DIMENSION: Shrunken rollup dimensions are
subdivisions of the base dimension.
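The concatenated key and the grain can both be checked with a few lines of code. This sketch assumes a hypothetical fact table whose grain is one row per date, store and product; the key names are illustrative only.

```python
from collections import Counter, defaultdict

# Hypothetical fact rows at the grain "one row per date, store and product".
fact_sales = [
    {"date_key": 20240101, "store_key": 1, "product_key": 10, "quantity": 3},
    {"date_key": 20240101, "store_key": 1, "product_key": 11, "quantity": 1},
    {"date_key": 20240102, "store_key": 1, "product_key": 10, "quantity": 2},
]


def composite_key(row):
    """Concatenated key: the dimension foreign keys taken together."""
    return (row["date_key"], row["store_key"], row["product_key"])


# If the declared grain holds, no composite key appears more than once.
key_counts = Counter(composite_key(r) for r in fact_sales)
duplicate_keys = [key for key, count in key_counts.items() if count > 1]

# A fine grain can always be rolled up to a coarser one (here: per day),
# but a table stored at the daily grain could not be drilled back down.
quantity_by_day = defaultdict(int)
for r in fact_sales:
    quantity_by_day[r["date_key"]] += r["quantity"]
```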
28. TYPES OF FACT TABLES
• TRANSACTION FACT TABLE
This is the most fundamental view of business operations.
It represents the occurrence of an event at a single point in time.
The measured facts are valid only for that instant and only for that event.
The grain of a transaction fact table is specified as “one row per line
in a transaction”.
It usually holds data at the detailed level, which leads it to have a large number
of dimensions associated with it.
It captures measurements at the most basic, or atomic, level.
This gives users robust dimensional grouping, roll-up and drill-down
reporting capabilities.
30. SNAPSHOT FACT TABLE
• A snapshot gives the state of things at a particular instant of time, a
“picture of the moment”.
• It normally includes more non-additive and semi-additive facts.
• It represents the performance of an activity at the end of each day, week, month
or any other time interval.
• This is unlike the transaction fact table, where a new row is added for the
occurrence of every event.
• It depends on the transaction fact table for the detailed data behind
each snapshot row.
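The relationship between the two table types can be sketched by deriving a daily snapshot from transaction rows. The account transactions and the daily_snapshot function below are hypothetical; as a simplification, a day with no transactions produces no snapshot row, whereas a real periodic snapshot would normally carry the balance forward.

```python
# Hypothetical transaction fact rows: one row per event.
transactions = [
    {"day": "2024-01-01", "account": "A", "amount": 100.0},
    {"day": "2024-01-01", "account": "A", "amount": -30.0},
    {"day": "2024-01-02", "account": "A", "amount": 50.0},
    {"day": "2024-01-03", "account": "A", "amount": -20.0},
]


def daily_snapshot(rows):
    """Build one row per account per day holding the end-of-day balance."""
    running = {}   # account -> running balance
    snapshot = {}  # (day, account) -> balance after the last event of that day
    for t in sorted(rows, key=lambda t: t["day"]):
        running[t["account"]] = running.get(t["account"], 0.0) + t["amount"]
        snapshot[(t["day"], t["account"])] = running[t["account"]]
    return [
        {"day": day, "account": account, "balance": balance}
        for (day, account), balance in snapshot.items()
    ]


snapshot_rows = daily_snapshot(transactions)
```

Four transaction rows collapse into three snapshot rows, and the balance they hold is semi-additive: it should not be summed across days.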
32. ACCUMULATING FACT TABLE
• These are used to represent the activity of any process that has a
well-defined start and end.
• Accumulating snapshots mostly have multiple date stamps that represent the
predictable phases of the process.
• Sometimes there is an extra column containing the date on which the row was
last updated.
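An accumulating fact row is updated in place as the process moves through its phases. The sketch below assumes a hypothetical order process with three date stamps (order_date, ship_date, delivery_date) plus a last_updated column; the names are illustrative.

```python
from datetime import date

# One row per order; milestone dates start empty and are filled in over time.
order_row = {
    "order_id": 1,
    "order_date": date(2024, 3, 1),
    "ship_date": None,
    "delivery_date": None,
    "last_updated": date(2024, 3, 1),
}

MILESTONES = ("ship_date", "delivery_date")


def record_milestone(row, milestone, when):
    """Return the row with one more date stamp filled in and last_updated refreshed."""
    if milestone not in MILESTONES:
        raise ValueError(f"unknown milestone: {milestone}")
    return {**row, milestone: when, "last_updated": when}


# The same row is revisited as the order passes each phase.
order_row = record_milestone(order_row, "ship_date", date(2024, 3, 4))
order_row = record_milestone(order_row, "delivery_date", date(2024, 3, 6))

# With all date stamps on one row, durations between phases are simple differences.
days_to_deliver = (order_row["delivery_date"] - order_row["order_date"]).days
```

Unlike a transaction fact table, no new row is added per event here: the row count stays at one per order, and each update fills in one more date stamp.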