Dimensional Modeling Concepts_Nishant.ppt

A Beginners’ Guide
By Nishant Gupta

A data warehouse is a relational database that holds a large volume of
data designed for reporting and analysis. It keeps historical data from
various sources and separates analytical data from transactional data,
which enables enterprise-wide consolidation. Source data is cleaned,
transformed and cataloged by an extraction, transportation,
transformation, and loading (ETL) solution and then served to online
analytical processing (OLAP), data mining, client analysis tools, and
other applications that manage the process of gathering data and
delivering it to business users.

"A data warehouse is a subject-oriented, integrated, time-variant and
non-volatile collection of data in support of management's decision making
process."
- W. H. Inmon

"A data warehouse is a copy of transaction data specifically structured for
query and analysis."
- Ralph Kimball

Online transaction processing (OLTP) systems are applications that serve
the daily transactional needs of a business. In these systems, fast
turnaround/response time for storing and retrieving data is the key
requirement.
Online analytical processing (OLAP) systems are used for decision making
and analytics. Here throughput matters more than response time when
generating results for ad hoc queries, business intelligence, relational
reporting, data mining, and forecasting/budgeting.

Aspect             | OLTP                                          | OLAP
Data Source        | Operational data                              | Consolidated data from various OLTP sources
Purpose of data    | Daily business transactions                   | Decision support, planning, reporting & analysis
Inserts & Updates  | Short & fast                                  | Periodic long-running batch jobs
Queries            | Simple queries that return few records        | Often complex or ad hoc queries involving aggregations
Processing Speed   | Very fast                                     | Depends on volume of data
Space Requirements | Less                                          | Large
Database Design    | Highly normalized                             | Denormalized
Backup & Recovery  | Regular backups & proper recovery plans/policies | Infrequent backup policy

Data Warehouse Architecture
[Figure: data flows left to right through five stages. Data Source
(operational sources (OS), flat files) > Staging (STG DB) > Warehouse (raw
data, summary data, metadata) > Data Marts (inventory, sales, purchasing) >
Users (analytics, reporting, mining).]

Operational Source: These are the operational systems used to record
business transactions. They sit outside the data warehouse so that they
can deliver the performance and availability that routine business
operations/transactions require.
Data Staging Area: This is the place used both for data storage and for
ETL processes such as cleansing the data (correcting misspellings,
resolving domain conflicts, dealing with missing elements, or parsing into
standard formats), combining data from multiple sources, de-duplicating
data, and assigning warehouse keys. It should be off limits to business
users and may consist of normalized structures.
Data Presentation Area (Mart): The data presentation area is where data is
organized, stored, and made available for direct querying by users, report
writers, and other analytical applications. It consists of a dimensional
model that is atomic and simple, and it may also hold
performance-enhancing summary data, or aggregates.

Data Access Tools: These are the various capabilities provided to end
users to query or consume data warehouse data for analysis and decision
making. A data access tool can be a simple ad hoc query tool, a data
mining tool, or any reporting application.
Metadata: This is not the operational data itself but the configuration
data necessary for the general functioning of the data warehouse. It may
include system tables, partition settings, indexes, view definitions,
numerator/denominator definitions, inclusion/exclusion definitions,
DBMS-level security privileges and grants, business names and definitions
for the data mart tables and columns, constraint filters, application
template specifications, access and usage statistics, and other user
documentation.
Operational Data Store: An ODS is implemented to deliver adequate
operational reporting, especially when neither the legacy nor the on-line
transaction processing (OLTP) systems provide it. Performance-enhancing
aggregations, significant historical time series, and extensive
descriptive attribution are specifically excluded from the ODS.

Dimensional modeling is a technique often associated with the logical
design of a data warehouse. It is a modeling method built primarily on
facts and dimensions. Its main features are:
• It is primarily query oriented
• It is structured according to how the data is used rather than by
business rules/needs
• It is organized into base facts, dimensions of those facts, and lookups
of those dimensions
• It is based on identifying the key grains of data and their
characteristics
• It usually comprises snapshot, grouped, or summary data
• It normally needs fewer joins, with less join depth
• It is extensible: new data can usually be accommodated without changing
the existing structure or queries

Dimensional models can be organized into various schemas, such as:
• Star Schema
• Snowflake Schema
• Constellation
Star Schema: This is the simplest form of data warehouse schema. One or
more fact tables are connected to multiple dimension tables in a layout
that resembles a star. The center of the star is a large fact table and
the points of the star are the dimension tables. The primary key of each
dimension table is related to a foreign key in the fact table. All rows
in the fact table share the same level of granularity.

Benefits:
• It provides a direct and simple mapping between the business entities
and the schema design.
• It provides highly optimized query performance due to simple joins
between one fact table and its dimensions.
• It is widely supported by various BI tools.
[Figure: star schema with Sales Fact at the center, joined to Product Dim,
Time Dim, Location Dim, and Customer Dim.]
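
The star layout can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, column names and sample rows are assumptions, and only two of the four dimensions from the figure are modeled. It shows that a typical query needs just one join from the fact table to each dimension it uses.

```python
import sqlite3

# Hypothetical table and column names, loosely following the slide's figure.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim  (product_key  INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE location_dim (location_key INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE sales_fact (
    product_key  INTEGER REFERENCES product_dim(product_key),
    location_key INTEGER REFERENCES location_dim(location_key),
    sales_amount REAL
);
INSERT INTO product_dim  VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO location_dim VALUES (10, 'Pune'), (20, 'Delhi');
INSERT INTO sales_fact   VALUES (1, 10, 100.0), (1, 20, 50.0), (2, 10, 25.0);
""")

def sales_by_product(connection):
    """Join the fact table to one dimension and aggregate the measure."""
    rows = connection.execute("""
        SELECT p.product_name, SUM(f.sales_amount)
        FROM sales_fact AS f
        JOIN product_dim AS p ON p.product_key = f.product_key
        GROUP BY p.product_name
        ORDER BY p.product_name
    """).fetchall()
    return dict(rows)

totals = sales_by_product(conn)
```

Grouping by a location attribute instead would only swap the joined dimension; the fact table stays the single hub of every query.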

Snowflake Schema is an extension of the star schema in which each point
of the star explodes into more points. In other words, the dimension
tables are further normalized (3rd Normal Form) into multiple related
lookup tables, each representing a level in the dimensional hierarchy.
The schema takes its name from its resemblance to a real snowflake.
"Snowflaking" affects only the dimension tables, not the fact tables.
Pros:
• It eliminates redundancy
• It requires less disk space for storage
• It represents the real-world scenario in the schema design
• It aids transactional reporting from the data warehouse
Cons:
• It requires additional maintenance effort because of the larger number
of lookup tables
• It increases the number of joins, which slows data retrieval

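[Figure: snowflake schema with Sales Fact at the center, joined to Product
Dim, Time Dim, Location Dim, and Customer Dim. Each dimension is further
normalized into a lookup table: Product Category Look Up, Month Look Up,
State Look Up, and Customer Type Look Up.]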

Fact Constellation Schema, as the name suggests, is a group of fact
tables that share dimensions with one another, forming a shape similar to
a constellation of stars (i.e., star schemas). It is fairly complex, so it
should be reserved for sophisticated applications that need it.
[Figure: fact constellation with two fact tables. Sales Fact joins Product
Dim, Time Dim, Location Dim, and Customer Dim; Shipping Fact joins Shipper
Dim and Time Dim and shares dimensions with Sales Fact.]

According to Ralph Kimball, the following four steps are the backbone of
any dimensional design process:
1. Select the business process to model
2. Declare the grain of the business process
3. Choose the dimensions that apply to each fact table row
4. Identify the numeric facts that will populate each fact table row

First, we select the business process for which the dimensional model is
to be designed. A business process may require more than one dimensional
model. A business process is a set of related activities shared across
lines of service or business departments.
To identify the business processes for a dimensional model, we collect
the following metadata:
• Business requirements & processes
• Stakeholders
• Source systems
• Data quality related issues
• Business process related glossary
• Other business-related metadata

The next step is to identify and declare the grain of the model. The
grain of a table is the most atomic level at which its rows are defined.
Preferably we should develop dimensional models for the most atomic
information captured by a business process. Atomic data is the most
detailed information collected; such data cannot be subdivided further.
Atomic data is highly dimensional and lets us drill down to the lowest
level of detail, which helps when slicing and dicing the data.
For example, the grain of a SALES fact table might be stated as "Sales
volume by Day by Product by Store". Each record in this fact table is
therefore uniquely defined by a day, product and store.
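
A declared grain can be checked mechanically: no two fact rows may share the same combination of grain columns. The sketch below assumes fact rows are held as Python dictionaries; the function name, column names and sample values are illustrative and not part of the original deck.

```python
from collections import Counter

def grain_violations(rows, grain):
    """Return the grain-key combinations that appear more than once."""
    counts = Counter(tuple(row[col] for col in grain) for row in rows)
    return [key for key, n in counts.items() if n > 1]

# Illustrative rows; the column names are assumptions.
sales = [
    {"day": "2010-07-22", "product": "P1", "store": "S1", "volume": 5},
    {"day": "2010-07-22", "product": "P2", "store": "S1", "volume": 3},
    {"day": "2010-07-22", "product": "P1", "store": "S1", "volume": 2},  # repeats the first key
]
duplicates = grain_violations(sales, grain=("day", "product", "store"))
```

An empty result means every row is uniquely identified by day, product and store, which is what the stated grain promises.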

Dimensions describe the business entities of an enterprise and are often
composed of one or more hierarchies that categorize data. A dimension
contains the textual descriptors of the business, used for filtering,
grouping and labeling. Dimension data is typically collected at the lowest
level of detail and then aggregated into higher-level totals that are more
useful for analysis. These natural rollups or aggregations within a
dimension table are called hierarchies.
In other words, dimension tables contain attributes that describe the
records in the fact table. Some of these attributes provide descriptive
information; others specify how fact table data should be summarized to
provide useful information. In every case a dimension table must contain
one primary key that uniquely identifies each of its records. Examples:
• Product Dimension
• Location Dimension
• Time Dimension
• Customer Dimension

There are various types of dimensions, such as:
• Conformed Dimension
• Junk Dimension
• Degenerate Dimension
• Role-playing Dimension
• Slowly changing Dimension
• Rapidly changing Dimension

Conformed dimensions are those that are either identical to, or a perfect
subset of, the most granular, detailed dimension. Conformed dimensions
have:
• Consistent dimension keys
• Consistent attribute column names
• Consistent attribute definitions
• Consistent attribute values
Dimension tables are not conformed if the attributes are labeled
differently or contain different values. If dimensions such as Product or
Customer are deployed in a non-conformed manner, the different data marts
can't be merged or used together.

The various flavors of conformed dimensions are:
• Exactly the same dimension joined to every relevant fact table across
data marts.
• Conformed dimensions at a rolled-up level of granularity, for example
maintaining a weekly inventory snapshot alongside a daily snapshot, or
maintaining sales facts at the atomic product level and forecasting facts
at the brand level. Roll-up dimensions conform to the base-level atomic
dimension if they are a strict subset of that atomic dimension.
[Figure: Product Dimension (Product Key, Prod Description, Brand
Description, SKU Number, Category) conforms to Brand Dimension (Brand Key,
Description, Category).]
• Conformed dimension subsets at the same granularity.
[Figure: Appliance Products and Apparel Products are subsets of the
Enterprise Product Dimension; drilling across is possible because they
conform.]

In many scenarios, the transactional system tables from which we identify
dimensions contain many miscellaneous indicators and flags, each of which
takes a small range of discrete values. They don't belong in any regular
dimension, and leaving them in the fact table makes it very large. By
creating an abstract dimension, these flags and indicators are removed
from the fact table and placed into a dummy dimension. Such dimensions
are called Junk Dimensions.
For example, we can remove 10 two-value indicators, such as the cash
versus credit payment type, from the order fact table and place them in a
single junk dimension with 1,024 rows (2^10), leaving only one small
surrogate key in the fact table.
Order Indicator Key | Payment Type | Payment Type Group | Inbound/Outbound | Commission Credit Indicator | Order Type Indicator
1                   | Cash         | Cash               | I                | C                           | Regular
2                   | Cash         | Cash               | I                | N                           | Display
3                   | Cash         | Cash               | O                | C                           | Regular
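
One way to populate a junk dimension up front is to cross-join the possible values of every indicator and number the combinations. The sketch below is a minimal illustration: the helper name, the indicator names and the flag values are assumptions, and a real load might instead insert only the combinations actually observed in the source data.

```python
from itertools import product

def build_junk_dimension(indicators):
    """Cross-join every indicator's values and assign a surrogate key per combination."""
    names = list(indicators)
    rows = []
    for key, combo in enumerate(product(*(indicators[n] for n in names)), start=1):
        rows.append({"order_indicator_key": key, **dict(zip(names, combo))})
    return rows

# Three indicators similar to those in the table; the values are illustrative.
small = build_junk_dimension({
    "payment_type": ["Cash", "Credit"],
    "inbound_outbound": ["I", "O"],
    "commission_credit": ["C", "N"],
})

# Ten hypothetical two-value flags give 2**10 combinations.
ten_flags = build_junk_dimension({f"flag_{i}": ["Y", "N"] for i in range(10)})
```

The fact table then carries a single order_indicator_key column instead of ten separate flag columns.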

A degenerate dimension is data that is dimensional in nature but is
stored in the fact table. It has no separate dimension table to join to,
yet it can be used to slice and dice the measures in the fact table.
For example, a dimension key such as a transaction number, invoice number,
ticket number, or bill-of-lading number has no attributes of its own but
provides a direct reference back to the transactional system without the
overhead of maintaining a separate dimension table.
[Figure: Sales Fact table with columns POS Transaction No (DD), Product ID
(FK), Date Key (FK), Store ID (FK), Promotion Key (FK), Sales quantity,
Gross Profit, Cost ($), and Selling Price ($). The foreign keys join to
Product Dim, Date Dim, Store Dim, and Promotion Dim; the degenerate
dimension (DD) has no dimension table.]

Role-playing in a data warehouse occurs when a single dimension appears
several times in the same fact table. The underlying dimension may exist
as a single physical table, but each role should be presented as a
separately labeled view. For example, the Date dimension can serve as the
order date, scheduled shipping date, shipment date, and invoice date in an
order line fact. In the insurance domain, a Customer dimension can serve
as nominee, proposer and beneficiary in a Policy Detail fact.
[Figure: Order Transaction Fact with columns Order Date Key (FK), Shipped
Date (FK), Invoice Date (FK), Product Key (FK), Order No (DD), Order Line
No (DD), Order quantity, Total Amount, Discount Amount, and Net Order
Amount. The three date keys all join to Date Dimension; Product Key joins
to Product Dimension.]
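
The "one physical table, one labeled view per role" idea can be sketched with Python's built-in sqlite3 module. The table, view and column names below are assumptions made for the illustration, and only two of the date roles are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE date_dim (date_key INTEGER PRIMARY KEY, calendar_date TEXT);
-- One physical table, one labeled view per role.
CREATE VIEW order_date_dim AS
    SELECT date_key AS order_date_key, calendar_date AS order_date FROM date_dim;
CREATE VIEW shipped_date_dim AS
    SELECT date_key AS shipped_date_key, calendar_date AS shipped_date FROM date_dim;
CREATE TABLE order_fact (
    order_no TEXT, order_date_key INTEGER, shipped_date_key INTEGER, order_quantity INTEGER
);
INSERT INTO date_dim   VALUES (1, '2010-07-22'), (2, '2010-07-25');
INSERT INTO order_fact VALUES ('A-1', 1, 2, 10);
""")

# Each role is joined through its own view, so column names stay unambiguous.
row = conn.execute("""
    SELECT f.order_no, o.order_date, s.shipped_date
    FROM order_fact AS f
    JOIN order_date_dim   AS o ON o.order_date_key   = f.order_date_key
    JOIN shipped_date_dim AS s ON s.shipped_date_key = f.shipped_date_key
""").fetchone()
```

Because each view renames its columns, report writers see "order date" and "shipped date" as distinct attributes even though both come from the same rows.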

A characteristic of dimensions is that their data is relatively static:
new records may be added, but existing data changes infrequently.
Slowly Changing Dimensions (SCDs) are dimensions whose data changes
slowly and unpredictably over time rather than on a fixed schedule.
How changes to these dimensions are tracked depends on business needs and
can be achieved in several ways. The techniques for managing SCDs are
known as Type 0 through Type 6.
Types of SCDs
Type 0: The dimension is kept in the form in which it was created, and
changes to existing records are ignored. It is a passive approach to
tracking dimension value changes.

Type 1: Overwrite the Value
With Type 1, we overwrite the old attribute value in the dimension row,
replacing it with the current value. The attribute therefore always
reflects the most recent assignment. This is most appropriate for
correcting data errors, such as a misspelled name or address, or when
there is no value in keeping the old description.
Example: Supplier state changes from CA to NY.
Old:
Supplier_Key | Supplier_Code | Supplier Name | Supplier State
123          | XYZ           | Max Trading   | CA
New:
Supplier_Key | Supplier_Code | Supplier Name | Supplier State
123          | XYZ           | Max Trading   | NY
Pros: Easy to maintain & fast.
Cons: No history is kept, and any aggregate fact table built on state
must be recalculated or reloaded.

Type 2: Add a Dimension Row
The Type 2 method tracks historical data by creating multiple records for
a given business key in the dimension table, each with its own surrogate
key and with an effective date/time and/or a version number. Using Type 2,
we can keep the entire history of a record because a new record is
inserted each time a change is made.
The Type 2 response is the primary technique for accurately tracking
slowly changing dimension attributes. It is extremely powerful because
the new dimension row automatically partitions history in the fact table.
Example: Supplier state changes from CA to NY
Type 2: Versioning
Supplier_Key | Supplier_Code | Supplier Name | Supplier State | Version
123          | XYZ           | Max Trading   | CA             | 0
124          | XYZ           | Max Trading   | NY             | 1

Pros:
• History is kept, and an unlimited number of dimension changes can be
tracked
• It perfectly segments fact table history because pre-change fact rows
use the pre-change surrogate key
Cons:
• It contributes to rapid growth of dimension tables
• Database operations such as joins become more expensive
Type 2: Effective Date Stamp Method
Supplier_Key | Supplier_Code | Supplier Name | Supplier State | Start Date | End Date
123          | XYZ           | Max Trading   | CA             | 01/11/2009 | 07/22/2010
124          | XYZ           | Max Trading   | NY             | 07/23/2010 |
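
The effective-date variant of Type 2 can be sketched as follows. This is a minimal in-memory illustration, not a production loader: the function name, the dictionary layout, and the choice to take the next integer as the new surrogate key are all assumptions. It expires the current row the day before the change and appends a new row with an open end date.

```python
from datetime import date, timedelta

def apply_type2_change(dimension, supplier_code, new_state, change_date):
    """Expire the current row for supplier_code and append a new version."""
    current = next(r for r in dimension
                   if r["supplier_code"] == supplier_code and r["end_date"] is None)
    if current["supplier_state"] == new_state:
        return dimension  # nothing changed, keep history as it is
    current["end_date"] = change_date - timedelta(days=1)
    new_row = dict(current,
                   supplier_key=max(r["supplier_key"] for r in dimension) + 1,
                   supplier_state=new_state,
                   start_date=change_date,
                   end_date=None)
    dimension.append(new_row)
    return dimension

supplier_dim = [{
    "supplier_key": 123, "supplier_code": "XYZ", "supplier_name": "Max Trading",
    "supplier_state": "CA", "start_date": date(2009, 1, 11), "end_date": None,
}]
apply_type2_change(supplier_dim, "XYZ", "NY", date(2010, 7, 23))
```

Fact rows loaded before the change keep pointing at the old surrogate key, so history in the fact table is segmented without touching it.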

Type 3: Add a Dimension Column
Type 3 solutions track changes horizontally in the dimension table by
adding new fields to hold the old data. Often only the original and
current values are retained and intermediate values are discarded; in
other designs only the previous and current values are retained. Type 3 is
used when we want to compare current values with original or previous
values in one go, for purposes such as sales force reorganizations.
Pros: Avoids multiple dimension records for a single entity.
Cons: Limited history tracking, and more complex queries to access the
old values.
Supplier_Key | Supplier_Code | Supplier Name | Original Supplier State | Current Supplier State
123          | XYZ           | Max Trading   | CA                      | NY

Predictable Changes with Multiple Version Overlays:
In some cases we need to retain history for 4-5 years, or for a known
number of changes, or we can predict when a given attribute will change.
This is an extension of Type 3 in which we add a column for the changing
attribute each time it changes, while keeping the latest value in the
current value column. For example, if sales rep districts are revised
every year or two, we can design the Sales Rep dimension table as shown
in the figure. If a rep's district changes later, we only need to add
another column labeled District 2010, fill it from the current district,
and overwrite the current district with the new value.
[Figure: Sales Representative Dimension with columns Key, Name, Address,
Current Sales District, District 2009, District 2008, ...]

Unpredictable Changes with Single-Version Overlay:
This approach combines Types 1, 2 and 3 into a single type. It makes
sense when one has to preserve historical accuracy around unpredictable
attribute changes while still being able to report historical data
according to the current values. That is only possible by combining all
three types, as shown below.
In the new dimension row for the supplier, the current state is identical
to the historical state. For all previous rows of that supplier, the
current state attribute is overwritten to reflect the current structure.
Supplier_Key | Supplier_Code | Supplier Name | Historical Supplier State | Current Supplier State | Current Flag | Start Date | End Date
123          | XYZ           | Max Trading   | CA                        | IL                     | N            |            |
124          | XYZ           | Max Trading   | NY                        | IL                     | N            |            |
125          | XYZ           | Max Trading   | IL                        | IL                     | Y            |            |

A dimension is rapidly changing if one or more of its attributes changes
frequently in many rows. For a rapidly changing dimension, the dimension
table can grow very large from the application of numerous Type 2
changes.
For example, a Customer dimension with 10K records and 10 changes per
customer per year grows to 500K records in 5 years (10,000 x 10 x 5),
which is acceptable. In a financial or insurance organization, however,
both the change rate and the customer base are huge, so millions of
records may be added over time. The result is the rapidly changing
monster dimension problem, with browsing performance and change tracking
challenges.

The solution is to break off frequently analyzed or frequently changing
attributes into a separate dimension, referred to as a minidimension, and
to track them as bands. A typical case is a separate dimension of
variable-valued demographic attributes, such as age, gender, number of
children, and income level, presuming that these columns are used
extensively. The minidimension holds one row for each unique combination
of age, gender, number of children, and income level encountered in the
data, not one row per customer (see the table below). Business needs
determine which continuously variable attributes are suitable for
converting to bands.
Demographic Key | Age   | Gender | Income Level
1               | 20-24 | M      | 0-$20000
2               | 20-24 | M      | $20000-$24999
3               | 20-24 | F      | 0-$20000
4               | 25-29 | M      | 0-$20000
5               | 25-29 | F      | 0-$20000
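
Banding and key assignment for a minidimension can be sketched as below. The five-year age bands match the table, but the function names and the in-memory dictionary used as the dimension are assumptions made for illustration. A new row is created only when a combination of bands has not been seen before.

```python
def age_band(age):
    """Map an exact age to a five-year band such as '20-24'."""
    low = (age // 5) * 5
    return f"{low}-{low + 4}"

def demographics_key(minidimension, age, gender, income_band):
    """Return the key for this combination, adding a row only if it is new."""
    combo = (age_band(age), gender, income_band)
    if combo not in minidimension:
        minidimension[combo] = len(minidimension) + 1
    return minidimension[combo]

demo_dim = {}
k1 = demographics_key(demo_dim, 22, "M", "0-$20000")
k2 = demographics_key(demo_dim, 24, "M", "0-$20000")   # same band, same row
k3 = demographics_key(demo_dim, 27, "M", "0-$20000")   # new age band, new row
```

A customer's birthday changes the minidimension key only when it moves the customer into a different band, which is what keeps the row count small.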

Every time we build a fact table row, we include two foreign keys related
to the customer: the regular customer dimension key and the minidimension
demographics key. As shown in the figure below, the demographics key
should be part of the fact table's set of foreign keys in order to
provide efficient access to the fact table through the demographics
attributes.
[Figure: the original Customer Dim (Customer Key (PK), Customer ID (NK),
Customer Name, Address, DOB, ..., Age, Gender, Income, No of Children)
becomes a slimmer Customer Dim (Customer Key (PK), Customer ID (NK),
Customer Name, Address, ...) plus Cust Demo Dim (C Demo Key (PK), C Age
Band, Gender, C Income Band). The Fact Table carries Customer Key (FK), C
Demo Key (FK), more foreign keys, and the facts.]

• The term minidimension applies when the demographics key is part of the
fact table's composite key. If the demographics key is instead a foreign
key in the customer dimension, we call it an outrigger, and it has to be
a Type 1 attribute.
• The best approach for efficiently browsing and tracking changes of key
attributes in really huge dimensions is to break off one or more
minidimensions from the dimension table, each consisting of a small clump
of attributes that have been administered to have a limited number of
values.
• To store exact values instead of bands or ranges, one sometimes needs
to create a factless schema that focuses on attribute changes. In this
case the dimension and minidimension are connected through a dummy fact
table that consists only of keys from the various dimension tables and
has no numeric measurement values.

A fact table is the primary table in a dimensional model, where the
numerical performance measurements of the business are stored. It consists
of two types of columns: those that contain measurements and those that
are foreign keys to dimension tables. This list of dimensions defines the
grain of the fact table and tells us the scope of the measurement.
A row in a fact table corresponds to one measurement, and all the
measurements in a fact table must be at the same grain.
Fact tables also express the many-to-many relationships between
dimensions in dimensional models.
The primary key of a fact table is usually a composite key made up of all
of its foreign keys. Fact tables hold the content of the data warehouse
and store different types of measures: additive, non-additive, and
semi-additive.

On the basis of measure :
Additive
Semi additive
Non additive
Fact less fact or Junction Fact
On the basis of measurement events:
Transactional snapshots
Periodic snapshots
Accumulating snapshots

Additive facts are measurements that can be summed across all of the
dimensions in the fact table. The most useful facts in a fact table are
numeric and additive. In the figure, three of the facts (sales quantity,
sales dollar amount, and cost dollar amount) are fully additive across
all the dimensions. We can slice and dice the fact table freely, and
every sum of these three facts is valid and correct.
[Figure: Retail Sales Transactions Fact with columns Date Key (FK),
Product Key (FK), Store Key (FK), Txn No, Sales Amount ($), Sales
Quantity, Cost Amount ($), Gross Profit ($), and Gross Margin (%).]

Semi-additive facts are measurements that can be summed across some of
the dimensions in the fact table but not all of them. In the figure, Bill
Amount is a semi-additive fact: it can be added up by Date Key or Store
Key to arrive at the total sales amount for a day or for a given store,
but adding it up by Product Key doesn't make sense.
[Figure: Retail Sales Transactions Fact with columns Date Key (FK),
Product Key (FK), Store Key (FK), Txn No, Sales Amount ($), Sales
Quantity, Cost Amount ($), Gross Profit ($), Gross Margin (%), and Bill
Amount ($).]

Non-additive facts are measurements that cannot be summed across any of
the dimensions in the fact table. Percentages and ratios, such as gross
margin, are non-additive. The numerator and denominator should be stored
in the fact table. The ratio can then be calculated in a data access tool
for any slice of the fact table by remembering to calculate the ratio of
the sums, not the sum of the ratios. In the figure, Unit Price and Gross
Margin are non-additive facts because they can't be summed along any
dimension.
[Figure: Retail Sales Transactions Fact with columns Date Key (FK),
Product Key (FK), Store Key (FK), Txn No, Sales Amount ($), Gross Profit
($), Gross Margin (%), and Unit Price ($).]
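
The "ratio of the sums, not the sum of the ratios" rule can be illustrated with a small calculation. The two fact rows and the function name below are made up for the example; gross margin is taken here as gross profit divided by sales amount.

```python
def gross_margin(rows):
    """Ratio of the sums: total gross profit divided by total sales amount."""
    total_profit = sum(r["gross_profit"] for r in rows)
    total_sales = sum(r["sales_amount"] for r in rows)
    return total_profit / total_sales

rows = [
    {"sales_amount": 100.0, "gross_profit": 50.0},   # row-level margin 0.50
    {"sales_amount": 300.0, "gross_profit": 30.0},   # row-level margin 0.10
]
correct = gross_margin(rows)   # 80 / 400
naive_average = sum(r["gross_profit"] / r["sales_amount"] for r in rows) / len(rows)
```

Here the ratio of the sums is 80 / 400, while averaging the row-level margins gives a different and misleading figure because it ignores how much each row sold. This is why the numerator and denominator are stored rather than the ratio alone.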

A factless fact table has no measurement metrics; it only captures
many-to-many relationships between the dimension keys involved. It is
most often used to represent events or to provide coverage information
that does not appear in other fact tables. For example, determining which
products were on promotion but didn't sell requires a separate promotion
coverage fact table. We load one row into it for each product on
promotion in a store each day (or week, since many retail promotions last
a week), regardless of whether the product sold. This table is a factless
fact table because it contains no measurements.
[Figure: Promotion Coverage Fact with columns Date Key (FK), Product Key
(FK), Store Key (FK), and Promotion Key (FK), joined to Time Dim, Product
Dim, Store Dim, and Promotion Dim.]
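
The "on promotion but didn't sell" question reduces to a set difference between the coverage table and the sales fact table. The sketch below assumes both tables are lists of dictionaries keyed by date, product and store; the function names, column names and sample rows are hypothetical.

```python
def coverage_key(row):
    """The columns shared by the coverage table and the sales fact table."""
    return (row["date_key"], row["product_key"], row["store_key"])

def promoted_but_unsold(coverage_rows, sales_rows):
    """Keys present in the coverage fact table but absent from the sales fact table."""
    sold = {coverage_key(r) for r in sales_rows}
    return sorted(coverage_key(r) for r in coverage_rows if coverage_key(r) not in sold)

coverage = [
    {"date_key": 1, "product_key": "P1", "store_key": "S1", "promotion_key": 7},
    {"date_key": 1, "product_key": "P2", "store_key": "S1", "promotion_key": 7},
]
sales = [
    {"date_key": 1, "product_key": "P1", "store_key": "S1", "sales_quantity": 4},
]
unsold = promoted_but_unsold(coverage, sales)
```

The sales fact table alone cannot answer this question, because a product that did not sell leaves no row there.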

Facts from multiple fact tables are conformed when the technical
definitions of the facts are equivalent. Conformed facts are allowed
to have the same name in separate tables and can be combined and
compared mathematically. If facts do not conform, then the different
interpretations must be given different names.

A transactional fact table represents a point in time in the life of
business events. A row exists in the fact table for a given dimension
only if a transaction event occurred. A transactional fact table holds
data at the most detailed level, so it tends to have a large number of
dimensions associated with it. Once a transaction has been posted, its
row needn't be revisited.
[Figure: Transactional Fact Table with columns Txn No, Item ID (FK), Store
ID (FK), Date Key (FK), Promotional Code (FK), Bill No (DD), Count of
Item, Cost, Profit, Selling Price, Tax Amount, and Discount.]

A periodic snapshot fact table represents a predefined interval or
period. Unlike the transaction fact table, with the periodic snapshot we
take a picture (hence the snapshot terminology) of the activity at the
end of a day, week, or month, then another picture at the end of the next
period, and so on. The periodic snapshots are stacked consecutively into
the fact table. Daily snapshots and monthly snapshots are common. A
separate record is placed in a periodic snapshot fact table each period,
regardless of whether any activity has taken place in the underlying
transactions.
[Figure: Periodic Fact Table with columns Item ID (FK), Store ID (FK),
Date Key (FK), Cost ($), Last Selling Price ($), Quantity on hand, and
Quantity Sold.]

An accumulating snapshot fact table represents business activities over
a time period. Accumulating snapshots almost always have multiple date
stamps, representing the predictable major events or phases that take
place over the lifetime of the item being tracked. Often there is an
additional date column that indicates when the snapshot row was last
updated. Since many of these dates are not known when the fact row is
first loaded, we must use surrogate date keys to handle undefined dates.
The fact table is revisited and updated as activity occurs. A record is
inserted into an accumulating snapshot fact table just once, when the
item it represents is first created; after that it is only updated with
new date values.
[Figure: Accumulating Snapshot Fact Table with columns Txn No, Order Date
Key (FK), Backlog Date Key (FK), Release to Manufacturing Date Key (FK),
Finished Inventory Placement Date Key (FK), Requested Ship Date Key (FK),
Scheduled Ship Date Key (FK), Actual Ship Date Key (FK), Arrival Date Key
(FK), Invoice Date Key (FK), ...]

A bridge table is a table with a multipart key that captures a
many-to-many relationship which can't be accommodated by the natural
granularity of a single fact table or single dimension table. It serves
as a bridge between the fact table and the dimension table in order to
allow many-valued dimensions or ragged hierarchies. It is sometimes
referred to as a helper or associative table. When using a bridge table,
the facts in the fact table are multiplied by the bridge table's
weighting factor to allocate the facts appropriately to the multivalued
dimension. A report that applies the weighting factor is called a
weighted report; a report that leaves it out is called an impact report.
[Figure: Health Care Billing Line Item Fact joins to Diagnosis Group Dim
(Diagnosis Group (PK)), which joins to Diagnosis Group Bridge (Diagnosis
Group Key (FK), Diagnosis Key (FK), Weighted Factor), which joins to
Diagnosis Dimension.]
Hierarchies are logical structures that use ordered levels as a means of
organizing data. A hierarchy can be used to define data aggregation
(lower levels aggregated and rolled up to higher levels). For example, in
a time dimension, a hierarchy might aggregate data from the month level
to the quarter level to the year level. A hierarchy can also be used to
define a navigational drill path and to establish a family structure.
• A level represents a position in a hierarchy.
• Level relationships specify the top-to-bottom ordering of levels from
the root to the leaf nodes.
• Each level is logically connected to the levels above (parent) and
below (children) it.
A dimension can be composed of more than one hierarchy. For example, in
the product dimension, there might be two hierarchies: one for product
categories and one for product suppliers.

Types of hierarchies:
• Balanced
• Unbalanced
• Ragged
• Parent-child relationship
[Figure: sample organization chart containing CEO, CIO, COO, Fin Head, IT
Head, HR Head, and employees Emp 1 to Emp 4.]

In balanced (standard) hierarchies, the branches of the hierarchy all
descend to the same level, with each member's parent being at the level
immediately above the member. They are consistent because each level
represents the same type of information, and each level is logically
equivalent. A common example of a balanced hierarchy is one that
represents time, where the depth of each level (year, quarter, and month)
is consistent.
[Figure: 2009 > 1st Quarter > Jan, Feb, Mar, and 2010 > 1st Quarter > Jan,
Feb, Mar.]

Unbalanced hierarchies include levels that have a consistent parent-child
relationship but are logically inconsistent. The hierarchy branches can
also have inconsistent depths. For example, an organizational structure
is unbalanced, with some branches in the hierarchy having more levels
than others. In an unbalanced hierarchy, null values can appear on the
lower levels of the hierarchy.
[Figure: organization chart with branches of different depths, containing
CEO, CIO, COO, CFO, IT Head, Admin 1, Admin 2, Fin 1, Fin 2, and employees
Emp 1 to Emp 3.]

In a ragged hierarchy, the logical parent of at least one member is not
in the level immediately above the member. This can cause branches of the
hierarchy to descend to different levels. A ragged hierarchy can
represent a geographic hierarchy in which the meaning of each level, such
as city or country, is used consistently but the depth of the hierarchy
varies.
[Figure: North America > US > CA > San Francisco, alongside Europe >
Greece > Athens, where the state level is skipped.]

A parent-child hierarchy is a hierarchy with multiple levels that tracks
the relationships within the hierarchy. A single table or view is used to
represent the parent-child hierarchy. If this kind of hierarchy spans
multiple tables, a view can be used to flatten the structure. The top
level uses the parent key as the level key, whereas the bottom level
contains the child key. For example, in a hierarchy that represents an
organizational structure, you can have two levels: Manager and Employee.
The Manager level is the parent level, and the Employee level is the
child level.
