Data warehousing

The Organisation As A System An information management framework The Performance Organiser Data Warehousing

A data warehouse is a repository of an organization's electronically stored data, designed to facilitate reporting and analysis. The term is sometimes used interchangeably with “data mart”, although a data mart is more strictly a smaller, subject-specific subset of a warehouse.

Perhaps the two best-known forms of data stored in a data warehouse are databases and document folders. Databases hold data in rows and columns across related tables. Document folders (for example 01 - Design, 02 - Accounts, 03 - Production) hold a series of files, in multiple formats, stored in a directory structure.

Both can be analysed, and tools exist to search and collate each of them, but the sheer volume of data contained in either or both can turn any analysis effort into a complex and time-consuming exercise.

As a consequence, there is a need for a third type of data storage that holds the results of analysing the bulk data while still providing the means to “drill down” into the main data stores if required.

That third form is known as the “fact table”, and it enables the concept of “Online Analytical Processing” (OLAP).

A fact table consists of the measurements, metrics or facts of a business process. Fact tables have their own structure or schema. Often, when drawn, the schema takes the shape of a star or snowflake, with the fact table surrounded by dimension tables, which are mathematically based summaries of the main data tables.

Fact tables provide the (usually) additive values that act as independent variables by which dimensional attributes are analyzed. Fact tables are often defined by their grain. The grain of a fact table represents the most atomic level by which the facts may be defined. The grain of a SALES fact table might be stated as "Sales volume by Day by Product by Store". Each record in this fact table is therefore uniquely defined by a day, product and store. Other dimensions might be members of this fact table (such as location/region), but these add nothing to the uniqueness of the fact records. These "affiliate dimensions" allow for additional slices of the independent facts but generally provide insights at a higher level of aggregation (a region contains many stores).
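The idea of grain can be illustrated with a minimal sketch. It assumes a handful of made-up SALES rows at the grain "day by product by store"; the row values and the helper names grain_violations and volume_by are hypothetical and are not part of the original slides.

```python
from collections import Counter
from datetime import date

# Hypothetical SALES fact rows at the grain "day by product by store".
# Region is an affiliate dimension: it adds a slice but not uniqueness.
sales_facts = [
    {"day": date(2005, 1, 15), "product": "P1", "store": "A", "region": "North", "sales_volume": 12},
    {"day": date(2005, 1, 15), "product": "P2", "store": "A", "region": "North", "sales_volume": 7},
    {"day": date(2005, 1, 15), "product": "P1", "store": "B", "region": "North", "sales_volume": 9},
    {"day": date(2005, 1, 16), "product": "P1", "store": "C", "region": "South", "sales_volume": 4},
]

GRAIN = ("day", "product", "store")


def grain_violations(rows, grain=GRAIN):
    """Return the grain keys that occur in more than one row."""
    counts = Counter(tuple(row[col] for col in grain) for row in rows)
    return [key for key, n in counts.items() if n > 1]


def volume_by(rows, dimension):
    """Sum the additive sales_volume measure over one dimension."""
    totals = {}
    for row in rows:
        totals[row[dimension]] = totals.get(row[dimension], 0) + row["sales_volume"]
    return totals


violations = grain_violations(sales_facts)   # empty when the grain holds
by_region = volume_by(sales_facts, "region")  # a higher-level slice
```

Region is left out of GRAIN on purpose: it offers an extra way to slice the facts, but each row is already uniquely identified by day, product and store.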

Additive: measures that can be added across all dimensions.
Non-additive: measures that cannot be added across any dimension.
Semi-additive: measures that can be added across some dimensions but not others.
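A short sketch may make the distinction concrete. The rows and the measure names (units_sold, stock_on_hand, unit_price) are assumptions chosen for illustration only: units sold is treated as additive, stock on hand as semi-additive, and unit price as non-additive.

```python
# Hypothetical rows. units_sold is additive; stock_on_hand is semi-additive
# (it can be summed across stores but not across days); unit_price is
# non-additive (summing prices is not meaningful on any dimension).
rows = [
    {"day": 1, "store": "A", "units_sold": 10, "stock_on_hand": 100, "unit_price": 2.5},
    {"day": 1, "store": "B", "units_sold": 5, "stock_on_hand": 40, "unit_price": 2.5},
    {"day": 2, "store": "A", "units_sold": 20, "stock_on_hand": 80, "unit_price": 2.5},
    {"day": 2, "store": "B", "units_sold": 15, "stock_on_hand": 25, "unit_price": 2.5},
]

# Additive: a plain sum over every row is valid.
total_units_sold = sum(r["units_sold"] for r in rows)

# Semi-additive: sum across stores within each day, but never across days.
stock_per_day = {}
for r in rows:
    stock_per_day[r["day"]] = stock_per_day.get(r["day"], 0) + r["stock_on_hand"]

# The figure reported for the period is the stock on the latest day.
closing_stock = stock_per_day[max(stock_per_day)]
```

Adding the two daily stock totals together would count the same stock twice, which is why the time dimension is excluded from the sum.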

A fact table might contain either detail-level facts or facts that have been aggregated (fact tables that contain aggregated facts are often called summary tables instead). Special care must be taken when handling ratios and percentages. One good design rule is never to store percentages or ratios in fact tables, but to calculate them only at the business logic or presentation level. Store only the numerator and denominator in the fact table; these can then be aggregated, and the aggregated values used to calculate the ratio or percentage at the business logic or presentation level.
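The reason for this rule can be shown with a small sketch. The figures and field names (widgets_produced, widgets_unfit) are invented for illustration: the rate calculated from the summed numerator and denominator differs from the average of the stored daily rates.

```python
# Hypothetical daily facts: only the numerator and denominator are stored.
daily_facts = [
    {"day": "Mon", "widgets_produced": 1000, "widgets_unfit": 10},
    {"day": "Tue", "widgets_produced": 100, "widgets_unfit": 10},
]


def unfit_rate(rows):
    """Aggregate numerator and denominator first, then divide."""
    produced = sum(r["widgets_produced"] for r in rows)
    unfit = sum(r["widgets_unfit"] for r in rows)
    return unfit / produced if produced else None


# Ratio calculated at the presentation level: 20 / 1100.
correct_rate = unfit_rate(daily_facts)

# Averaging per-day rates ignores the different production volumes.
daily_rates = [r["widgets_unfit"] / r["widgets_produced"] for r in daily_facts]
naive_rate = sum(daily_rates) / len(daily_rates)
```

With these numbers the aggregated rate is about 1.8%, while the average of the daily rates is 5.5%, because the low-volume day is given the same weight as the high-volume day.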

Fact table design approach:
1. Identify a business process for analysis (like sales).
2. Identify measures or facts (sales value) by asking questions like "what 'number of' XX is relevant for the business process?" (replace the XX and test whether the question makes sense business-wise).
3. Identify dimensions for facts (product dimension, location dimension, time dimension, organization dimension) by asking questions that make sense business-wise, like "analyse by XX", where XX is replaced with the subject to test.
4. List the columns that describe each dimension (region name, branch name, business unit name).
5. Determine the lowest level (granularity) of summary in a fact table (e.g. sales).

If the business process is SALES, then the corresponding fact table will typically contain columns representing both raw facts and aggregations, in rows such as:
£12,000, being "sales for A store for 15-Jan-2005"
£34,000, being "sales for B store for 15-Jan-2005"
£22,000, being "sales for C store for 16-Jan-2005"
£50,000, being "sales for D store for 16-Jan-2005"
£21,000, being "average daily sales for A store for Jan-2005"
£65,000, being "average daily sales for B store for Feb-2005"
£33,000, being "average daily sales for C store for year 2005"
Here "average daily sales" is a measurement that is stored in the fact table.

The fact table also contains foreign keys from the dimension tables, where time series (e.g. dates) and other dimensions (e.g. store location, salesperson, product) are stored. All foreign keys between fact and dimension tables should be surrogate keys, not reused keys from operational data. The centralized table in a star schema is called a fact table. A fact table typically has two types of columns: those that contain facts and those that are foreign keys to dimension tables. The primary key of a fact table is usually a composite key made up of all of its foreign keys. Fact tables contain the content of the data warehouse and store different types of measures: additive, non-additive and semi-additive.
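A minimal star schema along these lines might be sketched with Python's built-in sqlite3 module. The table and column names (dim_date, dim_store, fact_sales and their columns) and the region values are assumptions for illustration; the four sales figures reuse the daily store sales quoted earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE dim_date (
    date_key      INTEGER PRIMARY KEY,   -- surrogate key
    calendar_date TEXT NOT NULL
);
CREATE TABLE dim_store (
    store_key   INTEGER PRIMARY KEY,     -- surrogate key
    store_code  TEXT NOT NULL,           -- key used by the operational system
    region_name TEXT NOT NULL            -- hypothetical descriptive column
);
CREATE TABLE fact_sales (
    date_key    INTEGER NOT NULL REFERENCES dim_date(date_key),
    store_key   INTEGER NOT NULL REFERENCES dim_store(store_key),
    sales_value REAL NOT NULL,           -- the fact (an additive measure)
    PRIMARY KEY (date_key, store_key)    -- composite key of the foreign keys
);
""")

conn.executemany("INSERT INTO dim_date VALUES (?, ?)",
                 [(1, "2005-01-15"), (2, "2005-01-16")])
conn.executemany("INSERT INTO dim_store VALUES (?, ?, ?)",
                 [(1, "A", "North"), (2, "B", "North"),
                  (3, "C", "South"), (4, "D", "South")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                 [(1, 1, 12000.0), (1, 2, 34000.0),
                  (2, 3, 22000.0), (2, 4, 50000.0)])

# Analyse the fact by a dimension attribute by joining on the surrogate key.
sales_by_region = conn.execute("""
    SELECT s.region_name, SUM(f.sales_value)
    FROM fact_sales AS f
    JOIN dim_store AS s ON s.store_key = f.store_key
    GROUP BY s.region_name
    ORDER BY s.region_name
""").fetchall()
```

The composite primary key means a second row for the same date and store is rejected, and the foreign keys mean a fact cannot refer to a store or date that is missing from its dimension table.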

Fact table data provides the primary data feed for KPI reporting and monitoring. From KPIs come the status indicators for higher-level monitoring mechanisms like scorecards and dashboards.

Figure: Single KPI dashboard. Current achievable mean = 22; possible achievable mean = 28; flag state = Green. The chart plots a qualitative or quantitative scale against time (months J to D), marking the best, worst and achievable mean values. For each indicator provide additional documentary evidence.

Figure: example KPI measures: number of widgets produced and number of widgets unfit for purpose.

Collating and summarising facts from the main table data means running additional processes (typically outside normal working hours), which in turn introduces a delay between the collation exercise and the data being ready for delivery at the presentation or dashboard level. However, the speed of response for reporting purposes will be greatly enhanced.
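The collation step described above can be sketched as a small batch function. The detail rows and the function name build_monthly_summary are hypothetical, and the sketch assumes that "average daily sales" means total sales divided by the number of days with recorded sales.

```python
from collections import defaultdict
from datetime import date

# Hypothetical detail rows: (day, store, sales value).
detail_sales = [
    (date(2005, 1, 15), "A", 12000),
    (date(2005, 1, 16), "A", 30000),
    (date(2005, 1, 15), "B", 34000),
]


def build_monthly_summary(rows):
    """Collate detail rows into average daily sales per store per month."""
    totals = defaultdict(lambda: [0, set()])  # key -> [total value, days seen]
    for day, store, value in rows:
        bucket = totals[(store, day.year, day.month)]
        bucket[0] += value
        bucket[1].add(day)
    return {key: total / len(days) for key, (total, days) in totals.items()}


# Run as a batch job (for example overnight); reports then read the small
# pre-computed summary instead of scanning every detail row.
summary = build_monthly_summary(detail_sales)
avg_a_jan = summary[("A", 2005, 1)]
```

The summary is only as fresh as the last batch run, which is the delay noted above, but a dashboard lookup becomes a single dictionary or table read.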

A data warehouse typically consists of three data forms. Two, the databases and document libraries, contain the bulk of an organisation's data. The third form, fact tables, contains summary data, usually of the database content, whose primary function is to provide accurate, timely analysis. Fact tables should provide the primary reporting source for KPIs.

While fact tables present their own information management issues, they are one of the key tools in an information manager's armoury for facilitating decision support. Fact tables can be further supported by techniques like pattern recognition, but in the majority of circumstances a mix of fact tables and bulk data stores, linked by a common referencing system, will satisfy the most significant reporting requirements information managers face.
