Designing and developing Business Process dimensional Model or Data Warehouse
Published in: Technology, Business
  • The dimensions reflect the business processes (functional structure), and the measures reflect the numeric data flow. A dimensional model is made up of a central fact table (or tables) and its associated dimensions. The dimensional model is also called a star schema because it looks like a star, with the fact table in the middle and the dimensions serving as the points of the star. From a relational data modeling perspective, the dimensional model consists of a normalized fact table with denormalized dimension tables.
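As a rough illustration of the star layout described in this note, the sketch below builds one fact table and two dimension tables in an in-memory SQLite database and runs a typical query: constrain and group by a dimension attribute, sum a fact. All table names, column names, and data values are hypothetical and were chosen only for this example.

```python
import sqlite3

# Hypothetical star schema: one fact table in the middle, two dimensions around it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_key   INTEGER PRIMARY KEY,   -- surrogate key
    product_name  TEXT,
    product_group TEXT                   -- denormalized: group stored on the product row
);
CREATE TABLE dim_date (
    date_key      INTEGER PRIMARY KEY,
    calendar_date TEXT,
    month         TEXT
);
CREATE TABLE fact_sales (
    product_key   INTEGER REFERENCES dim_product(product_key),
    date_key      INTEGER REFERENCES dim_date(date_key),
    quantity      INTEGER,
    sales_amount  REAL
);
INSERT INTO dim_product VALUES
    (1, 'Granola Mini-Pak', 'Cereal'),
    (2, 'Granola Jumbo',    'Cereal'),
    (3, 'Cola',             'Beverage');
INSERT INTO dim_date VALUES
    (20240101, '2024-01-01', '2024-01'),
    (20240201, '2024-02-01', '2024-02');
INSERT INTO fact_sales VALUES
    (1, 20240101, 10, 25.0),
    (2, 20240101,  4, 30.0),
    (3, 20240201, 20, 40.0),
    (1, 20240201,  6, 15.0);
""")

def sales_by_group(connection):
    """Join the fact table to a dimension and sum an additive fact per group."""
    return connection.execute("""
        SELECT p.product_group, SUM(f.sales_amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.product_group
        ORDER BY p.product_group
    """).fetchall()

print(sales_by_group(conn))
```

The fact table holds only keys and measures, while the descriptive text lives in the dimensions, which is the shape the note calls a star.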
  • Think of dimensions as tables in a database, because that is how they are implemented. Each table contains a list of homogeneous entities: products in a manufacturing company, patients in a hospital, vehicles on auto insurance policies, or customers in just about every organization. Usually, a dimension includes all instances of its entity (all the products the company sells, for example). There is only one active row for each particular instance in the table at any time, and each row has a set of attributes that identify, describe, define, and classify the instance. A product will have a certain size and a standard weight, and belong to a product group. These sizes and groups have descriptions; a food product, for example, might come in Mini-Pak or Jumbo size. A vehicle is painted a certain color, like white, and has a certain option package, such as the Jungle Jim sports utility package (which includes side impact air bags, six-disc CD player, DVD system, and simulated leopard skin seats).
  • Most facts are numeric and each fact value can vary widely depending on the business process being measured. Most facts are additive (such as dollar or unit sales), meaning they can be summed up across all dimensions. Additivity is important because DW/BI applications seldom retrieve a single fact table record. User queries generally select hundreds or thousands of records at a time and add them up. Other facts are semi-additive (such as market share or account balance), and still others are non-additive (such as unit price).

    Not all numeric data are facts. Exceptions include discrete descriptive information like package size or weight (describes a product) or customer age (describes a customer). Generally, these less volatile numeric values end up as descriptive attributes in dimension tables. Such descriptive information is more naturally used for constraining a query, rather than being summed in a computation. This distinction is helpful when deciding whether a data element is part of a dimension or fact.

    Some business processes track events without any real measures. If the event happens, we get an entry in the source system; if not, there is no row. Common examples of this kind of event include employment activities, such as hiring and firing, and event attendance, such as when a student attends a class. The fact tables that track these events typically do not have any actual fact measurements, so they're called factless fact tables. Actually, we usually add a column called something like EventCount that contains the number 1. This provides users with an easy way to count the number of events by summing the EventCount fact.

    Some facts are derived or computed from other facts, just as a Net Sale number is calculated from Gross Sales minus Sales Tax. Some semi-additive facts can be handled using a derived column that is based on the context of the query. Month End Balance would add up across accounts, but not across dates, for example. The non-additive Unit Price example could be avoided by defining it as a computation done in the query, which is Total Amount divided by Total Quantity.

    There are several options for dealing with these derived or computed facts. You can calculate them as part of the ETL process and store them in the fact table, you can put them in the fact table view definition, or you can include them in the definition of the Analysis Services database. The only way we find unacceptable is to leave the calculation to the user.
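The distinctions in this note can be made concrete with a small sketch. It assumes hypothetical in-memory fact rows (the products, accounts, dates, and amounts are invented) and shows an additive fact summed freely, Unit Price derived as Total Amount divided by Total Quantity, a balance summed across accounts but not across dates, and an EventCount column used to count rows of a factless fact table.

```python
# Hypothetical fact rows; every name and number here is made up for illustration.
sales = [
    {"date": "2024-01-31", "product": "A", "quantity": 10, "amount": 50.0},
    {"date": "2024-01-31", "product": "B", "quantity": 5,  "amount": 100.0},
    {"date": "2024-02-29", "product": "A", "quantity": 25, "amount": 90.0},
]

def total(rows, measure):
    """Additive facts can simply be summed across any set of rows."""
    return sum(row[measure] for row in rows)

def unit_price(rows):
    """Non-additive fact, derived at query time: Total Amount / Total Quantity."""
    return total(rows, "amount") / total(rows, "quantity")

# Semi-additive fact: balances add up across accounts, but not across dates.
balances = [
    {"date": "2024-01-31", "account": "acct1", "balance": 100.0},
    {"date": "2024-01-31", "account": "acct2", "balance": 250.0},
    {"date": "2024-02-29", "account": "acct1", "balance": 120.0},
    {"date": "2024-02-29", "account": "acct2", "balance": 300.0},
]

def balance_on(rows, day):
    """Sum across accounts for a single date."""
    return sum(row["balance"] for row in rows if row["date"] == day)

def closing_balance(rows):
    """Across dates, take the balance at the latest date instead of summing."""
    latest = max(row["date"] for row in rows)  # ISO date strings sort chronologically
    return balance_on(rows, latest)

# Factless fact table: each row records that an event happened; EventCount is always 1.
attendance = [
    {"student": "s1", "class": "math", "event_count": 1},
    {"student": "s2", "class": "math", "event_count": 1},
    {"student": "s1", "class": "art",  "event_count": 1},
]
math_attendance = total([r for r in attendance if r["class"] == "math"], "event_count")

print(total(sales, "amount"), unit_price(sales))
print(balance_on(balances, "2024-01-31"), closing_balance(balances))
print(math_attendance)
```

Putting `unit_price` and `closing_balance` in shared code (or, in a real warehouse, in the ETL, a view, or the cube definition) is one way to avoid leaving the calculation to the user.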
  • A surrogate key is a unique value, usually an integer, assigned to each row in the dimension. This surrogate key becomes the primary key of the dimension table and is used to join the dimension to the associated foreign key field in the fact table. Surrogate keys protect the DW/BI system from changes in the source system. Surrogate keys allow the DW/BI system to integrate data from multiple source systems. Different source systems might keep data on the same customers or products, but with different keys. Surrogate keys enable you to add rows to dimensions that do not exist in the source system. Surrogate keys provide the means for tracking changes in dimension attributes over time.
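A minimal sketch of surrogate key assignment follows. It assumes a hypothetical `DimensionKeyMap` helper that hands out integer keys per (source system, business key) pair. The source system names, the second business key, and the use of -1 for an "unknown" row that exists in no source system are assumptions made for this example, not something the slides specify.

```python
import itertools

class DimensionKeyMap:
    """Hypothetical helper: maps (source_system, business_key) to an integer surrogate key."""

    UNKNOWN_KEY = -1  # assumed convention for a placeholder row absent from every source

    def __init__(self):
        self._next_key = itertools.count(1)
        self._keys = {}

    def get_or_create(self, source_system, business_key):
        """Return the existing surrogate key, or assign the next integer."""
        natural = (source_system, business_key)
        if natural not in self._keys:
            self._keys[natural] = next(self._next_key)
        return self._keys[natural]

    def lookup(self, source_system, business_key):
        """Used when loading facts: fall back to the unknown row if there is no match."""
        return self._keys.get((source_system, business_key), self.UNKNOWN_KEY)

keys = DimensionKeyMap()
crm_key = keys.get_or_create("crm", "MS-1981")   # business key from one source system
erp_key = keys.get_or_create("erp", "000451")    # a different key format from another source
same_key = keys.get_or_create("crm", "MS-1981")  # repeated loads reuse the same key
missing = keys.lookup("crm", "ZZ-0000")          # no dimension row yet

print(crm_key, erp_key, same_key, missing)
```

Because fact rows store only the integer, a change in how a source system formats its business keys affects this mapping and not the fact table.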
  • Slowly changing dimensions (SCDs) are dimensions whose attribute values can change over time. There are three types of SCD. Type 1 overwrites the existing attribute value with the new value, so it does not preserve the attribute value that was in place at the time a historical transaction occurred. Type 2 change tracking is a powerful technique for capturing the attribute values that were in effect at a point in time and relating them to the business events in which they participated. When a change to a Type 2 attribute occurs, the ETL process creates a new row in the dimension table to capture the new values of the changed item. Type 3 keeps separate columns for the old and new attribute values; it is less common because it involves changing the physical tables and is not very scalable.
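The three SCD types can be sketched against a tiny in-memory employee dimension. The LastName and SalesTerritoryKey changes echo the slide examples; the row layout (`start_date`, `end_date`, `is_current`), the business key `E100`, the `title` attribute, and all dates are assumptions introduced for illustration only.

```python
from datetime import date

def apply_type1(rows, business_key, attribute, new_value):
    """Type 1: overwrite the attribute on every row for the entity; history is lost."""
    for row in rows:
        if row["business_key"] == business_key:
            row[attribute] = new_value

def apply_type2(rows, business_key, attribute, new_value, effective):
    """Type 2: expire the current row and insert a new versioned row."""
    current = next(r for r in rows
                   if r["business_key"] == business_key and r["is_current"])
    current["is_current"] = False
    current["end_date"] = effective
    new_row = dict(current)  # copy all attributes, then override what changed
    new_row.update({
        attribute: new_value,
        "surrogate_key": max(r["surrogate_key"] for r in rows) + 1,
        "start_date": effective,
        "end_date": None,
        "is_current": True,
    })
    rows.append(new_row)
    return new_row["surrogate_key"]

def apply_type3(rows, business_key, attribute, new_value):
    """Type 3: keep the old value in a separate 'previous_' column on the current row."""
    for row in rows:
        if row["business_key"] == business_key and row["is_current"]:
            row["previous_" + attribute] = row[attribute]
            row[attribute] = new_value

# Hypothetical starting state of the dimension.
dim_employee = [{
    "surrogate_key": 1, "business_key": "E100",
    "last_name": "Valdez", "sales_territory_key": 4, "title": "Rep",
    "start_date": date(2020, 1, 1), "end_date": None, "is_current": True,
}]

apply_type1(dim_employee, "E100", "last_name", "Valdez-Smythe")
new_key = apply_type2(dim_employee, "E100", "sales_territory_key", 10, date(2024, 6, 1))
apply_type3(dim_employee, "E100", "title", "Manager")

for row in dim_employee:
    print(row)
```

After the Type 2 change, facts loaded before the effective date still point at surrogate key 1 (territory 4), while new facts would use the key of the new row (territory 10).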
  • Transcript

    • 1. Designing and developing Business Process dimensional Model or Data Warehouse
    • 2. About Me Slava Kokaev – Lead Business Intelligence Architect at Industrial Defender Boston BI USER GROUP leader email: vkokaev@boston bi.org web:
    • 3. Business Process Dimensional Model or “Star Schema” Database
    • 4. Dimensions
    • 5. Fact Tables
    • 6. Surrogate Keys Using a surrogate key is considered best practice
    • 7. Surrogate Keys Implementation MS-1981 163MS-1981 Surrogate Key Business Key
    • 8. Best Practices
    • 9. Snowflaking
    • 10. Reviewing Star Schema Benefits
    • 11. OLTP vs. OLAP
    • 12. Slowly Changing Dimensions. Support the primary role of the data warehouse: to describe the past accurately. Maintain historical context as new or changed data is loaded into dimension tables. Slowly Changing Dimension (SCD) types: Type 1, overwrite the existing dimension record; Type 2, insert a new ‘versioned’ dimension record; Type 3, track limited history with attributes. The concept of Slowly Changing Dimensions was introduced by Ralph Kimball.
    • 13. Slowly Changing Dimensions Type 1 LastName update to Valdez-Smythe
    • 14. SalesTerritoryKey update to 10
    • 15. Slowly Changing Dimensions Type 2 SalesTerritoryKey update to 10
    • 16. Resources
