Warehousing dimension star-snowflake_schemas
 


High level presentation on the use of dimensions, star and snowflake schemas in data warehousing.

    Presentation Transcript

    • Data Warehousing – Dimensions | Star and Snowflake Schemas
      • Eric Matthews - DataWithUs
    • Defining Some Key Terms
      • Dimension
        • Data element
        • Categorizes each item in a data set
        • Provides structured labeling/tagging
        • Dimensions can consist of hierarchies. For example: Date | Month, Quarter, Year
        • Dimension tables contain the keys that fact tables join to (the dimension's primary key is stored in the fact table as a foreign key)
      • Dimension – Primary Role
        • Data filtering
        • Data grouping
        • Data labeling
      • Fact
        • A measured, counted, or aggregated event. For example: Sales, Admissions, Blood Pressure, and Inventory can all be construed as “facts”
        • Fact tables contain the appropriate joining keys
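
To make the three dimension roles concrete, here is a minimal sketch in plain Python. The date keys, attribute names, and sales figures are hypothetical and exist only for illustration; the point is that the fact rows carry a measure plus a key, while the dimension supplies the attributes used to filter, group, and label.

```python
from collections import defaultdict

# Hypothetical date dimension: one row per date key, with the hierarchy
# (month -> quarter -> year) stored as plain attributes.
date_dim = {
    20240115: {"month": "2024-01", "quarter": "2024-Q1", "year": 2024},
    20240220: {"month": "2024-02", "quarter": "2024-Q1", "year": 2024},
    20240410: {"month": "2024-04", "quarter": "2024-Q2", "year": 2024},
}

# Hypothetical fact rows: a joining key into the dimension plus one measure.
sales_facts = [
    {"date_key": 20240115, "sales": 100.0},
    {"date_key": 20240220, "sales": 250.0},
    {"date_key": 20240410, "sales": 75.0},
]


def total_by(level, year):
    """Filter facts to one year, then group and label them by a hierarchy level."""
    totals = defaultdict(float)
    for fact in sales_facts:
        attrs = date_dim[fact["date_key"]]     # join: fact key -> dimension row
        if attrs["year"] != year:              # data filtering
            continue
        totals[attrs[level]] += fact["sales"]  # data grouping and labeling
    return dict(totals)


print(total_by("quarter", 2024))
```

Switching `level` from `"quarter"` to `"month"` changes the grain of the answer without touching the fact rows, which is what the hierarchy in the dimension is for.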
    • Defining Some Key Terms (continued)
      • Conformed Dimension
        • Common set of data structures/attributes
        • Can cut across many facts, but…
        • The row headers in an answer must be able to match exactly, or…
        • Can be an exact subset
      • These definitions will become clearer as we look at some examples.
    • Star Schema
      • Simplest form of dimensional modeling
      • Consists of dimension table(s) modeled around a fact table
      • Optimized for querying large data sets
    • Star Schema (Logical)
      • [Diagram: a central Fact Table (Keys, Facts) surrounded by four Dimension Tables: Patient Demographics, Date/Time, Referring Physician, Insurance Carrier]
    • Star Schema – Talking Points for Next Diagram
      • Note: Have the original table schema as a point of reference.
      • Discuss aggregation from source table to fact table, rolling up totals (how this needed to be done).
      • Discuss the notion of rolling up fact tables to create other fact tables (use the account type, financial class, and service code columns in the fact table as the basis for discussion).
      • Discuss some of the pitfalls of dimension tables, using the physician dimension as an example (for example, physicians can change jobs).
      • Discuss the Date Dimension from the perspective of the data in the table, which transitions us to a key point: much as one needs to resolve foreign keys in reporting, the dimension table is a table form of the same concept.
      • Additionally, if one has well-defined master data, then the dimension tables can be populated using a columnar subset of the source master data table.
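
As one way to picture "the data in the table" for a date dimension, the sketch below generates one row per calendar day. The WEEK, MONTH, QUARTER, and YEAR column names come from the slides; the `DATE_KEY` column, the function name `build_date_dimension`, and the choice of ISO week numbering are assumptions made for illustration.

```python
from datetime import date, timedelta


def build_date_dimension(start, end):
    """Return one dimension row per day from start to end, inclusive."""
    rows = []
    day = start
    while day <= end:
        rows.append({
            "DATE_KEY": day.isoformat(),          # assumed key; not named in the slides
            "WEEK": day.isocalendar()[1],         # ISO week number (an assumption)
            "MONTH": day.month,
            "QUARTER": (day.month - 1) // 3 + 1,  # months 1-3 -> 1, 4-6 -> 2, ...
            "YEAR": day.year,
        })
        day += timedelta(days=1)
    return rows


date_rows = build_date_dimension(date(2024, 1, 1), date(2024, 12, 31))
print(len(date_rows), date_rows[0])
```

Note that ISO weeks near a year boundary can belong to the neighboring year, so a real date dimension would need an explicit rule for those days.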
    • Star Schema (Diagram)
      • Fact Table: Acct Fin Rollup
        • ACCT_NUM, ACCT_PTPTR, ACCT_GUARANTOR_ID, ACCT_REFERRING_MD, ACCT_START_DATE, ACCT_END_DATE, PLAN_SEQ1, ACCT_TYPE, FC, HOSPITAL_SERVICE_CODE, TOT_TOTAL_CHARGES, TOT_TOTAL_PAYMENTS, TOT_TOTAL_ADJUSTMENTS, TOT_BALANCE
      • Dimension Table: Date
        • WEEK, MONTH, QUARTER, YEAR
      • Dimension Table: Patient
        • ACCT_PTPTR, PATIENT_NAME, CITY, STATE, ZIP
      • Dimension Table: Insurance Plan/Carrier
        • PLAN_SEQ1, PLAN_NAME, CARRIER, CITY, STATE, ZIP
      • Dimension Table: Referring Physician
        • ACCT_REFERRING_MD, PHYSICIAN_NAME, AFFILIATION, AFFILIATION_CITY, AFFILIATION_STATE, AFFILIATION_ZIP
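
The following is a minimal sketch of how a star schema like this one might be queried, using Python's built-in `sqlite3` module. The column names are taken from the diagram, but the table names (`acct_fin_rollup`, `patient_dim`, `referring_physician_dim`), the reduced column set, and all sample rows are assumptions for illustration only. The second query shows one fact table being rolled up by account type into a coarser set of totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient_dim (
    ACCT_PTPTR   INTEGER PRIMARY KEY,
    PATIENT_NAME TEXT, CITY TEXT, STATE TEXT, ZIP TEXT
);
CREATE TABLE referring_physician_dim (
    ACCT_REFERRING_MD INTEGER PRIMARY KEY,
    PHYSICIAN_NAME TEXT, AFFILIATION TEXT
);
-- Fact table: foreign keys into each dimension plus the measures.
CREATE TABLE acct_fin_rollup (
    ACCT_NUM           INTEGER PRIMARY KEY,
    ACCT_PTPTR         INTEGER REFERENCES patient_dim (ACCT_PTPTR),
    ACCT_REFERRING_MD  INTEGER REFERENCES referring_physician_dim (ACCT_REFERRING_MD),
    ACCT_TYPE          TEXT,
    TOT_TOTAL_CHARGES  REAL,
    TOT_TOTAL_PAYMENTS REAL
);
-- Made-up sample rows.
INSERT INTO patient_dim VALUES
    (1, 'Patient A', 'Salem', 'OR', '97301'),
    (2, 'Patient B', 'Boise', 'ID', '83702');
INSERT INTO referring_physician_dim VALUES
    (10, 'Dr. X', 'Clinic North'),
    (11, 'Dr. Y', 'Clinic South');
INSERT INTO acct_fin_rollup VALUES
    (100, 1, 10, 'IP', 500.0, 200.0),
    (101, 1, 11, 'OP', 150.0, 150.0),
    (102, 2, 10, 'OP', 300.0, 100.0);
""")

# Star join: filter on a fact column, group and label by dimension attributes.
charges = conn.execute("""
    SELECT p.STATE, d.PHYSICIAN_NAME, SUM(f.TOT_TOTAL_CHARGES) AS charges
    FROM acct_fin_rollup AS f
    JOIN patient_dim AS p ON p.ACCT_PTPTR = f.ACCT_PTPTR
    JOIN referring_physician_dim AS d ON d.ACCT_REFERRING_MD = f.ACCT_REFERRING_MD
    WHERE f.ACCT_TYPE = 'OP'
    GROUP BY p.STATE, d.PHYSICIAN_NAME
    ORDER BY p.STATE, d.PHYSICIAN_NAME
""").fetchall()

# Rolling the fact table up by account type to produce coarser totals.
rollup = conn.execute("""
    SELECT ACCT_TYPE, SUM(TOT_TOTAL_CHARGES), SUM(TOT_TOTAL_PAYMENTS)
    FROM acct_fin_rollup
    GROUP BY ACCT_TYPE
    ORDER BY ACCT_TYPE
""").fetchall()

print(charges)
print(rollup)
```

Every join in the first query goes directly from the fact table to a dimension table, which is the defining shape of a star schema.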
    • Snowflake Schema
      • Think of a star schema in which the dimension tables are normalized
      • Can be used to segregate rows in dimension tables that have a high percentage of null data (for faster lookup, since some database engines do not index NULL values)
    • Snowflake Schema (Diagram)
      • Fact Table
        • product_key, Units, Cost Per Unit
      • Dimension Table (product)
        • product_key, supplier_key, Product Info
      • Dimension Table (supplier)
        • supplier_key, Supplier Info
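
A minimal sketch of the snowflake shape in the diagram, again using Python's built-in `sqlite3` module. The key and measure names follow the diagram; the table names (`sales_fact`, `product_dim`, `supplier_dim`) and the sample rows are hypothetical. The difference from a star schema is that supplier attributes are reached through the product dimension rather than directly from the fact table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Outer (normalized) dimension: supplier attributes live in their own table.
CREATE TABLE supplier_dim (
    supplier_key  INTEGER PRIMARY KEY,
    supplier_info TEXT
);
-- Product dimension points at the supplier dimension instead of repeating it.
CREATE TABLE product_dim (
    product_key  INTEGER PRIMARY KEY,
    supplier_key INTEGER REFERENCES supplier_dim (supplier_key),
    product_info TEXT
);
CREATE TABLE sales_fact (
    product_key   INTEGER REFERENCES product_dim (product_key),
    units         INTEGER,
    cost_per_unit REAL
);
-- Made-up sample rows.
INSERT INTO supplier_dim VALUES (1, 'Supplier One'), (2, 'Supplier Two');
INSERT INTO product_dim VALUES
    (10, 1, 'Gauze'), (11, 1, 'Syringe'), (12, 2, 'Gloves');
INSERT INTO sales_fact VALUES
    (10, 4, 2.5), (11, 10, 1.0), (12, 3, 5.0), (10, 2, 2.5);
""")

# Two hops: fact -> product dimension -> supplier dimension.
cost_by_supplier = conn.execute("""
    SELECT s.supplier_info, SUM(f.units * f.cost_per_unit) AS total_cost
    FROM sales_fact AS f
    JOIN product_dim AS p ON p.product_key = f.product_key
    JOIN supplier_dim AS s ON s.supplier_key = p.supplier_key
    GROUP BY s.supplier_info
    ORDER BY s.supplier_info
""").fetchall()

print(cost_by_supplier)
```

The extra join is the usual trade-off of normalizing a dimension: less repeated data in the product table, one more hop at query time.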
    • Conformed Dimension
      • A conformed dimension is a set of data attributes that have been physically implemented in multiple tables using the same structure.
      • A conformed dimension can be applied to different fact tables. For example:
        • Dimension Table: Patient Demographics (Gender, Age)
        • Fact Table: Hypertension Studies
        • Fact Table: Lab Results
        • Fact Table: Diabetes Assessment
      • Note: The classic example for a conformed dimension is date. I wanted to offer a different example.
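
To illustrate why the row headers must match, here is a small sketch in plain Python in which two fact tables share one patient demographics dimension. The keys, attribute values, and fact rows are all hypothetical. Because both fact tables are labeled through the same dimension, their counts line up on identical gender headers and can be placed side by side.

```python
from collections import Counter

# Conformed dimension shared by both fact tables (hypothetical rows).
patient_demographics = {
    1: {"gender": "F", "age": 52},
    2: {"gender": "M", "age": 47},
    3: {"gender": "F", "age": 68},
}

# Two separate fact tables, each keyed by the same patient_key.
hypertension_studies = [
    {"patient_key": 1}, {"patient_key": 2},
    {"patient_key": 3}, {"patient_key": 1},
]
diabetes_assessments = [{"patient_key": 2}, {"patient_key": 3}]


def count_by(facts, attribute):
    """Count fact rows per value of one attribute of the shared dimension."""
    counts = Counter()
    for fact in facts:
        counts[patient_demographics[fact["patient_key"]][attribute]] += 1
    return counts


hyper = count_by(hypertension_studies, "gender")
diab = count_by(diabetes_assessments, "gender")

# Combine the two answers on their shared row headers.
report = {
    label: (hyper[label], diab[label])
    for label in sorted(set(hyper) | set(diab))
}
print(report)
```

If each fact table used its own differently coded gender attribute, the merge on `label` would silently produce mismatched rows, which is the problem a conformed dimension avoids.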
    • Transition to Next Point of Discussion
      • Star and Snowflake schemas are optimized for querying large data sets. They should support:
        • OLAP cubes
        • Business Intelligence and Analytic Applications
        • Ad hoc queries
    • The End