Data Warehousing – Dimensions | Star and Snowflake Schemas




Eric Matthews - DataWithUs
Defining Some Key Terms
 Dimension
    • Data Element
    • Categorizes each item in a data set
    • Provides Structured Labeling/Tagging
    • Dimensions can consist of hierarchies. For example, a Date
      dimension rolls up to Month, Quarter, and Year.
    • Dimension tables contain the keys (typically a primary or
      surrogate key) that fact tables reference through foreign keys.
 Dimension – Primary Role
    • Data Filtering
    • Data Grouping
    • Data Labeling

 Fact
    • A measured, counted, or aggregated event. For example:
      Sales, Admissions, Blood Pressure, and Inventory can all be
      construed as “facts”.
    • Fact tables contain the foreign keys that join to the
      dimension tables.
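
The three dimension roles listed above (filtering, grouping, labeling) can be sketched with a tiny in-memory example. This is only an illustration under assumed names: the tables `dim_date` and `fact_sales`, their columns, and all the data values are hypothetical and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (            -- hypothetical dimension table
    date_key INTEGER PRIMARY KEY,
    month    TEXT,
    quarter  TEXT,
    year     INTEGER
);
CREATE TABLE fact_sales (          -- hypothetical fact table
    date_key INTEGER REFERENCES dim_date(date_key),
    amount   REAL
);
INSERT INTO dim_date VALUES
    (1, 'Jan', 'Q1', 2023), (2, 'Feb', 'Q1', 2023),
    (3, 'Apr', 'Q2', 2023), (4, 'Jan', 'Q1', 2024);
INSERT INTO fact_sales VALUES (1, 100.0), (2, 50.0), (3, 75.0), (4, 20.0);
""")

rows = conn.execute("""
    SELECT d.quarter AS label,          -- labeling: readable row header
           SUM(f.amount) AS total       -- the fact being aggregated
    FROM fact_sales f
    JOIN dim_date d ON d.date_key = f.date_key
    WHERE d.year = 2023                 -- filtering on a dimension attribute
    GROUP BY d.quarter                  -- grouping on a dimension attribute
    ORDER BY d.quarter
""").fetchall()

print(rows)  # [('Q1', 150.0), ('Q2', 75.0)]
```

The fact table holds only the key and the measure; everything used to filter, group, or label the result comes from the dimension table.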
Defining Some Key Terms (continued)
 Conformed Dimension
    • Common set of data structures/attributes
    • Can cut across many fact tables, but…
    • The row headers in an answer must match exactly, or…
    • One must be an exact subset of the other



 These definitions will become clearer as we look at some
 examples.
Star Schema



   • Simplest (most basic) form of dimensional modeling

   • Consists of dimension table(s) modeled around a fact table

   • Optimized for querying large data sets
Star Schema – Logical Diagram
    • Fact Table (center): Keys, Facts
    • Dimension Table: Date/Time
    • Dimension Table: Patient Demographics
    • Dimension Table: Insurance Carrier
    • Dimension Table: Referring Physician
Star Schema – Talking Points for Next Diagram
Note: Have original table schema as point of reference.


  • Discuss aggregation from the source table to the fact table,
    rolling up totals (and how this needed to be done).
  • Discuss the notion of rolling up fact tables to create other
    fact tables (use account type, financial class, and service
    code columns in the fact table for basis of discussion)
  • Discuss some of the pitfalls of dimension tables by using
    the physician dimension as an example (example:
    Physicians can change jobs)
  • Discuss the Date Dimension from the perspective of the
    data in the table… which transitions us to a key point…

  …which is similar to how one needs to resolve foreign keys in
  reporting: the dimension table is a table form of the same
  concept.

  Additionally, if one has well-defined master data, then populating
  the dimension tables can be done using a columnar subset of the
  source master data table.
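
The first two talking points (rolling source rows up into a fact table, then rolling that fact table up into a coarser one) can be sketched as follows. This is a minimal illustration, not the process used for the deck: the row layout, the helper `roll_up`, and every data value are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical transaction-level source rows:
# (acct_num, acct_type, financial_class, charge, payment)
source_rows = [
    ("A1", "INPATIENT", "MEDICARE", 500.0, 200.0),
    ("A1", "INPATIENT", "MEDICARE", 250.0, 100.0),
    ("A2", "OUTPATIENT", "SELF_PAY", 80.0, 80.0),
    ("A3", "INPATIENT", "COMMERCIAL", 1000.0, 400.0),
]

def roll_up(rows, key_fn):
    """Sum the last two fields (charges, payments) of each row per key."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for row in rows:
        bucket = totals[key_fn(row)]
        bucket[0] += row[-2]
        bucket[1] += row[-1]
    return {key: tuple(value) for key, value in totals.items()}

# Step 1: source rows -> one fact row per account.
acct_fact = roll_up(source_rows, lambda r: (r[0], r[1], r[2]))

# Step 2: account-level fact rows -> a coarser fact table by account type.
fact_rows = [(*key, charges, payments)
             for key, (charges, payments) in acct_fact.items()]
by_acct_type = roll_up(fact_rows, lambda r: r[1])

print(by_acct_type)
# {'INPATIENT': (1750.0, 700.0), 'OUTPATIENT': (80.0, 80.0)}
```

The same second step could key on financial class or service code instead of account type; only the key function changes.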
Fact Table: Acct Fin Rollup
 Fact Table: Acct Fin Rollup
    • ACCT_NUM, ACCT_PTPTR, ACCT_GUARANTOR_ID, ACCT_REFERRING_MD,
      ACCT_START_DATE, ACCT_END_DATE, PLAN_SEQ1, ACCT_TYPE, FC,
      HOSPITAL_SERVICE_CODE, TOT_TOTAL_CHARGES, TOT_TOTAL_PAYMENTS,
      TOT_TOTAL_ADJUSTMENTS, TOT_BALANCE
 Dimension Table: Date
    • WEEK, YEAR, QUARTER, MONTH
 Dimension Table: Patient
    • ACCT_PTPTR, PATIENT_NAME, CITY, STATE, ZIP
 Dimension Table: Insurance Plan/Carrier
    • PLAN_SEQ1, PLAN_NAME, CARRIER, CITY, STATE, ZIP
 Dimension Table: Referring Physician
    • ACCT_REFERRING_MD, PHYSICIAN_NAME, AFFILIATION,
      AFFILIATION_CITY, AFFILIATION_STATE, AFFILIATION_ZIP
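
As a rough sketch of how this star schema might be queried, the example below builds a subset of the tables in an in-memory SQLite database and totals charges by carrier and patient state. The column names are taken from the diagram; the table names (`dim_patient`, `dim_insurance_plan`, `fact_acct_fin_rollup`), the column types, and all data values are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Column types are assumed; the slide shows names only.
CREATE TABLE dim_patient (
    ACCT_PTPTR   INTEGER PRIMARY KEY,
    PATIENT_NAME TEXT, CITY TEXT, STATE TEXT, ZIP TEXT
);
CREATE TABLE dim_insurance_plan (
    PLAN_SEQ1 INTEGER PRIMARY KEY,
    PLAN_NAME TEXT, CARRIER TEXT
);
CREATE TABLE fact_acct_fin_rollup (
    ACCT_NUM          INTEGER PRIMARY KEY,
    ACCT_PTPTR        INTEGER REFERENCES dim_patient(ACCT_PTPTR),
    PLAN_SEQ1         INTEGER REFERENCES dim_insurance_plan(PLAN_SEQ1),
    TOT_TOTAL_CHARGES REAL,
    TOT_BALANCE       REAL
);
-- Made-up sample rows.
INSERT INTO dim_patient VALUES
    (1, 'Patient A', 'Austin', 'TX', '00000'),
    (2, 'Patient B', 'Dallas', 'TX', '00000'),
    (3, 'Patient C', 'Tulsa',  'OK', '00000');
INSERT INTO dim_insurance_plan VALUES
    (10, 'Plan X', 'Carrier One'),
    (20, 'Plan Y', 'Carrier Two');
INSERT INTO fact_acct_fin_rollup VALUES
    (1001, 1, 10, 1000.0, 250.0),
    (1002, 2, 10,  500.0,   0.0),
    (1003, 3, 20,  300.0, 300.0),
    (1004, 1, 20,  200.0,  50.0);
""")

# Each dimension is exactly one join away from the fact table.
charges_by_carrier_state = conn.execute("""
    SELECT i.CARRIER, p.STATE,
           SUM(f.TOT_TOTAL_CHARGES), SUM(f.TOT_BALANCE)
    FROM fact_acct_fin_rollup f
    JOIN dim_patient p        ON p.ACCT_PTPTR = f.ACCT_PTPTR
    JOIN dim_insurance_plan i ON i.PLAN_SEQ1  = f.PLAN_SEQ1
    GROUP BY i.CARRIER, p.STATE
    ORDER BY i.CARRIER, p.STATE
""").fetchall()

for row in charges_by_carrier_state:
    print(row)
# ('Carrier One', 'TX', 1500.0, 250.0)
# ('Carrier Two', 'OK', 300.0, 300.0)
# ('Carrier Two', 'TX', 200.0, 50.0)
```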
Snowflake Schema
    • Think Star Schema where the dimension tables are
      normalized

    • Can be used to segregate rows in dimension tables that
      have a high percentage of null data (for faster lookup; some
      database engines do not index null values)
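Snowflake Schema – Diagram
    • Fact Table: product_key, Units, Cost Per Unit
    • Dimension Table (Product Info): product_key, supplier_key
    • Dimension Table (Supplier Info): supplier_key
    • The fact table joins to Product Info on product_key;
      Product Info joins to Supplier Info on supplier_key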
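
A minimal sketch of the snowflake layout in the diagram: the supplier attributes live in their own table, so a query by supplier needs two joins instead of one. The key column names follow the diagram; the table names, the `product_name` and `supplier_name` columns, and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_supplier (
    supplier_key  INTEGER PRIMARY KEY,
    supplier_name TEXT                 -- assumed attribute
);
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,                 -- assumed attribute
    supplier_key INTEGER REFERENCES dim_supplier(supplier_key)
);
CREATE TABLE fact_sales (
    product_key   INTEGER REFERENCES dim_product(product_key),
    units         INTEGER,
    cost_per_unit REAL
);
INSERT INTO dim_supplier VALUES (1, 'Supplier A'), (2, 'Supplier B');
INSERT INTO dim_product VALUES
    (100, 'Widget', 1), (101, 'Gadget', 1), (102, 'Gizmo', 2);
INSERT INTO fact_sales VALUES
    (100, 10, 2.5), (101, 4, 5.0), (102, 3, 10.0), (100, 2, 2.5);
""")

# Fact -> product -> supplier: the extra hop is the cost of normalizing.
cost_by_supplier = conn.execute("""
    SELECT s.supplier_name, SUM(f.units * f.cost_per_unit)
    FROM fact_sales f
    JOIN dim_product  p ON p.product_key  = f.product_key
    JOIN dim_supplier s ON s.supplier_key = p.supplier_key
    GROUP BY s.supplier_name
    ORDER BY s.supplier_name
""").fetchall()

print(cost_by_supplier)  # [('Supplier A', 50.0), ('Supplier B', 30.0)]
```

In a pure star schema the supplier attributes would be repeated on every product row, which removes the second join at the price of redundancy.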
Conformed Dimension
  A conformed dimension is a set of data attributes that have been
  physically implemented in multiple tables using the same structure. A
  conformed dimension can be applied to different fact tables. For
  example:

 Dimension Table: Patient Demographics (Gender, Age), shared by:
    • Fact Table: Hypertension Studies
    • Fact Table: Lab Results
    • Fact Table: Diabetes Assessment

  Note: The classic example for a conformed dimension is date. I
  wanted to offer a different example.
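
The idea can be sketched by querying two fact tables through the same patient dimension and lining the results up on matching row headers. This is an illustration under assumptions: the table names, the `patient_key` column, the measures `systolic` and `glucose`, the helper `avg_by_gender`, and all values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_patient_demographics (
    patient_key INTEGER PRIMARY KEY,
    gender      TEXT,
    age         INTEGER
);
CREATE TABLE fact_hypertension_study (
    patient_key INTEGER REFERENCES dim_patient_demographics(patient_key),
    systolic    INTEGER
);
CREATE TABLE fact_lab_result (
    patient_key INTEGER REFERENCES dim_patient_demographics(patient_key),
    glucose     INTEGER
);
INSERT INTO dim_patient_demographics VALUES (1, 'F', 54), (2, 'M', 61), (3, 'F', 47);
INSERT INTO fact_hypertension_study VALUES (1, 130), (2, 150), (3, 120);
INSERT INTO fact_lab_result VALUES (1, 90), (2, 110), (2, 130), (3, 100);
""")

def avg_by_gender(fact_table, measure):
    """Average one measure per gender, using the shared dimension."""
    sql = f"""
        SELECT d.gender, AVG(f.{measure})
        FROM {fact_table} f
        JOIN dim_patient_demographics d ON d.patient_key = f.patient_key
        GROUP BY d.gender
    """
    return dict(conn.execute(sql).fetchall())

blood_pressure = avg_by_gender("fact_hypertension_study", "systolic")
glucose = avg_by_gender("fact_lab_result", "glucose")

# Both answers share the same row headers, so they merge cleanly.
report = {g: (blood_pressure[g], glucose[g])
          for g in sorted(blood_pressure.keys() & glucose.keys())}

print(report)  # {'F': (125.0, 95.0), 'M': (150.0, 120.0)}
```

Because both fact tables use the same dimension with the same gender values, the two result sets can be combined without any mapping step.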
Transition to Next Point of Discussion

  Star and Snowflake schemas are optimized for
  querying large data sets.

  They should support:
      • OLAP cubes
      • Business Intelligence and Analytic Applications
      • Ad hoc queries
The End

Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 

Warehousing dimension star-snowflake_schemas

  • 1. Data Warehousing – Dimensions | Star and Snowflake Schemas Eric Matthews - DataWithUs
  • 2. Defining Some Key Terms
      Dimension
        • Data element
        • Categorizes each item in a data set
        • Provides structured labeling/tagging
        • Dimensions can consist of hierarchies. For example: Date | Month, Quarter, Year
        • Dimension tables contain the keys that fact tables reference in joins.
      Dimension – Primary Role
        • Data filtering
        • Data grouping
        • Data labeling
      Fact
        • A measured, counted, or aggregated event. For example: Sales, Admissions, Blood Pressure, and Inventory can all be construed as “facts”
        • Fact tables contain the foreign keys used to join to dimension tables
  • 3. Defining Some Key Terms (continued)
      Conformed Dimension
        • Common set of data structures/attributes
        • Can cut across many facts, but…
        • The row headers in an answer must be able to exactly match, or…
        • Can be an exact subset
      These definitions will become clearer as we look at some examples.
  • 4. Star Schema
        • Most atomic (simplest) form of dimensional modeling
        • Consists of dimension table(s) modeled around a fact table
        • Optimized for querying large data sets
  • 5. Star Schema (Logical)
      [Diagram: a central Fact Table (Keys, Facts) surrounded by four Dimension Tables: Date/Time, Patient Demographics, Referring Physician, Insurance Carrier]
  • 6. Star Schema – Talking Points for Next Diagram
      Note: Have original table schema as point of reference.
        • Discuss aggregation from the source table to the fact table, rolling up totals (how this needed to be done).
        • Discuss the notion of rolling up fact tables to create other fact tables (use the account type, financial class, and service code columns in the fact table as the basis of discussion).
        • Discuss some of the pitfalls of dimension tables, using the physician dimension as an example (example: physicians can change jobs).
        • Discuss the Date Dimension from the perspective of the data in the table… which transitions us to a key point…
      …which is similar to how one needs to resolve foreign keys in reporting: the dimension table is a table form of the same concept. Additionally, if one has well-defined master data, then populating the dimension tables can be done using a columnar subset of the source master data table.
  • 7. Star Schema (Physical)
      Fact Table: Acct Fin Rollup
        • ACCT_NUM, ACCT_PTPTR, ACCT_GUARANTOR_ID, ACCT_REFERRING_MD, ACCT_START_DATE, ACCT_END_DATE, PLAN_SEQ1, ACCT_TYPE, FC, HOSPITAL_SERVICE_CODE, TOT_TOTAL_CHARGES, TOT_TOTAL_PAYMENTS, TOT_TOTAL_ADJUSTMENTS, TOT_BALANCE
      Dimension Table: Date
        • WEEK, YEAR, QUARTER, MONTH
      Dimension Table: Patient
        • ACCT_PTPTR, PATIENT_NAME, CITY, STATE, ZIP
      Dimension Table: Insurance Plan/Carrier
        • PLAN_SEQ1, PLAN_NAME, CARRIER, CITY, STATE, ZIP
      Dimension Table: Referring Physician
        • ACCT_REFERRING_MD, PHYSICIAN_NAME, AFFILIATION, AFFILIATION_CITY, AFFILIATION_STATE, AFFILIATION_ZIP
  • 8. Snowflake Schema
        • Think Star Schema where the dimension tables are normalized
        • Can be used to segregate rows in dimension tables that have a high percentage of null data (for faster lookup; some database engines do not index NULL values)
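  • 9. Snowflake Schema
      Fact Table
        • product_key, Units, Cost Per Unit
      Dimension Table (joined to the fact table on product_key)
        • product_key, supplier_key, Product Info
      Dimension Table (joined to the dimension above on supplier_key)
        • supplier_key, Supplier Info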
  • 10. Conformed Dimension
      A conformed dimension is a set of data attributes that have been physically implemented in multiple tables using the same structure. A conformed dimension can be applied to different fact tables. For example:
        • Dimension Table: Patient Demographics (Gender, Age)
        • Fact Table: Hypertension Studies
        • Fact Table: Lab Results
        • Fact Table: Diabetes Assessment
      Note: The classic example for a conformed dimension is date. I wanted to offer a different example.
  • 11. Transition to Next Point of Discussion
      Star and Snowflake schemas are optimized for querying large data sets. They should support:
        • OLAP cubes
        • Business Intelligence and analytic applications
        • Ad hoc queries
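
As a rough illustration of the star schema on slide 7, the sketch below builds a cut-down version of the Acct Fin Rollup fact table and the Insurance Plan/Carrier dimension in an in-memory SQLite database, then groups fact totals by a dimension attribute. The column names come from the slide; the table names (fact_acct_fin_rollup, dim_plan), the function names, the column types, and all sample rows are assumptions made up for this example, not data from the deck.

```python
import sqlite3


def build_star(conn):
    """Create a cut-down star schema; table names and rows are hypothetical."""
    conn.executescript("""
        CREATE TABLE dim_plan (
            PLAN_SEQ1 INTEGER PRIMARY KEY,   -- key referenced by the fact table
            PLAN_NAME TEXT,
            CARRIER   TEXT
        );
        CREATE TABLE fact_acct_fin_rollup (
            ACCT_NUM           INTEGER PRIMARY KEY,
            PLAN_SEQ1          INTEGER REFERENCES dim_plan (PLAN_SEQ1),
            TOT_TOTAL_CHARGES  REAL,
            TOT_TOTAL_PAYMENTS REAL
        );
    """)
    # Invented sample rows, for illustration only.
    conn.executemany(
        "INSERT INTO dim_plan VALUES (?, ?, ?)",
        [(1, "Plan A", "Carrier X"), (2, "Plan B", "Carrier Y")],
    )
    conn.executemany(
        "INSERT INTO fact_acct_fin_rollup VALUES (?, ?, ?, ?)",
        [
            (1001, 1, 500.0, 200.0),
            (1002, 2, 300.0, 300.0),
            (1003, 1, 250.0, 50.0),
        ],
    )


def charges_by_carrier(conn):
    """Join the fact table to the dimension and group by a dimension label."""
    return conn.execute("""
        SELECT p.CARRIER,
               SUM(f.TOT_TOTAL_CHARGES),
               SUM(f.TOT_TOTAL_PAYMENTS)
        FROM fact_acct_fin_rollup AS f
        JOIN dim_plan AS p ON p.PLAN_SEQ1 = f.PLAN_SEQ1
        GROUP BY p.CARRIER
        ORDER BY p.CARRIER
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_star(conn)
for carrier, charges, payments in charges_by_carrier(conn):
    print(carrier, charges, payments)
```

The query shows the three dimension roles from slide 2 in one place: the join resolves the key, GROUP BY uses the dimension for grouping, and CARRIER supplies the label for each row of totals. Adding a WHERE clause on a dimension column would cover filtering.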