SOFTWARE/WEB/MOBILE/DATABASE ARCHITECT, ENGINEER, AND DEVELOPER
TORONTO, CANADA
HTTP://SAYED.JUSTETC.NET
HTTP://WWW.JUSTETC.NET
Sayed Ahmed
Logical Design of a Data Warehouse
OUR SERVICES
• Free Training and Educational Services
• Training and Education in Bangla:
  • Bangla.SaLearningSchool.com
• Training and Education in English:
  • www.SaLearningSchool.com
  • English.SaLearningSchool.com
  • http://sitestree.com
• Ask a question and get answers:
  • Ask.JustEtc.net
DESIGNING DIMENSIONS
• Dimension Field/Column Types
• When designing dimension tables, define the following types of columns/fields to support reporting and analysis:
  • Keys: used to identify entities
  • Name columns: used for human-readable names of entities
  • Attributes: used for pivoting in analyses
  • Member properties: used for labels in a report
  • Lineage columns: used for auditing, and never exposed to end users
DESIGNING DIMENSIONS
• Design your dimensions with analysis in mind
• Keep reporting in mind as well
• For analysis, we use
  • Pivot tables
  • Pivot graphs
• In dimensions
  • The fields used for pivoting are called attributes
  • Not all columns in a dimension are attributes
  • In OLTP tables, by contrast, all columns are called attributes
• Attributes: the fields on which analysis is based
• The previous slide listed the different types of columns in a dimension table
DIMENSION ATTRIBUTES
• Attributes
  • For pivoting, discrete attributes with a small number of distinct values are the most appropriate
  • Attribute values should not be continuous
  • Keys are not good candidates for pivoting and analysis, so they make poor attributes
• To use a continuous column for pivoting
  • Convert it to a small set of discrete values
ON DIMENSION ATTRIBUTES
• SQL Server Analysis Services (SSAS) can discretize continuous columns into discrete attributes
• The automated process does not always give good results
  • You need to take the business perspective into account as well
  • For example, when age is used for pivoting, a one-year difference can be significant at young ages
  • It may not matter at age 60 (this also depends on the business perspective)
• Age and income are not good candidates for automatic discretization
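A minimal sketch of the difference between business-defined bands and a naive automatic split. The band boundaries, labels, and function names are illustrative assumptions, and the equal-width function is a generic stand-in, not the method SSAS uses.

```python
from bisect import bisect_right

# Hypothetical age bands chosen from a business perspective:
# narrow bands at young ages, wide bands later.
AGE_BOUNDS = [18, 21, 25, 30, 40, 60]
AGE_LABELS = ["<18", "18-20", "21-24", "25-29", "30-39", "40-59", "60+"]

def age_band(age: int) -> str:
    """Map a continuous age to one of a small set of discrete labels."""
    return AGE_LABELS[bisect_right(AGE_BOUNDS, age)]

def equal_width_bucket(value: float, low: float, high: float, buckets: int) -> int:
    """Naive automatic discretization: equal-width buckets numbered 0..buckets-1."""
    width = (high - low) / buckets
    return min(int((value - low) / width), buckets - 1)

for age in (19, 22, 38, 61, 83):
    print(age, age_band(age), equal_width_bucket(age, 0, 100, 5))
```

With five equal-width buckets over 0 to 100, ages 22 and 38 fall into the same bucket, while the hand-picked bands keep them apart.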
NAMING COLUMNS, AND MEMBER PROPERTIES
• Name columns (another dimension column type) identify the entity
  • Not good for pivoting or as keys
  • Examples: address, city, or phone numbers
• Member properties
  • Columns used in reports as labels only, not for pivoting, are called member properties
  • Name columns and member properties can include translations
LINEAGE AND AUDITING
• Lineage and auditing columns
  • Used for auditing data
  • Never exposed to end users
AUDITING AND LINEAGE
• In a data warehouse, you may want some auditing tables
• For every update, you should audit
  • who made the update,
  • when it was made,
  • and how many rows were transferred to each dimension and fact table in your data warehouse
AUDITING AND LINEAGE
• You will need additional columns in your dimension and fact tables to track when, by whom, and from where each row was updated
• Your ETL process needs to be updated to fill these columns
  • If you use SSIS for the ETL, modify the SSIS packages so that they record this information
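The idea of lineage columns plus an audit table can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the table names, column names (LoadedAt, LoadedBy, SourceSystem, EtlAudit), and the load function are hypothetical, and a real ETL would do this inside SSIS or a similar tool.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimCustomer (
    CustomerKey   INTEGER PRIMARY KEY,
    CustomerName  TEXT,
    -- lineage columns (hypothetical names), hidden from end users
    LoadedAt      TEXT,
    LoadedBy      TEXT,
    SourceSystem  TEXT
);
CREATE TABLE EtlAudit (
    AuditKey    INTEGER PRIMARY KEY AUTOINCREMENT,
    TableName   TEXT,
    RunBy       TEXT,
    RunAt       TEXT,
    RowsLoaded  INTEGER
);
""")

def load_customers(rows, run_by, source):
    """Insert rows with lineage values and record one audit row for the load."""
    run_at = datetime.now(timezone.utc).isoformat()
    conn.executemany(
        "INSERT INTO DimCustomer VALUES (?, ?, ?, ?, ?)",
        [(key, name, run_at, run_by, source) for key, name in rows],
    )
    conn.execute(
        "INSERT INTO EtlAudit (TableName, RunBy, RunAt, RowsLoaded) VALUES (?, ?, ?, ?)",
        ("DimCustomer", run_by, run_at, len(rows)),
    )
    conn.commit()

load_customers([(1, "Ann"), (2, "Bob")], run_by="etl_service", source="CRM")
audit = conn.execute("SELECT TableName, RunBy, RowsLoaded FROM EtlAudit").fetchall()
print(audit)
```

The audit row answers who ran the load, when, and how many rows were transferred; the lineage columns answer the same questions per row.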
CUSTOMER DIMENSION TABLE (PARTIAL)
From the AdventureWorksDW2012 database
POSSIBLE ATTRIBUTES FOR CUSTOMER DIMENSION
• BirthDate (after calculating age and discretizing the age)
• MaritalStatus
• Gender
• YearlyIncome (after discretizing)
• TotalChildren
• NumberChildrenAtHome
• EnglishEducation (other education columns are for translations)
• EnglishOccupation (other occupation columns are for translations)
• HouseOwnerFlag
• NumberCarsOwned
• CommuteDistance
DATE DIMENSION IN ADVENTUREWORKSDW
DATE DIMENSION ATTRIBUTES
• FullDateAlternateKey (denotes a date in date format)
• EnglishMonthName
• CalendarQuarter
• CalendarSemester
• CalendarYear
• Drill-down attributes
  • CalendarYear → CalendarSemester → CalendarQuarter → EnglishMonthName → FullDateAlternateKey
  • When attributes form a drill-down hierarchy, reports usually show the leaf nodes
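The drill-down path above can be sketched as a roll-up at different depths of the hierarchy. The function names and the sample facts are made up for illustration; only the level names come from the slide.

```python
from collections import defaultdict
from datetime import date

MONTH_NAMES = ["January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December"]

def date_levels(d: date) -> tuple:
    """Return (CalendarYear, CalendarSemester, CalendarQuarter, EnglishMonthName, date)."""
    semester = (d.month - 1) // 6 + 1
    quarter = (d.month - 1) // 3 + 1
    return (d.year, semester, quarter, MONTH_NAMES[d.month - 1], d)

def roll_up(facts, depth):
    """Sum amounts grouped by the first `depth` levels of the hierarchy."""
    totals = defaultdict(float)
    for d, amount in facts:
        totals[date_levels(d)[:depth]] += amount
    return dict(totals)

# Hypothetical (date, sales amount) facts.
facts = [(date(2012, 1, 15), 100.0), (date(2012, 2, 3), 50.0), (date(2012, 8, 9), 25.0)]

print(roll_up(facts, 1))  # by year
print(roll_up(facts, 2))  # drill down to semester
print(roll_up(facts, 4))  # drill down to month
```

Each extra level of depth splits the totals of the level above, which is what a user sees when drilling down in a pivot table.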
DRILL DOWN HIERARCHIES
• As noted earlier, dimension columns used in reports only as labels are called member properties
• In a Snowflake schema
  • lookup tables show you the levels of hierarchies
• In a Star schema
  • you need to extract natural hierarchies from the names and content of columns
• Because drilling down through natural hierarchies is so useful and welcomed by end users, you should use them as much as possible
SLOWLY CHANGING DIMENSIONS
• Related to auditing: keeping track of historical data
• Data changes over time, for example
  • Someone moves to a different city
  • Someone's job title changes
• Three approaches for handling such changes
  • Type 1: history is lost
  • Type 2: all history is kept
  • Type 3: partial history is kept
• You can use a combination
  • Type 1 for some columns, Type 2 for others
TYPE 1
When information changes, you simply update it in place. The previous information is lost.
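A minimal sketch of a Type 1 change, using an in-memory dictionary in place of a dimension table. The customer, the column names, and the helper function are hypothetical.

```python
# Dimension rows keyed by business key; all values are made up for illustration.
dim_customer = {101: {"Name": "Ann", "City": "Toronto"}}

def apply_type1(dim, business_key, changes):
    """Overwrite the changed columns in place; the old values are not kept."""
    dim[business_key].update(changes)

apply_type1(dim_customer, 101, {"City": "Ottawa"})
print(dim_customer[101])
```

After the update there is no trace that the customer ever lived in Toronto, which is exactly the trade-off of Type 1.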
TYPE 2 SCD
Here you keep track of all changes. In the example below, to keep track of Occupat
You insert new rows and mark the current row with a current-flag column.
You need to make sure that primary key constraints do not fail when the same entity appears in several rows
(you can use a second kind of key, called a surrogate key).
You can use valid-from and valid-to date columns to keep track of when each version applied.
Within the same dimension, you can use Type 1 for some columns and Type 2 for others.
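A Type 2 change can be sketched as follows, under stated assumptions: the dimension is a plain Python list, and the column names (CustomerKey as surrogate key, CustomerId as business key, ValidFrom, ValidTo, IsCurrent, Occupation) are hypothetical choices for illustration.

```python
from datetime import date
from itertools import count

# CustomerKey is the surrogate key; CustomerId is the business key from the source.
_surrogate_keys = count(1)
TRACKING = ("CustomerKey", "CustomerId", "ValidFrom", "ValidTo", "IsCurrent")

def insert_version(dim, customer_id, attrs, valid_from):
    """Append a new current row with a fresh surrogate key."""
    dim.append({
        "CustomerKey": next(_surrogate_keys),
        "CustomerId": customer_id,
        **attrs,
        "ValidFrom": valid_from,
        "ValidTo": None,
        "IsCurrent": True,
    })

def apply_type2(dim, customer_id, changes, change_date):
    """Close the current row and insert a new row carrying the changes."""
    current = next(r for r in dim if r["CustomerId"] == customer_id and r["IsCurrent"])
    current["ValidTo"] = change_date
    current["IsCurrent"] = False
    attrs = {k: v for k, v in current.items() if k not in TRACKING}
    attrs.update(changes)
    insert_version(dim, customer_id, attrs, change_date)

dim_customer = []
insert_version(dim_customer, 101, {"Occupation": "Clerical"}, date(2010, 1, 1))
apply_type2(dim_customer, 101, {"Occupation": "Manager"}, date(2012, 6, 1))
for row in dim_customer:
    print(row)
```

Both rows share the business key 101 but have different surrogate keys, so the primary key constraint holds while the full history stays queryable.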
MIXED TYPE 1 AND TYPE 2
TYPE 3
Only partial history is kept. In the example, only the previous city is retained.
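A minimal sketch of a Type 3 change, assuming a hypothetical PreviousCity column next to City. The row and helper function are illustrative only.

```python
def apply_type3(row, column, new_value):
    """Keep one previous value in a 'Previous<column>' field, then overwrite."""
    row["Previous" + column] = row[column]
    row[column] = new_value

customer = {"Name": "Ann", "City": "Toronto", "PreviousCity": None}
apply_type3(customer, "City", "Ottawa")
apply_type3(customer, "City", "Montreal")
print(customer)
```

After the second move only Ottawa survives as the previous city; Toronto is gone, which is why this type keeps partial history.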
THANK YOU FOR BEING WITH US
• That's the end of Dimension Table Design
• I may follow up with a training video on it
• Some slides on Fact Table Design follow this slide
• I will make a separate presentation on that topic
FACT TABLE DESIGN
• Fact Table Design Topics
  • Define fact table column types.
  • Understand the additivity of a measure.
  • Handle many-to-many relationships in a Star schema.
FACT TABLE COLUMN TYPES
• Foreign keys
• Measures
• Lineage columns (optional)
• Business key columns from the primary source table (optional)
• Surrogate keys
FACT TABLE COLUMNS
• Measure column type
  • Measure columns hold the measurements that matter for a specific business process
  • Measure columns are usually numeric and can be aggregated
  • They store values that are of interest to the business, such as sales amount, order quantity, and discount amount
FACT TABLE COLUMNS
• Foreign key column type
  • These columns reference the keys of the dimension tables
DESIGNING FACT TABLES
• Fact tables include measures, foreign keys, and possibly an additional primary key and lineage columns.
• Measures can be additive, non-additive, or semi-additive.
• For many-to-many relationships, you can introduce an additional intermediate dimension.
• Surrogate key
  • Usually comes from the primary dimension table for the current fact table
  • Usually one or two columns in a fact table are surrogate keys
SURROGATE KEYS FOR FACT TABLES
OrderId and LineItemId are the surrogate keys; they come from the primary source table, Order Details.
The OrderId and LineItemId columns help with quick comparisons against the source data.
Surrogate keys are not required in fact tables; however, they help.
Must read: http://www.kimballgroup.com/2006/07/design-tip-81-fact-table-surrogate-key/
LINEAGE COLUMNS IN FACT TABLES
• Lineage columns
  • Just as with dimension tables, these are strictly for auditing purposes.
• References:
  • https://upsearch.com/implementing-a-data-warehouse-fact-tables/
ADDITIVITY OF MEASURES
• The primary purpose of a data warehouse is reporting and forecasting (and analysis in some cases)
• Many reports are aggregations such as sums or averages
  • Example: sales by quarter, by region, by product type
• Hence, fact tables have columns that support measurement and aggregation for reporting
  • These are the measure columns discussed before
• The measures you include determine which calculations and reports you can produce
TYPES OF ADDITIVITY OF MEASURES
• Types of additivity of measures
  • Additive measures
  • Semi-additive measures
  • Non-additive measures
• Additive
  • If a measure can be summed across all dimensions, it's referred to as an additive measure.
• Semi-additive
  • Sometimes we can sum a measure across all dimensions except time; account balance is an example.
  • We can't sum the account balance across the time dimension. We would need to take the average instead, or simply use the last value. Measures like this are called semi-additive measures.
• Non-additive
  • Finally, some measures can't ever be summed. These are called non-additive measures, and include measures like discount percentages and prices.
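A small worked example of additive versus non-additive measures. The sales rows are invented for illustration; quantity and sales amount are treated as additive, unit price as non-additive.

```python
# Hypothetical fact rows: (region, quantity, unit_price).
sales = [
    ("East", 2, 10.0),
    ("East", 1, 30.0),
    ("West", 4, 5.0),
]

# Additive: quantities and amounts can be summed across any dimension.
total_quantity = sum(qty for _, qty, _ in sales)
sales_amount = sum(qty * price for _, qty, price in sales)

# Non-additive: summing unit prices produces a number with no business meaning.
sum_of_prices = sum(price for _, _, price in sales)

# A meaningful aggregate for price is derived from additive measures instead.
average_price = sales_amount / total_quantity

print(total_quantity, sales_amount, sum_of_prices, average_price)
```

The sum of prices (45.0) describes nothing in the business, while the average price is recovered by dividing one additive measure by another.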
ADDITIVITY OF MEASURES IN SSAS
• SSAS has support for semi-additive and non-additive measures
• The SSAS database model is called the Business Intelligence Semantic Model (BISM). Compared to the SQL Server database model, BISM includes much additional metadata.
• SSAS has two types of storage: dimensional and tabular.
  • Tabular storage is quicker to develop, because it works through tables like a data warehouse does.
  • The dimensional model more properly represents a cube.
  • However, the dimensional model includes even more metadata than the tabular model.
• In BISM dimensional processing, SSAS offers semi-additive aggregate functions out of the box.
  • For example, SSAS offers the LastNonEmpty aggregate function, which uses the SUM aggregate function across all dimensions but time, and defines the last known value as the aggregate over time.
• In the BISM tabular model, you use the Data Analysis Expressions (DAX) language. DAX includes functions that let you build semi-additive expressions quite quickly as well.
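The behaviour described for LastNonEmpty can be imitated in plain Python. This is a simplified illustration of the idea (sum across accounts, last known value over time), not SSAS's implementation and not DAX; the account data and function names are assumptions.

```python
# Hypothetical month-end balances per account, in time order.
# None means no balance was loaded for that month.
balances = {
    "A": [100.0, 120.0, None],
    "B": [50.0, 60.0, None],
}

def period_totals(series_by_account):
    """Sum across accounts for each period; None if no account has a value."""
    periods = len(next(iter(series_by_account.values())))
    totals = []
    for i in range(periods):
        present = [s[i] for s in series_by_account.values() if s[i] is not None]
        totals.append(sum(present) if present else None)
    return totals

def last_non_empty(values):
    """Return the most recent value that is not None, or None if all are empty."""
    for value in reversed(values):
        if value is not None:
            return value
    return None

totals = period_totals(balances)
print(totals)                  # summed across accounts, per month
print(last_non_empty(totals))  # aggregate over time: last known total
```

Summing every balance over all months would give 330.0, a figure no account holder ever had; taking the last non-empty period total gives 180.0 instead.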
• Fact tables
  • A collection of measurements on a specific aspect of the business
• Measure columns
  • Sales amount, order quantity, and discount amount
  • 1. SOFTWARE/WEB/MOBILE/DATABASE ARCHITECT, ENGINEER, AND DEVELOPER TORONTO, CANADA HTTP://SAYED.JUSTETC.NET HTTP://WWW.JUSTETC.NET Sayed Ahmed Logical Design of a Data Warehouse
  • 2. OUR SERVICES  Free Training and Educational Services  Training and Education in Bangla:  Bangla.SaLearningSchool.com  Training and Education in English:  www.SaLearningSchool.com  English.SaLearningSchool.com  http://sitestree.com  Ask a question and get answers:  Ask.JustEtc.net
  • 3. DESIGNING DIMENSIONS  Dimension Field/Column Types  Yes, when designing dimension tables, you need to define the following types of columns/fields to facilitate with reporting and analysis  Keys : Used to identify entities  Name columns: Used for human names of entities  Attributes: Used for pivoting in analyses  Member properties: Used for labels in a report  Lineage columns: Used for auditing, and never exposed to end users
  • 4. DESIGNING DIMENSIONS  You need to design your dimensions keeping analysis in mind  Yes, reporting need to be in your mind for sure  For analysis, we use  Pivot Table  Pivot Graph  For Dimensions  The fields used as for pivoting are called  Attributes  Not all columns in a dimension are attributes  in OLTP tables, all columns are attributes  Attributes:  The fields based on what  analysis are done  In previous slide  you saw the different types of columns in a dimension table
  • 5. DIMENSION ATTRIBUTES  Attributes  For pivoting  discrete attributes with a small number of distinct values are the most appropriate  Attribute values should not be continuous  Keys are not good candidates for pivoting and analysis; and so, not great for attributes  To make continuous column for pivoting  Convert/utilize it as a small set of discrete values
  • 6. ON DIMENSION ATTRIBUTES: SQL Server Analysis Services (SSAS) can discretize continuous columns to produce discrete attributes. The automated process is not always a good fit, because you also need to take the business perspective into account. For example, when age is used for pivoting, a one-year difference can be significant at young ages but may not matter at age 60 (depending on the business perspective). Age and income are therefore not good candidates for automatic discretization.
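A minimal Python sketch of manual, business-driven discretization of age, as opposed to automatic binning. The band edges and labels below are hypothetical and would come from the business; they are narrow at young ages and wide later, following the point made on the slide.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical business-defined band edges: narrow for young ages, wide later.
AGE_BAND_EDGES = [18, 21, 25, 30, 40, 60]
AGE_BAND_LABELS = ["<18", "18-20", "21-24", "25-29", "30-39", "40-59", "60+"]


def age_band(age: int) -> str:
    """Map a continuous age to a discrete band label suitable for pivoting."""
    if age < 0:
        raise ValueError("age must be non-negative")
    # bisect_right counts how many edges are <= age, which is the band index.
    return AGE_BAND_LABELS[bisect_right(AGE_BAND_EDGES, age)]


def customers_per_band(ages):
    """Count customers per age band, i.e. a one-column pivot."""
    return Counter(age_band(a) for a in ages)
```

For example, `customers_per_band([17, 19, 20, 45, 72])` groups the two customers aged 19 and 20 into the same "18-20" band, which is the kind of small discrete domain a pivot needs.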
  • 7. NAMING COLUMNS AND MEMBER PROPERTIES: Name columns (another dimension column type) identify the entity for people; they are not suitable for pivoting or as keys. Examples: address, city, or phone. Member properties are columns used in reports as labels only, not for pivoting. Both name columns and member properties can include translations.
  • 8. LINEAGE AND AUDITING: Lineage and auditing columns are used for auditing data and are never exposed to end users.
  • 9. AUDITING AND LINEAGE: In a data warehouse, you may want some auditing tables. For every update, you should audit who made the update, when it was made, and how many rows were transferred to each dimension and fact table in your data warehouse.
  • 10. AUDITING AND LINEAGE: You will need additional columns in your dimension and fact tables to track when, by whom, and from where each row was updated. Your ETL process needs to be updated accordingly. If you use SSIS for the ETL, modify the SSIS packages so that they record this information.
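The auditing idea from the two slides above can be sketched in plain Python (not SSIS). All column names here (`LineageLoadedBy`, `LineageLoadedAt`, `LineageSource`, `RowsTransferred`) are hypothetical. The sketch only shows the shape of the data: each loaded row is stamped with lineage columns, and one audit record per load captures who, when, and how many rows.

```python
from datetime import datetime, timezone


def load_with_lineage(rows, target_table, loaded_by, source_system, loaded_at=None):
    """Stamp each row with lineage columns and build one audit record.

    Returns (stamped_rows, audit_entry). Writing them to real tables is
    left to the ETL tool; this only illustrates what gets recorded.
    """
    loaded_at = loaded_at or datetime.now(timezone.utc)
    stamped_rows = [
        {
            **row,
            "LineageLoadedBy": loaded_by,     # who made the update
            "LineageLoadedAt": loaded_at,     # when it was made
            "LineageSource": source_system,   # where the row came from
        }
        for row in rows
    ]
    audit_entry = {
        "TableName": target_table,
        "LoadedBy": loaded_by,
        "LoadedAt": loaded_at,
        "RowsTransferred": len(stamped_rows),  # how many rows were transferred
    }
    return stamped_rows, audit_entry
```

The lineage columns live in the dimension or fact table itself, while the audit entry would go to a separate auditing table.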
  • 11. CUSTOMER DIMENSION TABLE (PARTIAL), in the AdventureWorksDW 2012 database
  • 12. POSSIBLE ATTRIBUTES FOR CUSTOMER DIMENSION: BirthDate (after calculating age and discretizing it); MaritalStatus; Gender; YearlyIncome (after discretizing); TotalChildren; NumberChildrenAtHome; EnglishEducation (the other education columns are translations); EnglishOccupation (the other occupation columns are translations); HouseOwnerFlag; NumberCarsOwned; CommuteDistance
  • 13. DATE DIMENSION IN ADVENTUREWORKSDW
  • 14. DATE DIMENSION ATTRIBUTES: FullDateAlternateKey (the date, stored in a date format); EnglishMonthName; CalendarQuarter; CalendarSemester; CalendarYear. Drill-down hierarchy: CalendarYear → CalendarSemester → CalendarQuarter → EnglishMonthName → FullDateAlternateKey. When a report exposes a drill-down attribute hierarchy, the leaf level is usually what appears in the report.
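As an illustration of drilling down through this hierarchy, here is a small Python sketch that aggregates a measure at a chosen depth of the date hierarchy. The level names come from the slide; the `SalesAmount` measure and the row layout (one dictionary per fact row, already joined to the date dimension) are assumptions made for the example.

```python
from collections import defaultdict

# Hierarchy levels from the slide, from coarsest to finest.
HIERARCHY = ["CalendarYear", "CalendarSemester", "CalendarQuarter", "EnglishMonthName"]


def roll_up(rows, depth, measure="SalesAmount"):
    """Sum `measure` grouped by the first `depth` levels of the hierarchy.

    depth=1 groups by year, depth=2 by year and semester, and so on.
    """
    levels = HIERARCHY[:depth]
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[level] for level in levels)
        totals[key] += row[measure]
    return dict(totals)
```

Increasing `depth` by one corresponds to one drill-down step in a report.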
  • 15. DRILL DOWN HIERARCHIES: As already noted, dimension columns used in reports only for labels are called member properties. In a Snowflake schema, lookup tables show you the levels of hierarchies. In a Star schema, you need to extract natural hierarchies from the names and content of columns. Because drilling down through natural hierarchies is so useful and welcomed by end users, you should use them as much as possible.
  • 16. SLOWLY CHANGING DIMENSIONS: Related to auditing: keeping track of historical data when dimension data changes over time, such as someone moving to a different city or changing job title. There are three approaches: Type 1 (history is lost), Type 2 (keeps all history), and Type 3 (keeps partial history). You can use a combination, for example Type 1 for some columns and Type 2 for others.
  • 17. TYPE 1: When information changes, you just update it in place and lose the previous information. The slide illustrates this with an example.
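A minimal sketch of a Type 1 change in Python, assuming the dimension is held as a list of dictionaries and that `CustomerId` and `City` are hypothetical column names. The old value is simply overwritten, so no history survives.

```python
def apply_type1_change(dimension_rows, customer_id, column, new_value):
    """Overwrite `column` in place for the given customer (Type 1: history lost).

    Returns the number of rows updated.
    """
    updated = 0
    for row in dimension_rows:
        if row["CustomerId"] == customer_id:
            row[column] = new_value  # the previous value is gone after this
            updated += 1
    return updated
```

After the call there is no way to tell from the dimension alone what the previous value was.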
  • 18. TYPE 2 SCD: Here you keep track of all changes. In the slide's example, to keep track of changes in Occupat[ion], you insert new rows and mark the current row with a current-flag field. You need to make sure that primary key constraints do not fail; for this you can use a second type of key called a surrogate key. You can use date-from and date-to columns to keep track of when each version was valid. Within the same dimension, you can use Type 1 for some columns and Type 2 for others.
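The Type 2 mechanics described above can be sketched in Python under some assumptions: the dimension is a list of dictionaries, `CustomerKey` is the surrogate key, `CustomerId` is the business key, and `ValidFrom`, `ValidTo`, and `IsCurrent` are hypothetical names for the date-from, date-to, and current-flag columns.

```python
from datetime import date


def apply_type2_change(dimension_rows, customer_id, new_occupation, change_date):
    """Expire the current row for the customer and append a new current row."""
    # A new surrogate key keeps the primary key unique across row versions.
    next_key = max((r["CustomerKey"] for r in dimension_rows), default=0) + 1
    for row in dimension_rows:
        if row["CustomerId"] == customer_id and row["IsCurrent"]:
            if row["Occupation"] == new_occupation:
                return dimension_rows  # nothing changed, keep the current row
            row["IsCurrent"] = False
            row["ValidTo"] = change_date
    dimension_rows.append({
        "CustomerKey": next_key,
        "CustomerId": customer_id,
        "Occupation": new_occupation,
        "ValidFrom": change_date,
        "ValidTo": None,       # open-ended: this is the current version
        "IsCurrent": True,
    })
    return dimension_rows
```

Fact rows loaded before the change keep pointing at the old `CustomerKey`, which is what preserves history in reports.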
  • 19. MIXED TYPE 1 AND TYPE 2
  • 20. TYPE 3: Partial history is kept. In the slide's example, only the previous city is kept.
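A Type 3 change can be sketched as follows, assuming hypothetical `CurrentCity` and `PreviousCity` columns on a single dimension row. Only one previous value is retained, so a second move discards the oldest city.

```python
def apply_type3_change(row, new_city):
    """Shift the current city into PreviousCity and store the new one (Type 3)."""
    if row["CurrentCity"] != new_city:
        row["PreviousCity"] = row["CurrentCity"]  # only one step of history
        row["CurrentCity"] = new_city
    return row
```

For a customer who moves from Toronto to Ottawa and then to Montreal, the row ends with `CurrentCity` Montreal and `PreviousCity` Ottawa; Toronto is no longer recorded.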
  • 21. THANK YOU FOR BEING WITH US: That’s the end of Dimension Table Design. I may come again with a training video on it. You will see some slides on Fact Table Design after this slide. I will make another presentation document on that topic.
  • 23. FACT TABLE DESIGN: Topics: define fact table column types; understand the additivity of a measure; handle many-to-many relationships in a Star schema.
  • 24. FACT TABLE COLUMN TYPES: Foreign keys; measures; lineage columns (optional); business key columns from the primary source table (optional); surrogate keys.
  • 25. FACT TABLE COLUMNS: Measure column type. Measure columns hold the measurements that matter for a specific business process. They are usually numeric and can be aggregated. They store values of interest to the business, such as sales amount, order quantity, and discount amount.
  • 26. FACT TABLE COLUMNS: Foreign key column type. These columns reference the keys of the dimension tables.
  • 27. DESIGNING FACT TABLES: Fact tables include measures, foreign keys, and possibly an additional primary key and lineage columns. Measures can be additive, non-additive, or semi-additive. For many-to-many relationships, you can introduce an additional intermediate dimension.
  • 28. Surrogate key: usually comes from the primary dimension table for the current fact table. Usually one or two columns in a fact table are surrogate keys.
  • 29. SURROGATE KEYS FOR FACT TABLES: OrderId and LineItemId are the surrogate keys, coming from the primary source order details table (keys carried over from the source are more commonly called business keys). The OrderId and LineItemId columns help with quick comparisons against source data. Surrogate keys are not a must in fact tables; however, they help. Must read: http://www.kimballgroup.com/2006/07/design-tip-81-fact-table-surrogate-key/
  • 30. LINEAGE COLUMNS IN FACT TABLES: Lineage columns: just as with dimension tables, these are strictly for auditing purposes. References: https://upsearch.com/implementing-a-data-warehouse-fact-tables/
  • 31. ADDITIVITY OF MEASURES: The primary purpose of a data warehouse is reporting and forecasting (and analysis in some cases). Reports are often aggregations such as sums or averages, for example sales by quarter, by region, or by product type. Hence, fact tables have columns that support these aggregations: the measure columns discussed before. The measures you include determine what you can aggregate and report on.
  • 32. TYPES OF ADDITIVITY OF MEASURES: Additive measures; semi-additive measures; non-additive measures.
  • 33. Additive: if a measure can be summed across all dimensions, it’s referred to as an additive measure. Semi-additive: sometimes we can sum a measure across all dimensions except time; an account balance is an example. We can’t sum the account balance across the time dimension. We would need to take the average instead, or simply use the last value. Measures like this are called semi-additive measures.
  • 34. Finally, some measures can’t ever be summed. These are called non-additive measures, and include measures like discount percentages and prices.
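The difference between the kinds of additivity can be shown with a short Python sketch. The balances and order lines below are made-up sample data. Recomputing an overall discount percentage from additive amounts is a common workaround added here for illustration; it is not something the slides prescribe.

```python
from collections import defaultdict

# Hypothetical month-end balances: (month, account, balance).
BALANCES = [
    ("2024-01", "A", 100.0),
    ("2024-01", "B", 50.0),
    ("2024-02", "A", 120.0),
    ("2024-02", "B", 40.0),
]


def balance_by_month(rows):
    """Semi-additive: summing across accounts within one month is valid."""
    totals = defaultdict(float)
    for month, _account, balance in rows:
        totals[month] += balance
    return dict(totals)


def closing_balance(rows):
    """Over time, take the last month's total instead of summing months."""
    by_month = balance_by_month(rows)
    return by_month[max(by_month)]  # ISO "YYYY-MM" strings sort chronologically


def average_balance(rows):
    """Alternative aggregate over time: the average of the monthly totals."""
    by_month = balance_by_month(rows)
    return sum(by_month.values()) / len(by_month)


def overall_discount_pct(lines):
    """Non-additive: percentages cannot be summed, so recompute the ratio
    from additive parts. `lines` holds (gross_amount, discount_amount)."""
    gross = sum(g for g, _d in lines)
    discount = sum(d for _g, d in lines)
    return discount / gross
```

With this data the monthly totals are 150 and 160; summing them to 310 would overstate the money held, while the closing balance of 160 or the average of 155 is meaningful.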
  • 35. ADDITIVITY OF MEASURES IN SSAS: SSAS has support for semi-additive and non-additive measures. The SSAS database model is called the Business Intelligence Semantic Model (BISM). Compared to the SQL Server database model, BISM includes much additional metadata. SSAS has two types of storage: dimensional and tabular. Tabular storage is quicker to develop, because it works through tables like a data warehouse does. The dimensional model more properly represents a cube. However, the dimensional model includes even more metadata than the tabular model.
  • 36. In BISM dimensional processing, SSAS offers semi-additive aggregate functions out of the box. For example, SSAS offers the LastNonEmpty aggregate function, which properly uses the SUM aggregate function across all dimensions but time, and defines the last known value as the aggregate over time.
  • 37. In the BISM tabular model, you use the Data Analysis Expressions (DAX) language. The DAX language includes functions that let you build semi-additive expressions quite quickly as well.
  • 38. Fact tables: a collection of measurements on a specific aspect of the business. Measure columns: for example, sales amount, order quantity, and discount amount.