SOFTWARE/WEB/MOBILE/DATABASE ARCHITECT, ENGINEER, AND DEVELOPER
TORONTO, CANADA
HTTP://SAYED.JUSTETC.NET
HTTP://WWW.JUSTETC.NET
Sayed Ahmed
Logical Design of a Data Warehouse
OUR SERVICES
 Free Training and Educational Services
 Training and Education in Bangla:
 Bangla.SaLearningSchool.com
 Training and Education in English:
 www.SaLearningSchool.com
 English.SaLearningSchool.com
 Ask a question and get answers:
 Ask.JustEtc.net
TOPICS - KEYWORDS
 Design a Data Warehouse
 Star Schema
 Snow Flake Schema
 Dimension Tables
 Fact Tables
 Auditing
 Surrogate Keys
 Type 1, Type 2, Type 3, and mixed solutions for slowly changing dimension data (SCD management)
 Pivoting for Analysis
 To help with SSAS on data warehouse
TOPICS - KEYWORDS
 Design a Data Warehouse
 Additive measures
 Semi additive measures
 Hierarchies for dimensions
 Attributes in dimensions
 Attributes in lookup tables
 Long term data warehouse design
 Usually Star Schema
 Short term data warehouse design
 POC
 Usually snowflake schema
TOPICS - KEYWORDS
 Fact Tables
 measures
 foreign keys
 and possibly an additional primary key
 and lineage columns
 granularity of fact tables
 auditing and lineage needs
 Measures can be
 additive
 non-additive
 semi-additive
TOPICS - KEYWORDS
 dimension
 keys
 names
 attributes
 member properties
 translations
 and lineage
TOPICS - KEYWORDS
 attributes
 natural hierarchies
 many-to-many fact table relationships
 you can introduce an additional intermediate
dimension
CONCLUSION
 That is not much, right?
 Still, it will be great if you understand all of these terms and can implement the concepts in your data warehouse
 You will not necessarily need every concept; judge for each situation which ones will help and which will not
 Check our subsequent videos and tutorials
THANK YOU
 Any Concerns?
 http://ask.justetc.net
 Or comment below...
TOOLS AND SOFTWARE REQUIREMENTS
 Download the Adventure Works databases
 OLTP database (LOB database)
 Data warehouse Database
 From
 http://msftdbprodsamples.codeplex.com/releases/view/55330
 For this tutorial, you can just follow our slides
 Though the following tools will help
 You may also want to check the details in the downloaded databases, especially AdventureWorksDW2012
 For that you will need SQL Server and the SQL Server Management Studio tools
REQUIRED TOOLS
 Useful/Required SQL Server Components
 Database Engine Services
 Documentation Components
 Management Tools - Basic
 Management Tools – Complete
 SQL Server Data Tools
DATA WAREHOUSE DESIGN – THE DETAILS
 Data Warehouse Logical Design
 Topics: Design and Implement a Data Warehouse
 Design and implement dimensions
 Design and implement fact tables
 Design auditing
 Track the source and load time of data coming into a DW through auditing, i.e. lineage information
 Why a Data Warehouse?
 With an OLTP/LOB/transactional database it is hard to
 generate reports
 do analysis on the data (sometimes)
 get useful summarized and detailed data for making business decisions
DATA WAREHOUSE DESIGN – THE DETAILS
 Why a Data Warehouse?
 Data in OLTP is heavily normalized. The goal was to keep each piece of data in a single place, to reduce redundancy and improve consistency
 You may end up with many tables: 100s or even 1000s
 To generate reports you may need to join many tables, which will be slow
 Historical data may not be there
 Data quality is also an issue
 For reporting or analysis, you may need data from multiple databases across many departments
WHY A DATA WAREHOUSE?
 So you can create a Data Warehouse
 By cleaning data
 With historical data
 Combining data from multiple sources
 Denormalizing data
 Using a design geared specifically towards data warehousing
 Some consider DW design less complex than relational database design
 Though it also has some complex areas to address
SO WHAT DOES A DATA WAREHOUSE CONTAIN?
 Usually one of two schemas is used for a DW
 Star schema: looks like a star
 Snowflake schema
 The term Dimensional Model covers both
 A single Data Warehouse can include both Star and Snowflake schemas
 Both schemas have tables of two types
 Dimension Tables
 Fact Tables
SO WHAT DOES A DATA WAREHOUSE CONTAIN?
 Fact tables are in the center
 A fact table brings together the data required for reporting on one business aspect
 It usually holds the primary keys of the tables that contain data for that report/business aspect (in the fact table they are foreign keys), plus the measures
 Dimension tables are all the other tables; they contain the descriptive data
 They can be the tables of the OLTP database taken without modification (Snowflake)
 Or they can be newly created by denormalizing the tables of the existing OLTP databases (Star)
SO WHAT DOES A DATA WAREHOUSE CONTAIN?
 So now you know what dimension tables and fact tables are
 Fact tables contain the primary keys of all related tables (here they are foreign keys)
 Dimension tables contain data
 Usually, it is better to keep your data warehouse separate from your OLTP database
 So bring all the (dimension) tables into the new database as they are
 Or denormalize them and bring them into the new database
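To make the fact/dimension idea concrete, here is a minimal sketch using Python's built-in sqlite3 module instead of SQL Server. The table names follow the AdventureWorksDW naming style, but the columns and the sample rows are simplified assumptions made up for illustration.

```python
import sqlite3

# In-memory database standing in for a separate data warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimCustomer (CustomerKey INTEGER PRIMARY KEY, Name TEXT, Country TEXT);
CREATE TABLE DimDate (DateKey INTEGER PRIMARY KEY, CalendarYear INTEGER);
-- The fact table holds foreign keys to the dimensions plus a measure.
CREATE TABLE FactInternetSales (
    CustomerKey INTEGER REFERENCES DimCustomer(CustomerKey),
    DateKey INTEGER REFERENCES DimDate(DateKey),
    SalesAmount REAL
);
INSERT INTO DimCustomer VALUES (1, 'Ann', 'Canada'), (2, 'Bob', 'France');
INSERT INTO DimDate VALUES (20110101, 2011), (20120101, 2012);
INSERT INTO FactInternetSales VALUES
    (1, 20110101, 100.0), (1, 20120101, 150.0),
    (2, 20120101, 200.0), (2, 20120101, 50.0);
""")

# Sales amount by country and year: one join per dimension.
sales_by_country_year = conn.execute("""
SELECT c.Country, d.CalendarYear, SUM(f.SalesAmount)
FROM FactInternetSales AS f
JOIN DimCustomer AS c ON c.CustomerKey = f.CustomerKey
JOIN DimDate AS d ON d.DateKey = f.DateKey
GROUP BY c.Country, d.CalendarYear
ORDER BY c.Country, d.CalendarYear
""").fetchall()
print(sales_by_country_year)
```

The report needs only one join per dimension, which is the main practical attraction of this layout.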
SIMPLIFIED: WHAT ARE STAR AND SNOWFLAKE SCHEMAS
 If you just create fact tables and take all the related tables from your OLTP/LOB databases
 You get a Snowflake schema
 Here all dimension tables are still normalized (as you just took them from the actual database)
 This is easy
 so it is good for a short-term, quick, and experimental Data Warehouse
 One note: your reporting and analysis services queries (MDX, DMV) will be slower with Snowflake schemas, because they need more joins
SO WHAT DOES A DATA WAREHOUSE CONTAIN?
 Now, when you denormalize the dimension tables
 You get the Star schema
 The fact tables remain the same
 The Star schema is a kind of standard and is used a lot
 It was originally developed in the 1980s
EXAMPLES: WHY REPORTING IN OLTP DATABASE IS NOT A GREAT IDEA
Sales amount for internet sales by different countries and historical years
WHY REPORTING IN OLTP DATABASE IS NOT A GREAT IDEA
 Issues that I did not mention before
 Even if your OLTP database is well designed
 It may be hard to find the tables related to the report
 The table names and the column names can be tricky: they may not follow any convention or carry any meaning
 So it can be hard to find the data for the report
WHY REPORTING IN OLTP DATABASE IS NOT A GREAT IDEA
 Note: Reality:
 The OLTP database may not even be well designed, in its relationships as well as its normalization, which makes reporting harder still
 Above we assumed that the OLTP design is perfect
 In a project long ago
 I had to rewrite, verify, and optimize close to 100 queries for a reporting system
 I had to change the interface from one button per report (easy to get lost)
 Into a drop-down list of reports
 The relations among the data were arbitrary: they existed only in the mind of the designer and followed no standards, no ER model, no standard concepts
 So it was a hard job
WHY REPORTING IN OLTP DATABASE IS NOT A GREAT IDEA
 In such cases
 Tools such as SQL Profiler might help
 You could create a test environment,
 insert some data through an LOB application,
 and have SQL Profiler identify where the data was inserted
 Another issue with this particular example
 There is no lookup table for dates and years
 You need to extract the years from date columns
 The tables may not even contain historical data
 With no date field
 there is no historical data
WHY REPORTING IN OLTP DATABASE IS NOT A GREAT IDEA
 If sales data resides in multiple databases, possibly owned by multiple departments
 How do you merge it?
 How do you identify and match records?
 Customer data can be in different databases with no common identification
 Data quality can be low
 Missing data
 Partial data
 Inconsistent data in multiple databases
 Data can be represented differently in different databases
 M or F for gender
 1 or 0 for gender
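As a small illustration of the cleaning step this implies, the sketch below maps both representations onto one code before loading. The function name and the mapping (1 meaning male, 0 meaning female) are assumptions for the example only; the real meaning of each code has to be confirmed with the owner of each source system.

```python
# Assumed mapping from source codes to one warehouse code.
# Whether 1 means male or female must be verified per source system.
GENDER_CODES = {"M": "M", "F": "F", "1": "M", "0": "F"}

def normalize_gender(raw):
    """Return 'M', 'F', or None when the value is missing or unknown."""
    if raw is None:
        return None
    return GENDER_CODES.get(str(raw).strip().upper())

print([normalize_gender(v) for v in ["m", "F", 1, 0, "x", None]])
```

Unknown values come back as None rather than being guessed, so low-quality rows can be counted and reviewed instead of silently loaded.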
STAR SCHEMA/FACT/DIMENSION/CUBE
TOTAL DW: MULTIPLE STAR SCHEMAS
 You saw one Star schema for Internet Sales
 You can have another for Offline Sales
 Another for Accounting
 Your DW has many such Star schemas
 And these Star schemas need to be connected/related
 They are connected when you use the same dimensions for them
 i.e. if two Star schemas have the same dimension, they can share that dimension
 Called: shared or conformed dimensions
 For SSAS, you can use shared dimensions only
 There is also a concept of private dimensions
 Not a great idea in practical, real-life applications
 With private dimensions you cannot connect/compare/verify the data of different fact tables over a common dimension
SHARED/CONFORMED DIMENSIONS
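A minimal sketch of what a shared dimension buys you, again with Python's sqlite3 module. FactOfflineSales and all sample values are hypothetical; the point is only that both fact tables reference the same DimDate, so their totals can be lined up per year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimDate (DateKey INTEGER PRIMARY KEY, CalendarYear INTEGER);
-- Two fact tables (two star schemas) that share the DimDate dimension.
CREATE TABLE FactInternetSales (DateKey INTEGER REFERENCES DimDate(DateKey), SalesAmount REAL);
CREATE TABLE FactOfflineSales (DateKey INTEGER REFERENCES DimDate(DateKey), SalesAmount REAL);
INSERT INTO DimDate VALUES (20110101, 2011), (20120101, 2012);
INSERT INTO FactInternetSales VALUES (20110101, 100.0), (20120101, 300.0);
INSERT INTO FactOfflineSales VALUES (20110101, 400.0), (20120101, 250.0), (20120101, 50.0);
""")

# Aggregate each fact table separately, then line the totals up by year.
comparison = conn.execute("""
WITH internet AS (
    SELECT DateKey, SUM(SalesAmount) AS Amount FROM FactInternetSales GROUP BY DateKey
), offline AS (
    SELECT DateKey, SUM(SalesAmount) AS Amount FROM FactOfflineSales GROUP BY DateKey
)
SELECT d.CalendarYear, SUM(i.Amount), SUM(o.Amount)
FROM DimDate AS d
LEFT JOIN internet AS i ON i.DateKey = d.DateKey
LEFT JOIN offline AS o ON o.DateKey = d.DateKey
GROUP BY d.CalendarYear
ORDER BY d.CalendarYear
""").fetchall()
print(comparison)
```

Each fact table is aggregated on its own before joining, so rows from one fact table do not multiply rows from the other.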
DENORMALIZED DIMDATE TABLE
SNOW FLAKES WILL BE MORE AND MORE NORMALIZED
 Everything can be normalized
 Or only the first level can be normalized while the others are not
NORMALIZED PRODUCT DIMENSION
In the Star schema, you could use these normalized product tables to get a (partial) Snowflake schema. You could use all normalized dimensions to get a full Snowflake schema.
SNOW FLAKE
 In reality, you may see partial Snowflake schemas more often than full ones
 Though, in reality, it is better to go for the Star schema
 Queries will be faster
PARTIAL SNOW FLAKE
GRANULARITY
 Determined by the number of dimension tables connected to a fact table
 Also called the dimensionality of a Star schema
 Cube = 3 dimensions (an SSAS cube can have more)
 SSAS operates on/analyzes cubes
AUDITING AND LINEAGE
 I will be very brief on this
 In a data warehouse, you may want some auditing tables
 For every update, you should audit
 who made the update,
 when it was made,
 and how many rows were transferred
 to each dimension and
 fact table
 in your DW
AUDITING AND LINEAGE
 You will need additional fields/columns in your dimension and fact tables to track
 when, by whom, and from where the row data was updated
 Your ETL process needs to be updated
 If you used SSIS for the ETL
 modify the SSIS packages so that they record this information
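The sketch below shows one possible shape of this, in plain Python with sqlite3 rather than SSIS. The EtlAudit table, the lineage column names, and the load_customers function are all hypothetical names chosen for illustration.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimCustomer (
    CustomerKey INTEGER PRIMARY KEY,
    Name TEXT,
    LineageSource TEXT,    -- from where the row came
    LineageLoadId INTEGER  -- which load wrote the row
);
CREATE TABLE EtlAudit (
    LoadId INTEGER PRIMARY KEY AUTOINCREMENT,
    TableName TEXT, LoadedBy TEXT, LoadedAt TEXT, RowsTransferred INTEGER
);
""")

def load_customers(names, source, user):
    """Insert customers and record who loaded them, when, and how many rows."""
    loaded_at = datetime.now(timezone.utc).isoformat()
    with conn:  # the audit row and the data rows are committed together
        cur = conn.execute(
            "INSERT INTO EtlAudit (TableName, LoadedBy, LoadedAt, RowsTransferred) "
            "VALUES (?, ?, ?, ?)",
            ("DimCustomer", user, loaded_at, len(names)),
        )
        load_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO DimCustomer (Name, LineageSource, LineageLoadId) VALUES (?, ?, ?)",
            [(name, source, load_id) for name in names],
        )
    return load_id

load_id = load_customers(["Ann", "Bob"], source="CRM", user="etl_service")
print(conn.execute("SELECT TableName, LoadedBy, RowsTransferred FROM EtlAudit").fetchall())
```

Every dimension row carries the id of the load that wrote it, so a suspicious row can be traced back to its audit record.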
DESIGNING DIMENSIONS
 Keys: used to identify entities
 Name columns: used for human names of entities
 Attributes: used for pivoting in analyses
 Member properties: used for labels in a report
 Lineage columns: used for auditing, and never exposed to end users
 For analysis
 Pivot Table
 Pivot Graph
 For Dimensions
 The columns used for pivoting are called
 Attributes
 Not all columns are attributes
 Attributes: the columns on which analysis is based
 On the previous slide, you saw the different types of columns
 Attributes
 For pivoting, discrete attributes with a small number of distinct values are most appropriate
 Should not be continuous
 Keys are not good candidates for pivoting and analysis
 To use a continuous column for pivoting
 Convert it into a small set of discrete values
 SSAS can discretize continuous attributes
 Not always great: you need the business perspective as well
 Age and Income are not good candidates for automatic discretization
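As a sketch of business-defined discretization, the function below turns a continuous age into one of a few bands. The boundaries and labels are made-up assumptions; in practice they should come from the business users, which is the point of not relying on automatic discretization.

```python
from bisect import bisect_right

# Hypothetical band boundaries agreed with the business, not computed from data.
AGE_BOUNDARIES = [30, 45, 60]
AGE_LABELS = ["Under 30", "30-44", "45-59", "60 and over"]

def age_band(age):
    """Map a continuous age to one of a small set of discrete labels."""
    return AGE_LABELS[bisect_right(AGE_BOUNDARIES, age)]

print([age_band(a) for a in (29, 30, 44, 45, 60, 75)])
```

The resulting label column has only four distinct values, which makes it a usable pivoting attribute.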
 Name columns identify the entity
 Not good for pivoting or as keys
 Address such as
 Columns used in reports as labels only, not for
pivoting, are called member properties.
 Can include translations
 Lineage and auditing columns
 Used for auditing data
 Never exposed to the users
 Possible Attributes
 BirthDate (after calculating age and discretizing the age)
 MaritalStatus
 Gender
 YearlyIncome (after discretizing)
 TotalChildren
 NumberChildrenAtHome
 EnglishEducation (other education columns are for translations)
 EnglishOccupation (other occupation columns are for
translations)
 HouseOwnerFlag
 NumberCarsOwned
 CommuteDistance
 FullDateAlternateKey (denotes a date in date
format)
 EnglishMonthName
 CalendarQuarter
 CalendarSemester
 CalendarYear
 Drill Down attributes
 CalendarYear → CalendarSemester → CalendarQuarter → EnglishMonthName → FullDateAlternateKey
 Dimension columns used in reports only for labels describe the members of a dimension, which is why they are called member properties.
 In a Snowflake schema, lookup tables show you
levels of hierarchies. In a Star schema, you
need to extract natural hierarchies from the
names and content of columns. Nevertheless,
because drilling down through natural
hierarchies is so useful and welcomed by end
users, you should use them as much as
possible.
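Drilling down through a natural hierarchy amounts to grouping the same measure by one more level each time. The sketch below does this in plain Python for the year, semester, and quarter levels; the rows and the total_by_level helper are hypothetical.

```python
from collections import defaultdict

# Sample rows: (CalendarYear, CalendarSemester, CalendarQuarter, SalesAmount)
rows = [
    (2012, 1, 1, 100.0),
    (2012, 1, 2, 150.0),
    (2012, 2, 3, 200.0),
    (2013, 1, 1, 50.0),
]

def total_by_level(rows, depth):
    """Sum SalesAmount grouped by the first `depth` levels of the hierarchy."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[:depth]] += row[-1]
    return dict(totals)

by_year = total_by_level(rows, 1)      # top level
by_semester = total_by_level(rows, 2)  # drill down one level
by_quarter = total_by_level(rows, 3)   # drill down two levels
print(by_year, by_semester, by_quarter, sep="\n")
```

Each deeper level splits the totals of the level above without changing their sum, which is what makes the hierarchy natural.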
SLOWLY CHANGING DIMENSIONS
 Type 1
 History lost
 Type 2
 Keeps all history
 Type 3
 Keeps partial history
 You can use a combination
 For some columns Type 1, for others Type 2
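The three types can be sketched on plain Python dictionaries standing in for dimension rows. The column names (CustomerKey as surrogate key, CustomerId as business key, ValidFrom, ValidTo, PreviousCity) are assumptions for illustration, not taken from a specific database.

```python
from datetime import date

def apply_type1(rows, customer_id, new_city):
    """Type 1: overwrite in place, so the old value is lost."""
    for row in rows:
        if row["CustomerId"] == customer_id:
            row["City"] = new_city

def apply_type2(rows, customer_id, new_city, change_date):
    """Type 2: close the current row and add a new row with a new surrogate key."""
    for row in rows:
        if row["CustomerId"] == customer_id and row["ValidTo"] is None:
            row["ValidTo"] = change_date
    next_key = max(row["CustomerKey"] for row in rows) + 1
    rows.append({"CustomerKey": next_key, "CustomerId": customer_id,
                 "City": new_city, "ValidFrom": change_date, "ValidTo": None})

def apply_type3(rows, customer_id, new_city):
    """Type 3: keep only the previous value in an extra column."""
    for row in rows:
        if row["CustomerId"] == customer_id:
            row["PreviousCity"] = row["City"]
            row["City"] = new_city

type1_rows = [{"CustomerKey": 1, "CustomerId": "C1", "City": "Toronto"}]
apply_type1(type1_rows, "C1", "Ottawa")

type2_rows = [{"CustomerKey": 1, "CustomerId": "C1", "City": "Toronto",
               "ValidFrom": date(2010, 1, 1), "ValidTo": None}]
apply_type2(type2_rows, "C1", "Ottawa", date(2012, 6, 1))

type3_rows = [{"CustomerKey": 1, "CustomerId": "C1", "City": "Toronto", "PreviousCity": None}]
apply_type3(type3_rows, "C1", "Ottawa")
print(type1_rows, type2_rows, type3_rows, sep="\n")
```

Type 2 is the variant that needs a surrogate key, because the same business key now appears in more than one row.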
DESIGNING FACT TABLES
 Fact tables include measures, foreign keys,
and possibly an additional primary key and
lineage columns.
 Measures can be additive, non-additive, or
semi-additive.
 For many-to-many relationships, you can
introduce an additional intermediate
dimension.
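One way to picture the many-to-many case is a sale that has several sales reasons. The sketch below, using Python's sqlite3 module, puts an intermediate table between the fact table and the dimension; the table names and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FactSales (SalesKey INTEGER PRIMARY KEY, SalesAmount REAL);
CREATE TABLE DimSalesReason (ReasonKey INTEGER PRIMARY KEY, ReasonName TEXT);
-- Intermediate table: one row per (sale, reason) pair.
CREATE TABLE SalesReasonBridge (
    SalesKey INTEGER REFERENCES FactSales(SalesKey),
    ReasonKey INTEGER REFERENCES DimSalesReason(ReasonKey),
    PRIMARY KEY (SalesKey, ReasonKey)
);
INSERT INTO FactSales VALUES (1, 100.0), (2, 40.0);
INSERT INTO DimSalesReason VALUES (1, 'Price'), (2, 'Promotion');
INSERT INTO SalesReasonBridge VALUES (1, 1), (1, 2), (2, 1);
""")

sales_by_reason = conn.execute("""
SELECT r.ReasonName, SUM(f.SalesAmount)
FROM FactSales AS f
JOIN SalesReasonBridge AS b ON b.SalesKey = f.SalesKey
JOIN DimSalesReason AS r ON r.ReasonKey = b.ReasonKey
GROUP BY r.ReasonName
ORDER BY r.ReasonName
""").fetchall()
grand_total = conn.execute("SELECT SUM(SalesAmount) FROM FactSales").fetchone()[0]
print(sales_by_reason, grand_total)
```

Note that the per-reason totals add up to more than the grand total, because sale 1 is counted under both of its reasons; reports over a many-to-many relationship need to make that clear.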
 Fact tables
 A collection of measurements on a specific aspect of the business
 Measure columns
 For example sales amount, order quantity, and discount amount
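To show the difference between additive and semi-additive measures, the sketch below uses a sales amount (additive) next to a stock level (semi-additive). UnitsInStock and the sample rows are assumptions added for the example; they are not among the measure columns listed above.

```python
# Sample rows: (Month, Product, SalesAmount, UnitsInStock at month end)
rows = [
    ("2012-01", "Bike", 100.0, 10),
    ("2012-01", "Helmet", 50.0, 30),
    ("2012-02", "Bike", 200.0, 8),
    ("2012-02", "Helmet", 25.0, 28),
]

# Additive: SalesAmount can be summed over every dimension, including time.
total_sales = sum(r[2] for r in rows)

# Semi-additive: UnitsInStock can be summed over products,
# but over time you take the last period instead of the sum.
last_month = max(r[0] for r in rows)
closing_stock = sum(r[3] for r in rows if r[0] == last_month)

# Summing the stock level over months as well gives a meaningless number.
misleading_stock_sum = sum(r[3] for r in rows)
print(total_sales, closing_stock, misleading_stock_sum)
```

A non-additive measure, such as a price or a percentage, cannot be summed over any dimension and usually needs an average or a recalculation instead.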
Data ware house design

More Related Content

What's hot

Data warehouse system and its concepts
Data warehouse system and its conceptsData warehouse system and its concepts
Data warehouse system and its conceptsGaurav Garg
 
Data warehouse design
Data warehouse designData warehouse design
Data warehouse designines beltaief
 
Olap and metadata
Olap and metadata Olap and metadata
Olap and metadata Punk Milton
 
Tuning data warehouse
Tuning data warehouseTuning data warehouse
Tuning data warehouseSrinivasan R
 
Business analysis in data warehousing
Business analysis in data warehousingBusiness analysis in data warehousing
Business analysis in data warehousingHimanshu
 
Using SSRS Reports with SSAS Cubes
Using SSRS Reports with SSAS CubesUsing SSRS Reports with SSAS Cubes
Using SSRS Reports with SSAS CubesCode Mastery
 
Data Warehouse Best Practices
Data Warehouse Best PracticesData Warehouse Best Practices
Data Warehouse Best PracticesEduardo Castro
 
Data warehouse architecture
Data warehouse architecture Data warehouse architecture
Data warehouse architecture janani thirupathi
 
Lecture 03 - The Data Warehouse and Design
Lecture 03 - The Data Warehouse and Design Lecture 03 - The Data Warehouse and Design
Lecture 03 - The Data Warehouse and Design phanleson
 
02. Data Warehouse and OLAP
02. Data Warehouse and OLAP02. Data Warehouse and OLAP
02. Data Warehouse and OLAPAchmad Solichin
 
OLAP & Data Warehouse
OLAP & Data WarehouseOLAP & Data Warehouse
OLAP & Data WarehouseZalpa Rathod
 
An overview of data warehousing and OLAP technology
An overview of  data warehousing and OLAP technology An overview of  data warehousing and OLAP technology
An overview of data warehousing and OLAP technology Nikhatfatima16
 
Introducing to Datamining vs. OLAP - مقدمه و مقایسه ای بر داده کاوی و تحلیل ...
Introducing to Datamining vs. OLAP -  مقدمه و مقایسه ای بر داده کاوی و تحلیل ...Introducing to Datamining vs. OLAP -  مقدمه و مقایسه ای بر داده کاوی و تحلیل ...
Introducing to Datamining vs. OLAP - مقدمه و مقایسه ای بر داده کاوی و تحلیل ...y-asgari
 
Steps To Build A Datawarehouse
Steps To Build A DatawarehouseSteps To Build A Datawarehouse
Steps To Build A DatawarehouseHendra Saputra
 

What's hot (20)

Data warehouse system and its concepts
Data warehouse system and its conceptsData warehouse system and its concepts
Data warehouse system and its concepts
 
Data Warehouse
Data Warehouse Data Warehouse
Data Warehouse
 
Datawarehouse olap olam
Datawarehouse olap olamDatawarehouse olap olam
Datawarehouse olap olam
 
Data warehouse logical design
Data warehouse logical designData warehouse logical design
Data warehouse logical design
 
Data warehouse design
Data warehouse designData warehouse design
Data warehouse design
 
Olap and metadata
Olap and metadata Olap and metadata
Olap and metadata
 
Tuning data warehouse
Tuning data warehouseTuning data warehouse
Tuning data warehouse
 
Business analysis in data warehousing
Business analysis in data warehousingBusiness analysis in data warehousing
Business analysis in data warehousing
 
Using SSRS Reports with SSAS Cubes
Using SSRS Reports with SSAS CubesUsing SSRS Reports with SSAS Cubes
Using SSRS Reports with SSAS Cubes
 
Data Warehouse Best Practices
Data Warehouse Best PracticesData Warehouse Best Practices
Data Warehouse Best Practices
 
Data warehouse architecture
Data warehouse architecture Data warehouse architecture
Data warehouse architecture
 
Lecture 03 - The Data Warehouse and Design
Lecture 03 - The Data Warehouse and Design Lecture 03 - The Data Warehouse and Design
Lecture 03 - The Data Warehouse and Design
 
05 OLAP v6 weekend
05 OLAP  v6 weekend05 OLAP  v6 weekend
05 OLAP v6 weekend
 
02. Data Warehouse and OLAP
02. Data Warehouse and OLAP02. Data Warehouse and OLAP
02. Data Warehouse and OLAP
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
OLAP & Data Warehouse
OLAP & Data WarehouseOLAP & Data Warehouse
OLAP & Data Warehouse
 
Hadoop & Data Warehouse
Hadoop & Data Warehouse Hadoop & Data Warehouse
Hadoop & Data Warehouse
 
An overview of data warehousing and OLAP technology
An overview of  data warehousing and OLAP technology An overview of  data warehousing and OLAP technology
An overview of data warehousing and OLAP technology
 
Introducing to Datamining vs. OLAP - مقدمه و مقایسه ای بر داده کاوی و تحلیل ...
Introducing to Datamining vs. OLAP -  مقدمه و مقایسه ای بر داده کاوی و تحلیل ...Introducing to Datamining vs. OLAP -  مقدمه و مقایسه ای بر داده کاوی و تحلیل ...
Introducing to Datamining vs. OLAP - مقدمه و مقایسه ای بر داده کاوی و تحلیل ...
 
Steps To Build A Datawarehouse
Steps To Build A DatawarehouseSteps To Build A Datawarehouse
Steps To Build A Datawarehouse
 

Viewers also liked

Key Information Sets Data
Key Information Sets DataKey Information Sets Data
Key Information Sets DataIWMW
 
Beyond WCAG: Implementing BS8878
Beyond WCAG: Implementing BS8878Beyond WCAG: Implementing BS8878
Beyond WCAG: Implementing BS8878IWMW
 
Threat predictions 2011
Threat predictions 2011 Threat predictions 2011
Threat predictions 2011 Trend Micro
 
Encryption in the Public Cloud: 16 Bits of Advice for Security Techniques
Encryption in the Public Cloud: 16 Bits of Advice for Security TechniquesEncryption in the Public Cloud: 16 Bits of Advice for Security Techniques
Encryption in the Public Cloud: 16 Bits of Advice for Security TechniquesTrend Micro
 
Home Delivery Scheme of Foodgrains
Home Delivery Scheme of FoodgrainsHome Delivery Scheme of Foodgrains
Home Delivery Scheme of FoodgrainsSheetal Kachare
 
data information and management system user prespective
data information and management system user prespectivedata information and management system user prespective
data information and management system user prespectiveVinay Shankar
 
Set data structure
Set data structure Set data structure
Set data structure Tech_MX
 
Building the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architectureBuilding the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architecturemark madsen
 
Data ware house architecture
Data ware house architectureData ware house architecture
Data ware house architectureDeepak Chaurasia
 
Stores, warehouse and godown
Stores, warehouse and godownStores, warehouse and godown
Stores, warehouse and godownPramod Wadikar
 
Computer Security and Safety, Ethics & Privacy
Computer Security and Safety, Ethics & PrivacyComputer Security and Safety, Ethics & Privacy
Computer Security and Safety, Ethics & PrivacySamudin Kassan
 
Data flow in a computer
Data flow in a computerData flow in a computer
Data flow in a computerOriginalGSM
 
Master Data in Material Management
Master Data in Material ManagementMaster Data in Material Management
Master Data in Material ManagementGanesh Padala
 
Computer technology in library and information science notes in hindi
Computer technology in library and information science notes in hindiComputer technology in library and information science notes in hindi
Computer technology in library and information science notes in hindiMohammad Rehan
 

Viewers also liked (20)

Key Information Sets Data
Key Information Sets DataKey Information Sets Data
Key Information Sets Data
 
Beyond WCAG: Implementing BS8878
Beyond WCAG: Implementing BS8878Beyond WCAG: Implementing BS8878
Beyond WCAG: Implementing BS8878
 
Threat predictions 2011
Threat predictions 2011 Threat predictions 2011
Threat predictions 2011
 
Encryption in the Public Cloud: 16 Bits of Advice for Security Techniques
Encryption in the Public Cloud: 16 Bits of Advice for Security TechniquesEncryption in the Public Cloud: 16 Bits of Advice for Security Techniques
Encryption in the Public Cloud: 16 Bits of Advice for Security Techniques
 
Home Delivery Scheme of Foodgrains
Home Delivery Scheme of FoodgrainsHome Delivery Scheme of Foodgrains
Home Delivery Scheme of Foodgrains
 
Godown management
Godown managementGodown management
Godown management
 
data information and management system user prespective
data information and management system user prespectivedata information and management system user prespective
data information and management system user prespective
 
Team building
Team buildingTeam building
Team building
 
Hardware Trends – Back to Fundamentals
Hardware Trends – Back to FundamentalsHardware Trends – Back to Fundamentals
Hardware Trends – Back to Fundamentals
 
Set data structure
Set data structure Set data structure
Set data structure
 
1 hardware fundamentals
1 hardware fundamentals1 hardware fundamentals
1 hardware fundamentals
 
Building the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architectureBuilding the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architecture
 
Data ware house architecture
Data ware house architectureData ware house architecture
Data ware house architecture
 
Stores, warehouse and godown
Stores, warehouse and godownStores, warehouse and godown
Stores, warehouse and godown
 
Computer Security and Safety, Ethics & Privacy
Computer Security and Safety, Ethics & PrivacyComputer Security and Safety, Ethics & Privacy
Computer Security and Safety, Ethics & Privacy
 
Data flow in a computer
Data flow in a computerData flow in a computer
Data flow in a computer
 
Master Data in Material Management
Master Data in Material ManagementMaster Data in Material Management
Master Data in Material Management
 
Computer technology in library and information science notes in hindi
Computer technology in library and information science notes in hindiComputer technology in library and information science notes in hindi
Computer technology in library and information science notes in hindi
 
Data Warehouse 101
Data Warehouse 101Data Warehouse 101
Data Warehouse 101
 
Comp 107chp 1
Comp 107chp 1Comp 107chp 1
Comp 107chp 1
 

Similar to Data ware house design

The thinking persons guide to data warehouse design
The thinking persons guide to data warehouse designThe thinking persons guide to data warehouse design
The thinking persons guide to data warehouse designCalpont
 
Business Intelligence with SQL Server
Business Intelligence with SQL ServerBusiness Intelligence with SQL Server
Business Intelligence with SQL ServerPeter Gfader
 
Whats a datawarehouse
Whats a datawarehouseWhats a datawarehouse
Whats a datawarehousevijjudarling
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake OverviewJames Serra
 
Data Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesData Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesIvo Andreev
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?James Serra
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lakeJames Serra
 
Waiting too long for Excel's VLOOKUP? Use SQLite for simple data analysis!
Waiting too long for Excel's VLOOKUP? Use SQLite for simple data analysis!Waiting too long for Excel's VLOOKUP? Use SQLite for simple data analysis!
Waiting too long for Excel's VLOOKUP? Use SQLite for simple data analysis!Amanda Lam
 
Sql Server 2005 Business Inteligence
Sql Server 2005 Business InteligenceSql Server 2005 Business Inteligence
Sql Server 2005 Business Inteligenceabercius24
 
Fact table design for data ware house
Fact table design for data ware houseFact table design for data ware house
Fact table design for data ware houseSayed Ahmed
 
Fact table design for data ware house
Fact table design for data ware houseFact table design for data ware house
Fact table design for data ware houseSayed Ahmed
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureJames Serra
 
SQL Server Denali: BI on Your Terms
SQL Server Denali: BI on Your Terms SQL Server Denali: BI on Your Terms
SQL Server Denali: BI on Your Terms Andrew Brust
 
Schemas for multidimensional databases
Schemas for multidimensional databasesSchemas for multidimensional databases
Schemas for multidimensional databasesyazad dumasia
 

Similar to Data ware house design (20)

The thinking persons guide to data warehouse design
The thinking persons guide to data warehouse designThe thinking persons guide to data warehouse design
The thinking persons guide to data warehouse design
 
Business Intelligence with SQL Server
Business Intelligence with SQL ServerBusiness Intelligence with SQL Server
Business Intelligence with SQL Server
 
Whats a datawarehouse
Whats a datawarehouseWhats a datawarehouse
Whats a datawarehouse
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake Overview
 
Data Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesData Warehouse Design and Best Practices
Data Warehouse Design and Best Practices
 
CS636-olap.ppt
CS636-olap.pptCS636-olap.ppt
CS636-olap.ppt
 
Dwh faqs
Dwh faqsDwh faqs
Dwh faqs
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?
 
Tableau Desktop Material
Tableau Desktop MaterialTableau Desktop Material
Tableau Desktop Material
 
SAS/Tableau integration
SAS/Tableau integrationSAS/Tableau integration
SAS/Tableau integration
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
 
ETL
ETL ETL
ETL
 
Waiting too long for Excel's VLOOKUP? Use SQLite for simple data analysis!
Waiting too long for Excel's VLOOKUP? Use SQLite for simple data analysis!Waiting too long for Excel's VLOOKUP? Use SQLite for simple data analysis!
Waiting too long for Excel's VLOOKUP? Use SQLite for simple data analysis!
 
Sql Server 2005 Business Inteligence
Sql Server 2005 Business InteligenceSql Server 2005 Business Inteligence
Sql Server 2005 Business Inteligence
 
Fact table design for data ware house
Fact table design for data ware houseFact table design for data ware house
Fact table design for data ware house
 
Fact table design for data ware house
Fact table design for data ware houseFact table design for data ware house
Fact table design for data ware house
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse Architecture
 
SQL Server Denali: BI on Your Terms
SQL Server Denali: BI on Your Terms SQL Server Denali: BI on Your Terms
SQL Server Denali: BI on Your Terms
 
Schemas for multidimensional databases
Schemas for multidimensional databasesSchemas for multidimensional databases
Schemas for multidimensional databases
 
It ready dw_day3_rev00
It ready dw_day3_rev00It ready dw_day3_rev00
It ready dw_day3_rev00
 

More from Sayed Ahmed

Workplace, Data Analytics, and Ethics
Workplace, Data Analytics, and EthicsWorkplace, Data Analytics, and Ethics
Workplace, Data Analytics, and EthicsSayed Ahmed
 
Python py charm anaconda jupyter installation and basic commands
Python py charm anaconda jupyter   installation and basic commandsPython py charm anaconda jupyter   installation and basic commands
Python py charm anaconda jupyter installation and basic commandsSayed Ahmed
 
[not edited] Demo on mobile app development using ionic framework
[not edited] Demo on mobile app development using ionic framework[not edited] Demo on mobile app development using ionic framework
[not edited] Demo on mobile app development using ionic frameworkSayed Ahmed
 
Sap hana-ide-overview-nodev
Sap hana-ide-overview-nodevSap hana-ide-overview-nodev
Sap hana-ide-overview-nodevSayed Ahmed
 
Will be an introduction to
Will be an introduction toWill be an introduction to
Will be an introduction toSayed Ahmed
 
Whm and cpanel overview hosting control panel overview
Whm and cpanel overview   hosting control panel overviewWhm and cpanel overview   hosting control panel overview
Whm and cpanel overview hosting control panel overviewSayed Ahmed
 
Web application development using zend framework
Web application development using zend frameworkWeb application development using zend framework
Web application development using zend frameworkSayed Ahmed
 
Web design and_html_part_3
Web design and_html_part_3Web design and_html_part_3
Web design and_html_part_3Sayed Ahmed
 
Web design and_html_part_2
Web design and_html_part_2Web design and_html_part_2
Web design and_html_part_2Sayed Ahmed
 
Web design and_html
Web design and_htmlWeb design and_html
Web design and_htmlSayed Ahmed
 
Visual studio ide shortcuts
Visual studio ide shortcutsVisual studio ide shortcuts
Visual studio ide shortcutsSayed Ahmed
 
Unit tests in_symfony
Unit tests in_symfonyUnit tests in_symfony
Unit tests in_symfonySayed Ahmed
 
Telerik this is sayed
Telerik this is sayedTelerik this is sayed
Telerik this is sayedSayed Ahmed
 
System analysis and_design
System analysis and_designSystem analysis and_design
System analysis and_designSayed Ahmed
 
Story telling and_narrative
Story telling and_narrativeStory telling and_narrative
Story telling and_narrativeSayed Ahmed
 

 

 

 

Data warehouse design

  • 1. SOFTWARE/WEB/MOBILE/DATABASE ARCHITECT, ENGINEER, AND DEVELOPER TORONTO, CANADA HTTP://SAYED.JUSTETC.NET HTTP://WWW.JUSTETC.NET Sayed Ahmed Logical Design of a Data Warehouse
  • 2. OUR SERVICES  Free Training and Educational Services  Training and Education in Bangla:  Bangla.SaLearningSchool.com  Training and Education in English:  www.SaLearningSchool.com  English.SaLearningSchool.com  Ask a question and get answers:  Ask.JustEtc.net
  • 3. TOPICS - KEYWORDS  Design a Data Warehouse  Star Schema  Snow Flake Schema  Dimension Tables  Fact Tables  Auditing  Surrogate Keys  Type 1, Type 2, Type 3, and Mixed solutions for slowly changing dimension data ( SCD management)  Pivoting for Analysis  To help with SSAS on data warehouse
  • 4. TOPICS - KEYWORDS  Design a Data Warehouse  Additive measures  Semi additive measures  Hierarchies for dimensions  Attributes in dimensions  Attributes in lookup tables  Long term data warehouse design  Usually Star Schema  Short term data warehouse design  POC  Usually snowflake schema
  • 5. TOPICS - KEYWORDS  Fact Tables  measures  foreign keys  and possibly an additional primary key  and lineage columns  granularity of fact tables  auditing and lineage needs  Measures can be  additive  non-additive  semi-additive
  • 6. TOPICS - KEYWORDS  dimension  keys  names  attributes  member properties  translations  and lineage
  • 7. TOPICS - KEYWORDS  attributes  natural hierarchies  many-to-many fact table relationships  you can introduce an additional intermediate dimension
  • 8. CONCLUSION  Not much, right?  However, if you understand all the terms and can implement all these concepts in your data warehouse  That will be great  You will not necessarily need to use all of these concepts; however, you may need to judge, based on the situation, whether all or any of them will help  What will help and what will not help  Check our subsequent videos and tutorials
  • 9. THANK YOU  Any Concerns?  http://ask.justetc.net  Or comment below...
  • 10. TOOLS AND SOFTWARE REQUIREMENTS  Download the Adventure Works databases  OLTP database (LOB database)  Data warehouse database  From  http://msftdbprodsamples.codeplex.com/releases/view/55330  For this tutorial, you can just follow our slides  Though the following tools will help  You may also want to check the details in the downloaded databases, especially AdventureWorksDW2012  You will need SQL Server and SQL Server Management Studio
  • 11. REQUIRED TOOLS  Useful/Required SQL Server Components  Database Engine Services  Documentation Components  Management Tools - Basic  Management Tools – Complete  SQL Server Data Tools
  • 12. DATA WAREHOUSE DESIGN – THE DETAILS  Data Warehouse Logical Design  Topics: Design and Implement a Data Warehouse  Design and implement dimensions  Design and implement fact tables  Design auditing  Track the source and time of data coming into a DW through auditing, i.e. lineage information  Why a Data Warehouse?  With an OLTP/LOB/transactional database it is hard to  Generate reports  Do analysis on the data (sometimes)  Get useful summarized and detailed data for making business decisions
  • 13. DATA WAREHOUSE DESIGN – THE DETAILS  Why a Data Warehouse?  Data in OLTP are heavily normalized. The goal is to keep each piece of data in one single place, to reduce redundancy and improve consistency  You may end up with many tables: 100s or 1000s  To generate reports you may need to join many tables, which will be slow  Historical data may not be there  Data quality is also an issue  For reporting or analysis, you may need data from multiple databases across many departments
  • 14. WHY A DATA WAREHOUSE?  So you can create a Data Warehouse  By cleaning data  With historical data  Combining data from multiple sources  Denormalizing data  Using a design geared specifically towards data warehousing  Some consider DW design less complex than relational database design  Though, even in their view, it has some complex areas to address
  • 15. SO WHAT DOES A DATA WAREHOUSE CONTAIN?  Usually two schemas are used for a DW  Star Schema: looks like a star  Snowflake Schema  Another term, Dimensional Model  Covers a Data Warehouse that includes both Star and Snowflake schemas  Both schemas have tables of two types  Dimension Tables  Fact Tables
  • 16. SO WHAT DOES A DATA WAREHOUSE CONTAIN?  Fact Tables are in the center  A Fact table joins/combines all the data required for a report or for the business aspect behind that report  It usually combines the primary keys of the different tables that contain data for this report/business aspect  Dimension tables are all the other tables, the ones that contain the actual data  These can be the actual tables in the OLTP database without any modification (Snowflake)  Or Dimension tables can be newly created by denormalizing the existing OLTP tables (Star)
  • 17. SO WHAT DOES A DATA WAREHOUSE CONTAIN?  So now you know what dimension tables and fact tables are  Fact tables contain the primary keys of all related tables (here they are foreign keys)  Dimension tables contain data  Usually, it is better to keep your data warehouse separate from your OLTP database  So bring all the (dimension) tables into the new database  Or denormalize them and bring them into the new database
  • 18. SIMPLIFIED: WHAT ARE STAR AND SNOWFLAKE SCHEMAS  If you just create Fact tables and take all the related tables from your OLTP/LOB databases  You get a Snowflake Schema  Here all Dimension tables are still normalized (as you just took them from the actual database)  This is easy  So good for a short-term, quick, experimental Data Warehouse  One note: your reporting and analysis services queries (MDX, DMV) will tend to be slower with Snowflake Schemas
  • 19. SO WHAT DOES A DATA WAREHOUSE CONTAIN?  Now, when you denormalize the dimension tables  You get the Star Schema  The Fact tables remain the same  Star Schema is a kind of standard and is used a lot  It was originally developed in the 1980s
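
The star layout described above can be sketched in a few lines. This is only an illustration using Python's built-in sqlite3 module, not SQL Server; the table and column names (DimDate, DimCountry, FactInternetSales) and the sample rows are assumptions made up for the example, loosely modeled on the internet sales scenario in the slides.

```python
import sqlite3

# In-memory database; every table, column, and row here is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimDate (DateKey INTEGER PRIMARY KEY, CalendarYear INTEGER);
CREATE TABLE DimCountry (CountryKey INTEGER PRIMARY KEY, CountryName TEXT);

-- The fact table sits in the center: foreign keys to dimensions plus a measure.
CREATE TABLE FactInternetSales (
    DateKey INTEGER REFERENCES DimDate(DateKey),
    CountryKey INTEGER REFERENCES DimCountry(CountryKey),
    SalesAmount REAL
);

INSERT INTO DimDate VALUES (20120101, 2012), (20130101, 2013);
INSERT INTO DimCountry VALUES (1, 'Canada'), (2, 'Germany');
INSERT INTO FactInternetSales VALUES
    (20120101, 1, 100.0), (20120101, 2, 50.0),
    (20130101, 1, 70.0), (20130101, 1, 30.0);
""")

# One join per dimension is enough to report sales by country and year.
rows = conn.execute("""
    SELECT c.CountryName, d.CalendarYear, SUM(f.SalesAmount)
    FROM FactInternetSales f
    JOIN DimDate d ON d.DateKey = f.DateKey
    JOIN DimCountry c ON c.CountryKey = f.CountryKey
    GROUP BY c.CountryName, d.CalendarYear
    ORDER BY c.CountryName, d.CalendarYear
""").fetchall()

for row in rows:
    print(row)
```

Because each dimension is a single denormalized table, the report needs one join per dimension, which is the main reason star schema queries tend to be simpler than their snowflake equivalents.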
  • 20. EXAMPLES: WHY REPORTING IN OLTP DATABASE IS NOT A GREAT IDEA Sales amount for internet sales by different countries and historical years
  • 21. WHY REPORTING IN OLTP DATABASE IS NOT A GREAT IDEA  Issues that I did not mention before  Even if your OLTP database was well designed (?)  It may be hard to find the tables related to the reporting  The table names and the column names can be tricky: they may not follow any conventions or carry any meaning  So it can be hard to find the data for the reporting
  • 22. WHY REPORTING IN OLTP DATABASE IS NOT A GREAT IDEA  Note: Reality:  The OLTP database may not even be well designed, including its relationships and normalization (that makes reporting hard sometimes)  Here we assumed that the OLTP database is perfect  In a project long ago  I had to rewrite, verify, change, and optimize close to 100 queries for a reporting system  Had to change the interface from one button per report (easy to get lost)  Into a drop-down list of reports  The relations among the data were arbitrary and existed only in the mind of the designer: they followed no standards, no ER model, no standard concepts  So it was a hard job  Anyway...
  • 23. WHY REPORTING IN OLTP DATABASE IS NOT A GREAT IDEA  In such cases  Tools such as SQL Profiler might help  You could create a test environment  Try to insert some data through an LOB application  Have SQL Profiler identify where the data was inserted  Another issue with this particular example  No lookup for dates and years  You need to extract them  The tables may not even contain historical data  No date field  So no historical data
  • 24. WHY REPORTING IN OLTP DATABASE IS NOT A GREAT IDEA  If sales data reside in multiple databases, possibly owned by multiple departments  How do you merge  Identify and match  Customer data can be in different databases with no common identification  Data quality can be low  Data missing  Partial data  Inconsistent data in multiple databases  Data can be represented differently in different databases  M or F for gender  1 or 0 for gender
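
As a small illustration of the representation problem above, the cleaning step of a load might map every source encoding onto one agreed value. This is a minimal sketch; the function name `normalize_gender` is hypothetical, and the mapping of 1 to male and 0 to female is an assumption that would have to be confirmed against each real source system.

```python
# Assumption: the source using 1/0 encodes 1 = male and 0 = female.
# Verify this against the actual source system before relying on it.
GENDER_MAP = {"M": "M", "F": "F", "1": "M", "0": "F"}


def normalize_gender(value):
    """Map a source-specific gender code to 'M', 'F', or 'Unknown'."""
    if value is None:
        return "Unknown"
    return GENDER_MAP.get(str(value).strip().upper(), "Unknown")


# Rows from two hypothetical source databases with different encodings.
source_a = ["M", "f", None]
source_b = [1, 0, "9"]
cleaned = [normalize_gender(v) for v in source_a + source_b]
print(cleaned)
```

Unrecognized codes are mapped to "Unknown" rather than guessed, so missing or partial data stays visible in the warehouse.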
  • 26. TOTAL DW: MULTIPLE STAR SCHEMAS  You saw one Star Schema for Internet Sales  You can see another for Offline Sales  Another for Accounting  Your DW has many such Star Schemas  And these Star Schemas need to be connected/related  They will be connected when you use the same dimensions for them  i.e. if two Star Schemas have the same dimension, they can share that dimension  Called: shared or conformed dimensions  For SSAS, you can use shared dimensions only  There is also a concept of a private dimension  Not a great idea in practical, real-life applications  You cannot connect/compare/verify the data across Star Schemas the way you can over a shared dimension
  • 29. SNOW FLAKES WILL BE MORE AND MORE NORMALIZED  Everything can be normalized  Or only the first level can be normalized while the others are not
  • 30. NORMALIZED PRODUCT DIMENSION In the Star Schema, you could use these normalized product tables to get a (partial) Snowflake schema. You could use all normalized dimensions to get a full Snowflake
  • 31. SNOW FLAKE  In reality, you will see partial Snowflakes more often than full ones  Though, in reality, it is better to go for a Star Schema  Queries will be faster
  • 33. GRANULARITY  The number of Dimension Tables connected to a fact table  Dimension of a star schema  Cube = 3 dimensions  SSAS operates on/analyzes Cubes
  • 34. AUDITING AND LINEAGE  I will be brief on this  In a data warehouse, you may want some auditing tables  For every update, you should audit  who made the update,  when it was made,  and how many rows were transferred  to each dimension and  fact table  in your DW
  • 35. AUDITING AND LINEAGE  You will need additional fields/columns in your dimension and fact tables to track  When, by whom, and from where the row data was updated  Your ETL process needs to be updated  If you used SSIS for the ETL  Modify the SSIS packages so that you can record this information
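
The auditing idea above can be sketched as follows. This is an illustration in Python with sqlite3, not an SSIS package; the names AuditLoad, LineageSource, LineageLoadedAt, LineageLoadedBy, and `load_customers` are all hypothetical choices for the example.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension table with extra lineage columns (hypothetical names).
CREATE TABLE DimCustomer (
    CustomerKey INTEGER PRIMARY KEY,
    FullName TEXT,
    LineageSource TEXT,     -- where the row came from
    LineageLoadedAt TEXT,   -- when it was loaded
    LineageLoadedBy TEXT    -- who or what loaded it
);
-- One audit row per load: who, when, and how many rows were transferred.
CREATE TABLE AuditLoad (
    TableName TEXT, LoadedBy TEXT, LoadedAt TEXT, RowsTransferred INTEGER
);
""")


def load_customers(conn, names, source, user):
    """Insert customers with lineage values and record the load in AuditLoad."""
    loaded_at = datetime.now(timezone.utc).isoformat()
    conn.executemany(
        "INSERT INTO DimCustomer "
        "(FullName, LineageSource, LineageLoadedAt, LineageLoadedBy) "
        "VALUES (?, ?, ?, ?)",
        [(name, source, loaded_at, user) for name in names],
    )
    conn.execute(
        "INSERT INTO AuditLoad VALUES (?, ?, ?, ?)",
        ("DimCustomer", user, loaded_at, len(names)),
    )
    conn.commit()


load_customers(conn, ["Ann Lee", "Raj Patel"], source="CRM", user="etl_service")
print(conn.execute("SELECT * FROM AuditLoad").fetchall())
```

Keeping the lineage columns on each row and the summary in a separate audit table answers both questions: where a single row came from, and what each load run did overall.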
  • 36. THANK YOU  Any Concerns?  http://ask.justetc.net  Or comment below...
  • 37. DESIGNING DIMENSIONS  Keys: used to identify entities  Name columns: used for human names of entities  Attributes: used for pivoting in analyses  Member properties: used for labels in a report  Lineage columns: used for auditing, and never exposed to end users
  • 38.  For analysis  Pivot Table  Pivot Graph  For Dimensions  The fields used for pivoting are called  Attributes  Not all columns are attributes  Attributes: the columns on which analysis is based  In the previous slide, you saw the different types of columns
  • 39.  Attributes  For pivoting, discrete attributes with a small number of distinct values are most appropriate  They should not be continuous  Keys are not good candidates for pivoting and analysis  To use a continuous column for pivoting  Convert it into a small set of discrete values
  • 40.  SSAS can discretize continuous attributes.  Not always great: a business perspective is needed as well  Age and Income are not good candidates for automatic discretization  Naming columns identify the entity  Not good for pivoting or as keys  Address such as  Columns used in reports as labels only, not for pivoting, are called member properties.  Can include translations
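
Converting a continuous column into a small set of discrete values might look like the sketch below. The band boundaries and the name `age_band` are assumptions chosen for illustration; in practice the boundaries should come from the business.

```python
from collections import Counter


def age_band(age):
    """Map a continuous age to one of a few discrete bands (assumed boundaries)."""
    if age < 0:
        raise ValueError("age cannot be negative")
    if age < 18:
        return "Under 18"
    if age < 35:
        return "18-34"
    if age < 55:
        return "35-54"
    return "55+"


# Hypothetical customer ages; the banded value is what you would pivot on.
ages = [17, 22, 34, 35, 60, 71]
counts = Counter(age_band(a) for a in ages)
print(counts)
```

Pivoting on four bands gives a readable report, whereas pivoting on the raw age would produce one column per distinct value.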
  • 41.  Lineage and auditing columns  Used for auditing data  Never exposed to the users
  • 43.  Possible Attributes  BirthDate (after calculating age and discretizing the age)  MaritalStatus  Gender  YearlyIncome (after discretizing)  TotalChildren  NumberChildrenAtHome  EnglishEducation (other education columns are for translations)  EnglishOccupation (other occupation columns are for translations)  HouseOwnerFlag  NumberCarsOwned  CommuteDistance
  • 45.  FullDateAlternateKey (denotes a date in date format)  EnglishMonthName  CalendarQuarter  CalendarSemester  CalendarYear  Drill Down attributes  CalendarYear → CalendarSemester → CalendarQuarter → EnglishMonthName → FullDateAlternateKey
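
Drilling down through this hierarchy amounts to grouping by one more level each time. The sketch below uses Python's sqlite3 with a tiny made-up DimDate and FactSales; the column names follow the slide, while the helper `sales_by`, the FactSales table, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimDate (
    DateKey INTEGER PRIMARY KEY,
    CalendarYear INTEGER,
    CalendarSemester INTEGER,
    CalendarQuarter INTEGER,
    EnglishMonthName TEXT
);
CREATE TABLE FactSales (DateKey INTEGER, SalesAmount REAL);
INSERT INTO DimDate VALUES
    (20120115, 2012, 1, 1, 'January'),
    (20120210, 2012, 1, 1, 'February'),
    (20120705, 2012, 2, 3, 'July');
INSERT INTO FactSales VALUES (20120115, 10.0), (20120210, 20.0), (20120705, 5.0);
""")

# Levels of the natural hierarchy, from coarsest to finest.
LEVELS = ("CalendarYear", "CalendarSemester", "CalendarQuarter", "EnglishMonthName")


def sales_by(conn, depth):
    """Total sales grouped by the first `depth` levels of the hierarchy."""
    cols = ", ".join(f"d.{name}" for name in LEVELS[:depth])
    sql = (
        f"SELECT {cols}, SUM(f.SalesAmount) "
        "FROM FactSales f JOIN DimDate d ON d.DateKey = f.DateKey "
        f"GROUP BY {cols} ORDER BY MIN(d.DateKey)"
    )
    return conn.execute(sql).fetchall()


print(sales_by(conn, 1))  # year level
print(sales_by(conn, 2))  # drill down to semester
print(sales_by(conn, 3))  # drill down to quarter
```

The column names are taken from a fixed tuple rather than from user input, which is why building the SQL string with an f-string is acceptable in this sketch.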
  • 46.  why dimension columns used in reports for labels are called member properties.  In a Snowflake schema, lookup tables show you levels of hierarchies. In a Star schema, you need to extract natural hierarchies from the names and content of columns. Nevertheless, because drilling down through natural hierarchies is so useful and welcomed by end users, you should use them as much as possible.
  • 47. SLOWLY CHANGING DIMENSIONS  Type 1  History lost  Type 2  Keeps all history  Type 3  Keeps partial history  You can use a combination  For some columns Type 1, for others Type 2
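
The difference between Type 1 and Type 2 can be sketched as two update routines. This is one common way to implement them, shown with Python's sqlite3; the columns ValidFrom, ValidTo, and IsCurrent and the function names are assumptions for the example, not something the slides prescribe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimCustomer (
    CustomerKey INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
    CustomerId TEXT,                                -- business key from the source
    City TEXT,
    ValidFrom TEXT,
    ValidTo TEXT,
    IsCurrent INTEGER
);
INSERT INTO DimCustomer (CustomerId, City, ValidFrom, ValidTo, IsCurrent)
VALUES ('C1', 'Toronto', '2010-01-01', NULL, 1);
""")


def scd_type1(conn, customer_id, new_city):
    """Type 1: overwrite the current row in place, so the old value is lost."""
    conn.execute(
        "UPDATE DimCustomer SET City = ? WHERE CustomerId = ? AND IsCurrent = 1",
        (new_city, customer_id),
    )


def scd_type2(conn, customer_id, new_city, change_date):
    """Type 2: close the current row and insert a new one, keeping all history."""
    conn.execute(
        "UPDATE DimCustomer SET ValidTo = ?, IsCurrent = 0 "
        "WHERE CustomerId = ? AND IsCurrent = 1",
        (change_date, customer_id),
    )
    conn.execute(
        "INSERT INTO DimCustomer (CustomerId, City, ValidFrom, ValidTo, IsCurrent) "
        "VALUES (?, ?, ?, NULL, 1)",
        (customer_id, new_city, change_date),
    )


scd_type2(conn, "C1", "Ottawa", "2012-06-01")
history = conn.execute(
    "SELECT City, IsCurrent FROM DimCustomer "
    "WHERE CustomerId = 'C1' ORDER BY CustomerKey"
).fetchall()
print(history)
```

Type 2 is the reason a surrogate key is needed: the same business key (CustomerId) now appears on several rows, and fact rows must point at the specific version that was current when the fact occurred.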
  • 52. DESIGNING FACT TABLES  Fact tables include measures, foreign keys, and possibly an additional primary key and lineage columns.  Measures can be additive, non-additive, or semi-additive.  For many-to-many relationships, you can introduce an additional intermediate dimension.
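
A short illustration of additive versus semi-additive measures: an additive measure such as a sales amount can be summed over every dimension, while a semi-additive measure such as an account balance can be summed across accounts but not across time. The data and variable names below are made up for the example.

```python
# Additive: sales amounts can be summed across months (and any other dimension).
sales = [("2012-01", 100.0), ("2012-02", 150.0)]
total_sales = sum(amount for _, amount in sales)

# Semi-additive: balances can be summed across accounts, but not across time.
# Each tuple is (account, month, month-end balance); all values are hypothetical.
balances = [
    ("A", "2012-01", 500.0), ("A", "2012-02", 520.0),
    ("B", "2012-01", 200.0), ("B", "2012-02", 180.0),
]

# Across time, take the last period instead of summing over all periods.
last_month = max(month for _, month, _ in balances)
closing_balance = sum(bal for _, month, bal in balances if month == last_month)

# Summing over every row double counts money that merely stayed in the account.
naive_sum = sum(bal for _, _, bal in balances)

print(total_sales, closing_balance, naive_sum)
```

Here the meaningful total balance is the sum for the last month only; the naive sum over all rows is larger and has no business meaning.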
  • 53.  Fact tables  Collection of measurements on a specific aspect of the business  Measure columns  Sales amount, order quantity, and discount amount