Data Warehousing
De-normalization
1
Ch Anwar ul Hassan (Lecturer)
Department of Computer Science and Software
Engineering
Capital University of Sciences & Technology, Islamabad
Pakistan
anwarchaudary@gmail.com
2
Striking a balance between “good” & “evil”
[Figure: spectrum from de-normalization (one big flat file) to normalization (too many tables)]
Flat Table, Data Lists, Data Cubes, 1st Normal Form, 2nd Normal Form, 3rd Normal Form, 4+ Normal Forms
3
What is De-normalization?
 It is not chaos, more like a “controlled crash”
with the aim of performance enhancement
without loss of information.
 Normalization is a rule of thumb in DBMS,
but in DSS ease of use is achieved by way of
denormalization.
 De-normalization comes in many flavors,
such as combining tables, splitting tables,
adding data etc., but all done very carefully.
4
Why De-normalization In DSS?
 Bringing “close” dispersed but related data
items.
 Query performance in DSS depends significantly
on the physical data model.
 Very early studies showed performance
differences of orders of magnitude for different
numbers of de-normalized tables and rows per table.
 The level of de-normalization should be
carefully considered.
5
How does De-normalization improve performance?
De-normalization improves performance in one or
more of the following ways:
 Reducing the number of tables and hence the
reliance on joins.
 Reducing the number of joins required during
query execution.
 Reducing the number of rows to be retrieved from
the Primary Data Table.
6
4 Guidelines for De-normalization
1. Carefully do a cost-benefit analysis
(frequency of use, additional storage,
join time).
2. Do a data requirement and storage
analysis.
3. Weigh the benefit against the maintenance cost
of the redundant data (e.g. the triggers used to keep it consistent).
4. When in doubt, don’t denormalize.
7
Areas for Applying De-Normalization Techniques
 Dealing with the abundance of star schemas.
 Fast access of time series data for analysis.
 Fast aggregate (sum, average etc.) results and
complicated calculations.
 Multidimensional analysis (e.g. geography) in a complex
hierarchy.
 Dealing with few updates but many join queries.
De-normalization will ultimately affect the database size and
query performance.
8
Star Schema
 In a star schema, the center of the star is one fact table, surrounded by a
number of associated dimension tables. It is called a star schema
because its structure resembles a star. The star schema is the simplest
type of Data Warehouse schema. It is also known as a Star Join
Schema and is optimized for querying large data sets.
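A minimal sketch of a star schema, assuming Python's built-in sqlite3 module. The table names (fact_sales, dim_date, dim_product) and all rows are hypothetical and are not taken from the lecture. One fact table references two dimension tables, and a typical query joins the fact table to each dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables (hypothetical names and rows).
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
-- The fact table sits at the center and references each dimension.
CREATE TABLE fact_sales (
    date_id    INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL
);
INSERT INTO dim_date VALUES (1, 2023), (2, 2024);
INSERT INTO dim_product VALUES (10, 'pen'), (20, 'book');
INSERT INTO fact_sales VALUES (1, 10, 5.0), (2, 10, 7.0), (2, 20, 30.0);
""")

# A "star join": the fact table joined to each of its dimensions.
star_rows = conn.execute("""
    SELECT d.year, p.name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_date d    ON d.date_id = f.date_id
    JOIN dim_product p ON p.product_id = f.product_id
    GROUP BY d.year, p.name
    ORDER BY d.year, p.name
""").fetchall()
print(star_rows)
```

Every dimension is one join away from the fact table, which is what keeps such queries simple.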
9
Snowflake Schema
 A Snowflake Schema is an extension of a Star Schema
in which dimension tables are normalized into additional, related tables.
It is called a snowflake schema because its diagram resembles a snowflake.
10
Five principal De-normalization techniques
1. Collapsing Tables.
- Two entities with a One-to-One relationship.
- Two entities with a Many-to-Many relationship.
2. Splitting Tables (Horizontal/Vertical Splitting).
3. Pre-Joining.
4. Adding Redundant Columns (Reference Data).
To eliminate joins for many queries
5. Derived Attributes (Age, Total, Balance etc).
11
De-normalization Techniques
12
Collapsing Tables
normalized: (ColA, ColB) and (ColA, ColC)
denormalized: (ColA, ColB, ColC)
 Reduced storage space.
 Reduced update time.
 Does not change the business view.
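A minimal sketch of collapsing two tables that share a one-to-one key, assuming Python's built-in sqlite3 module. The column names follow the slide; the table names (t_b, t_c, t_collapsed) and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized: two tables sharing the key ColA (hypothetical rows).
CREATE TABLE t_b (ColA INTEGER PRIMARY KEY, ColB TEXT);
CREATE TABLE t_c (ColA INTEGER PRIMARY KEY, ColC TEXT);
INSERT INTO t_b VALUES (1, 'b1'), (2, 'b2');
INSERT INTO t_c VALUES (1, 'c1'), (2, 'c2');

-- Denormalized: collapse the one-to-one pair into a single table.
CREATE TABLE t_collapsed AS
    SELECT b.ColA, b.ColB, c.ColC
    FROM t_b b JOIN t_c c ON c.ColA = b.ColA;
""")

collapsed = conn.execute(
    "SELECT ColA, ColB, ColC FROM t_collapsed ORDER BY ColA"
).fetchall()
print(collapsed)
```

After the collapse, ColA is stored once per row instead of once in each table, and queries that need both ColB and ColC no longer join.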
13
Splitting Tables
Table: ColA, ColB, ColC
Vertical split: Table_v1 (ColA, ColB) and Table_v2 (ColA, ColC)
Horizontal split: Table_h1 (ColA, ColB, ColC) and Table_h2 (ColA, ColB, ColC)
14
Splitting Tables: Horizontal splitting…
Breaks a table into multiple tables based upon
common column values. Example: Campus specific
queries.
GOAL
 Spreading rows for exploiting parallelism.
 Grouping data to avoid unnecessary query load in
WHERE clause.
15
Splitting Tables: Horizontal splitting
ADVANTAGES
 Enhanced security of data.
 Organizing tables differently for different queries.
 Graceful degradation of database in case of table
damage.
 Fast data retrieval.
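A minimal sketch of horizontal splitting for campus-specific queries, assuming Python's built-in sqlite3 module. The student table, its columns and the campus values are hypothetical. Each split table holds only the rows for one campus, so a campus-specific query scans fewer rows and needs no campus filter in its WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (sid INTEGER PRIMARY KEY, campus TEXT, name TEXT);
INSERT INTO student VALUES
    (1, 'Islamabad', 'A'), (2, 'Lahore', 'B'), (3, 'Islamabad', 'C');
""")

# Collect the distinct split values first, then build one table per value.
campuses = [row[0] for row in
            conn.execute("SELECT DISTINCT campus FROM student ORDER BY campus")]
for campus in campuses:
    # Table names are built from trusted toy data; do not do this with user input.
    table = "student_" + campus.lower()
    conn.execute(
        f"CREATE TABLE {table} (sid INTEGER PRIMARY KEY, campus TEXT, name TEXT)")
    conn.execute(
        f"INSERT INTO {table} SELECT sid, campus, name FROM student WHERE campus = ?",
        (campus,))

# A campus-specific query now touches only that campus's rows.
islamabad = conn.execute(
    "SELECT sid, name FROM student_islamabad ORDER BY sid").fetchall()
lahore = conn.execute(
    "SELECT sid, name FROM student_lahore ORDER BY sid").fetchall()
print(islamabad, lahore)
```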
16
Splitting Tables: Vertical Splitting
 Infrequently accessed columns become extra
“baggage”, thus degrading performance.
 Very useful for rarely accessed large text columns
with large headers.
 Header size is reduced, allowing more rows per
block, thus reducing I/O.
 Splitting and distributing into separate files with
repeating primary key.
 For an end user, the split appears as a single table
through a view.
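A minimal sketch of a vertical split, assuming Python's built-in sqlite3 module. The names doc_hot, doc_cold and doc, and all rows, are hypothetical. The rarely accessed large text column moves to its own table with the primary key repeated, and a view presents the two parts as a single table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Frequently used columns stay in the narrow table.
CREATE TABLE doc_hot  (doc_id INTEGER PRIMARY KEY, title TEXT);
-- The rarely used large text column moves out, repeating the primary key.
CREATE TABLE doc_cold (doc_id INTEGER PRIMARY KEY, body TEXT);
INSERT INTO doc_hot  VALUES (1, 'intro'), (2, 'summary');
INSERT INTO doc_cold VALUES (1, 'long text one'), (2, 'long text two');

-- The view presents the split as a single table to end users.
CREATE VIEW doc AS
    SELECT h.doc_id, h.title, c.body
    FROM doc_hot h JOIN doc_cold c ON c.doc_id = h.doc_id;
""")

# Common queries read only the narrow table.
titles = conn.execute("SELECT title FROM doc_hot ORDER BY doc_id").fetchall()
# Occasional queries that need every column go through the view.
full = conn.execute(
    "SELECT doc_id, title, body FROM doc ORDER BY doc_id").fetchall()
print(titles, full)
```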
17
Pre-joining …
 Identify frequent joins and append the tables
together in the physical data model.
 Generally used for 1:M relationships such as
master-detail.
 Additional space is required as the master
information is repeated in the new header
table.
18
Pre-Joining…
normalized
Master: Sale_ID, Sale_date, Sale_person
Detail: Tx_ID, Sale_ID, Item_ID, Item_Qty, Sale_Rs
Master to Detail is 1:M
denormalized
Tx_ID, Sale_ID, Item_ID, Item_Qty, Sale_Rs, Sale_date, Sale_person
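A minimal sketch of pre-joining the master and detail tables, assuming Python's built-in sqlite3 module. The column names follow the slide; the table names (sale_master, sale_detail, sale_prejoined) and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sale_master (Sale_ID INTEGER PRIMARY KEY,
                          Sale_date TEXT, Sale_person TEXT);
CREATE TABLE sale_detail (Tx_ID INTEGER PRIMARY KEY, Sale_ID INTEGER,
                          Item_ID INTEGER, Item_Qty INTEGER, Sale_Rs REAL);
INSERT INTO sale_master VALUES (1, '2024-01-05', 'Ali'), (2, '2024-01-06', 'Sara');
INSERT INTO sale_detail VALUES
    (100, 1, 7, 2, 50.0), (101, 1, 8, 1, 20.0), (102, 2, 7, 3, 75.0);

-- Pre-join: store the result of the frequent master-detail join physically.
CREATE TABLE sale_prejoined AS
    SELECT d.Tx_ID, d.Sale_ID, d.Item_ID, d.Item_Qty, d.Sale_Rs,
           m.Sale_date, m.Sale_person
    FROM sale_detail d JOIN sale_master m ON m.Sale_ID = d.Sale_ID;
""")

# Master columns are now available without a join, repeated per detail row.
prejoined = conn.execute(
    "SELECT Tx_ID, Sale_date, Sale_person FROM sale_prejoined ORDER BY Tx_ID"
).fetchall()
print(prejoined)
```

The master values for Sale_ID 1 appear twice in the result, which is the repetition that costs the additional space.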
19
Pre-Joining: Typical Scenario
 Typical of market basket queries
 Join ALWAYS required
 Tables could have millions of rows
 Squeeze Master into Detail
 Repetition of facts. How much?
 Detail is 3-4 times the size of master
Adding Redundant Columns…
Before: Table_1 (ColA, ColB) and Table_2 (ColA, ColC, ColD, …, ColZ)
After: Table_1’ (ColA, ColB, ColC) and Table_2 (ColA, ColC, ColD, …, ColZ)
Adding Redundant Columns…
Columns can also be moved, instead of making them
redundant. Very similar to pre-joining as discussed
earlier.
EXAMPLE
Frequent referencing of code in one table and
corresponding description in another table.
 A join is required.
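A minimal sketch of the code and description example, assuming Python's built-in sqlite3 module. The tables course_ref and enrollment and their rows are hypothetical. The description is copied into the referencing table as a redundant column, so the frequent lookup no longer needs a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Reference table: code and its description (hypothetical rows).
CREATE TABLE course_ref (code TEXT PRIMARY KEY, description TEXT);
-- Referencing table: stores only the code, so a join is needed for the text.
CREATE TABLE enrollment (sid INTEGER, code TEXT);
INSERT INTO course_ref VALUES ('CS101', 'Programming'), ('CS340', 'Databases');
INSERT INTO enrollment VALUES (1, 'CS101'), (2, 'CS340'), (3, 'CS101');

-- Add the description as a redundant column and fill it from the reference table.
ALTER TABLE enrollment ADD COLUMN description TEXT;
UPDATE enrollment
SET description = (SELECT r.description
                   FROM course_ref r
                   WHERE r.code = enrollment.code);
""")

# The description is now read without joining to course_ref.
no_join = conn.execute(
    "SELECT sid, code, description FROM enrollment ORDER BY sid").fetchall()
print(no_join)
```

The reference table is kept, so any later change to a description must also be applied to the redundant copies.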
22
Derived Attributes: Example
Age is also a derived attribute, calculated as Current_Date
– DoB (calculated periodically).
GP (Grade Point) column in the data warehouse data
model is included as a derived value. The formula for
calculating this field is Grade*Credits.
Business Data Model: #SID, DoB, Degree, Course, Grade, Credits
DWH Data Model: #SID, DoB, Degree, Course, Grade, Credits, GP, Age
Derived attributes
 Calculated once
 Used frequently
DoB: Date of Birth
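A minimal sketch of storing the derived attributes at load time, assuming Python's built-in sqlite3 and datetime modules. The column names follow the slide. The table name student_dwh, the helper age_in_years, the load date and the rows are hypothetical, and Grade is assumed to be numeric.

```python
import sqlite3
from datetime import date

def age_in_years(dob: date, today: date) -> int:
    """Whole years between dob and today (Current_Date - DoB)."""
    years = today.year - dob.year
    # Subtract one year if the birthday has not yet occurred this year.
    if (today.month, today.day) < (dob.month, dob.day):
        years -= 1
    return years

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE student_dwh (
    SID INTEGER, DoB TEXT, Degree TEXT, Course TEXT,
    Grade REAL, Credits INTEGER, GP REAL, Age INTEGER)""")

as_of = date(2024, 6, 1)  # hypothetical load date
source_rows = [           # hypothetical business-model rows
    (1, "2000-03-15", "BS", "DWH", 3.5, 3),
    (2, "1999-12-01", "BS", "DB", 4.0, 4),
]
for sid, dob, degree, course, grade, credits in source_rows:
    gp = grade * credits                                  # GP = Grade*Credits
    age = age_in_years(date.fromisoformat(dob), as_of)    # recomputed periodically
    conn.execute("INSERT INTO student_dwh VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
                 (sid, dob, degree, course, grade, credits, gp, age))

# Queries read the stored values instead of recalculating them.
derived = conn.execute(
    "SELECT SID, GP, Age FROM student_dwh ORDER BY SID").fetchall()
print(derived)
```

GP never changes once the grade is final, while Age drifts over time and would need a periodic refresh, as the slide notes.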

Intro to Data warehousing Lecture 04

  • 1. Data Warehousing De-normalization 1 Ch Anwar ul Hassan (Lecturer) Department of Computer Science and Software Engineering Capital University of Sciences & Technology, Islamabad Pakistan anwarchaudary@gmail.com
  • 2. 2 Striking a balance between “good” & “evil” Flat Table Data Lists Data Cubes 1st Normal Form 2nd Normal Form 3rd Normal Form 4+ Normal Forms NormalizationDe-normalization One big flat file Too many tables
  • 3. 3 What is De-normalization?  It is not chaos, more like a “controlled crash” with the aim of performance enhancement without loss of information.  Normalization is a rule of thumb in DBMS, but in DSS ease of use is achieved by way of denormalization.  De-normalization comes in many flavors, such as combining tables, splitting tables, adding data etc., but all done very carefully.
  • 4. 4 Why De-normalization In DSS?  Bringing “close” dispersed but related data items.  Query performance in DSS significantly dependent on physical data model.  Very early studies showed performance difference in orders of magnitude for different number de-normalized tables and rows per table.  The level of de-normalization should be carefully considered.
  • 5. How does De-normalization improve performance?
      De-normalization specifically improves performance by either:
      - Reducing the number of tables and hence the reliance on joins, which consequently speeds up performance.
      - Reducing the number of joins required during query execution.
      - Reducing the number of rows to be retrieved from the Primary Data Table.
  • 6. 4 Guidelines for De-normalization
      1. Carefully do a cost-benefit analysis (frequency of use, additional storage, join time).
      2. Do a data requirement and storage analysis.
      3. Weigh against the maintenance issue of the redundant data (triggers used).
      4. When in doubt, don’t de-normalize.
  • 7. Areas for Applying De-normalization Techniques
      - Dealing with the abundance of star schemas.
      - Fast access of time series data for analysis.
      - Fast aggregate (sum, average etc.) results and complicated calculations.
      - Multidimensional analysis (e.g. geography) in a complex hierarchy.
      - Dealing with few updates but many join queries.
      De-normalization will ultimately affect the database size and query performance.
  • 8. Star Schema
      - In a star schema, the center of the star holds one fact table, surrounded by a number of associated dimension tables.
      - It is known as a star schema because its structure resembles a star.
      - The star schema is the simplest type of data warehouse schema.
      - It is also known as a Star Join Schema and is optimized for querying large data sets.
  • 9. Snowflake Schema
      - A snowflake schema is an extension of a star schema that adds further dimension tables.
      - It is called a snowflake schema because its diagram resembles a snowflake.
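
The star schema described above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the table names (`fact_sales`, `dim_store`, `dim_item`), their columns, and the sample rows are hypothetical and do not come from the slides. It shows one central fact table joined to two dimension tables for an aggregate query.

```python
import sqlite3

# In-memory database; all table names and rows are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes.
    CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE dim_item  (item_id  INTEGER PRIMARY KEY, category TEXT);

    -- Fact table at the center of the star: keys plus measures.
    CREATE TABLE fact_sales (
        store_id INTEGER REFERENCES dim_store(store_id),
        item_id  INTEGER REFERENCES dim_item(item_id),
        qty      INTEGER,
        sale_rs  REAL
    );

    INSERT INTO dim_store VALUES (1, 'Islamabad'), (2, 'Lahore');
    INSERT INTO dim_item  VALUES (10, 'Grocery'), (11, 'Electronics');
    INSERT INTO fact_sales VALUES
        (1, 10, 2, 200.0), (1, 11, 1, 5000.0),
        (2, 10, 5, 500.0), (1, 10, 1, 100.0);
""")

# A typical star join: the fact table joined to each dimension, then aggregated.
star_rows = conn.execute("""
    SELECT s.city, i.category, SUM(f.sale_rs)
    FROM fact_sales f
    JOIN dim_store s ON s.store_id = f.store_id
    JOIN dim_item  i ON i.item_id  = f.item_id
    GROUP BY s.city, i.category
    ORDER BY s.city, i.category
""").fetchall()

for row in star_rows:
    print(row)
```

Every dimension is exactly one join away from the fact table. A snowflake schema would split a dimension such as `dim_store` into further tables, adding joins to the same query.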
  • 10. Five principal De-normalization techniques
      1. Collapsing Tables.
         - Two entities with a One-to-One relationship.
         - Two entities with a Many-to-Many relationship.
      2. Splitting Tables (Horizontal/Vertical Splitting).
      3. Pre-Joining.
      4. Adding Redundant Columns (Reference Data), to eliminate joins for many queries.
      5. Derived Attributes (Age, Total, Balance etc.).
  • 12. Collapsing Tables
      - Normalized: two tables, (ColA, ColB) and (ColA, ColC).
      - Denormalized: one table, (ColA, ColB, ColC).
      - Reduced storage space.
      - Reduced update time.
      - Does not change the business view.
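
A minimal sketch of collapsing two one-to-one tables, again using `sqlite3`. The column names follow the slide (ColA, ColB, ColC); the table names `t_ab`, `t_ac`, `t_abc` and the sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: two tables sharing the key col_a (one-to-one).
    CREATE TABLE t_ab (col_a INTEGER PRIMARY KEY, col_b TEXT);
    CREATE TABLE t_ac (col_a INTEGER PRIMARY KEY, col_c TEXT);
    INSERT INTO t_ab VALUES (1, 'b1'), (2, 'b2');
    INSERT INTO t_ac VALUES (1, 'c1'), (2, 'c2');

    -- Denormalized: collapse both into a single table.
    CREATE TABLE t_abc AS
        SELECT ab.col_a, ab.col_b, ac.col_c
        FROM t_ab ab JOIN t_ac ac ON ac.col_a = ab.col_a;
""")

# Before collapsing, every read of (col_b, col_c) needs a join.
joined = conn.execute("""
    SELECT ab.col_a, ab.col_b, ac.col_c
    FROM t_ab ab JOIN t_ac ac ON ac.col_a = ab.col_a
    ORDER BY ab.col_a
""").fetchall()

# After collapsing, the same rows come from a single-table scan.
collapsed = conn.execute(
    "SELECT col_a, col_b, col_c FROM t_abc ORDER BY col_a"
).fetchall()

print(joined == collapsed)
```

The key `col_a` is stored once per row instead of twice, which is where the storage saving listed on the slide comes from.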
  • 13. Splitting Tables
      - Original: Table (ColA, ColB, ColC).
      - Vertical split: Table_v1 (ColA, ColB) and Table_v2 (ColA, ColC).
      - Horizontal split: Table_h1 (ColA, ColB, ColC) and Table_h2 (ColA, ColB, ColC), each holding a subset of the rows.
  • 14. Splitting Tables: Horizontal splitting…
      Breaks a table into multiple tables based upon common column values. Example: campus-specific queries.
      Goals:
      - Spreading rows for exploiting parallelism.
      - Grouping data to avoid unnecessary query load in the WHERE clause.
  • 15. Splitting Tables: Horizontal splitting
      Advantages:
      - Enhances security of data.
      - Organizes tables differently for different queries.
      - Graceful degradation of the database in case of table damage.
      - Fast data retrieval.
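
Horizontal splitting can be sketched in plain Python. The helper `horizontal_split`, the `students` rows, and the campus names are hypothetical and chosen to mirror the "campus-specific queries" example; a real warehouse would do this with separate tables or partitions rather than in-memory lists.

```python
from collections import defaultdict


def horizontal_split(rows, column):
    """Group rows into partitions keyed by the value of `column`."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[column]].append(row)
    return dict(partitions)


# Hypothetical sample data: one logical table of students.
students = [
    {"sid": 1, "campus": "Islamabad", "degree": "BS"},
    {"sid": 2, "campus": "Lahore", "degree": "MS"},
    {"sid": 3, "campus": "Islamabad", "degree": "MS"},
]

by_campus = horizontal_split(students, "campus")

# A campus-specific query now scans only that campus's partition,
# and no longer needs a campus condition in its filter.
islamabad_ms = [r["sid"] for r in by_campus["Islamabad"] if r["degree"] == "MS"]
print(islamabad_ms)
```

Each partition keeps all columns and only a subset of the rows, so partitions can be scanned in parallel or secured separately.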
  • 16. Splitting Tables: Vertical Splitting
      - Infrequently accessed columns become extra “baggage”, thus degrading performance.
      - Very useful for rarely accessed large text columns with large headers.
      - Header size is reduced, allowing more rows per block, thus reducing I/O.
      - Splitting and distributing into separate files with repeating primary key.
      - For an end user, the split appears as a single table through a view.
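
The vertical split, including the view that hides it from the end user, might look like the following `sqlite3` sketch. The table names `product_hot` and `product_cold`, the view name `product`, and the data are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Frequently accessed columns stay in a narrow table.
    CREATE TABLE product_hot (
        pid INTEGER PRIMARY KEY, name TEXT, price REAL
    );
    -- The rarely accessed large text column moves out; the key repeats.
    CREATE TABLE product_cold (
        pid INTEGER PRIMARY KEY REFERENCES product_hot(pid),
        long_description TEXT
    );
    INSERT INTO product_hot  VALUES (1, 'Pen', 20.0), (2, 'Book', 450.0);
    INSERT INTO product_cold VALUES (1, 'A long marketing text about the pen.');

    -- The view presents the split as a single table to the end user.
    CREATE VIEW product AS
        SELECT h.pid, h.name, h.price, c.long_description
        FROM product_hot h
        LEFT JOIN product_cold c ON c.pid = h.pid;
""")

# Common queries touch only the narrow table.
hot_rows = conn.execute(
    "SELECT name, price FROM product_hot ORDER BY pid"
).fetchall()

# Occasional queries use the view and pay for the join only then.
full_row = conn.execute(
    "SELECT pid, name, long_description FROM product WHERE pid = 1"
).fetchone()

print(hot_rows)
print(full_row)
```

The `LEFT JOIN` in the view keeps rows that have no entry in the cold table, so the view still lists every product.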
  • 17. Pre-joining …
      - Identify frequent joins and append the tables together in the physical data model.
      - Generally used for 1:M relationships such as master-detail.
      - Additional space is required as the master information is repeated in the new header table.
  • 18. Pre-Joining…
      - Normalized: Master (Sale_ID, Sale_date, Sale_person) and Detail (Tx_ID, Sale_ID, Item_ID, Item_Qty, Sale_Rs), related 1:M.
      - Denormalized: one table (Tx_ID, Sale_ID, Item_ID, Item_Qty, Sale_Rs, Sale_date, Sale_person).
  • 19. Pre-Joining: Typical Scenario
      - Typical of a market basket query.
      - Join ALWAYS required.
      - Tables could be millions of rows.
      - Squeeze Master into Detail.
      - Repetition of facts. How much?
      - Detail is 3-4 times the size of the master.
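
The pre-joining slides can be illustrated with `sqlite3`, using the column names from the master-detail diagram (Tx_ID, Sale_ID, Item_ID, Item_Qty, Sale_Rs, Sale_date, Sale_person). The table names `sale_master`, `sale_detail`, `sale_prejoined` and the sample rows are assumptions; the sample uses three detail rows per master row to match the low end of the 3-4x ratio mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sale_master (
        Sale_ID INTEGER PRIMARY KEY, Sale_date TEXT, Sale_person TEXT
    );
    CREATE TABLE sale_detail (
        Tx_ID INTEGER PRIMARY KEY,
        Sale_ID INTEGER REFERENCES sale_master(Sale_ID),
        Item_ID INTEGER, Item_Qty INTEGER, Sale_Rs REAL
    );
    INSERT INTO sale_master VALUES
        (1, '2024-01-05', 'Ali'), (2, '2024-01-06', 'Sara');
    INSERT INTO sale_detail VALUES
        (100, 1, 10, 2, 200.0), (101, 1, 11, 1, 50.0), (102, 1, 12, 3, 90.0),
        (103, 2, 10, 1, 100.0), (104, 2, 13, 4, 400.0), (105, 2, 14, 1, 25.0);

    -- Pre-join: squeeze the master columns into every detail row.
    CREATE TABLE sale_prejoined AS
        SELECT d.Tx_ID, d.Sale_ID, d.Item_ID, d.Item_Qty, d.Sale_Rs,
               m.Sale_date, m.Sale_person
        FROM sale_detail d
        JOIN sale_master m ON m.Sale_ID = d.Sale_ID;
""")

# The query that previously always needed a join now reads one table.
per_person = conn.execute("""
    SELECT Sale_person, SUM(Sale_Rs)
    FROM sale_prejoined
    GROUP BY Sale_person
    ORDER BY Sale_person
""").fetchall()

# How often each master row is repeated in the pre-joined table.
master_rows = conn.execute("SELECT COUNT(*) FROM sale_master").fetchone()[0]
prejoined_rows = conn.execute("SELECT COUNT(*) FROM sale_prejoined").fetchone()[0]
repetition = prejoined_rows / master_rows

print(per_person, repetition)
```

The `repetition` factor is the cost side of the trade-off: each master row's `Sale_date` and `Sale_person` are stored once per detail row instead of once per sale.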
  • 20. Adding Redundant Columns…
      - Before: Table_1 (ColA, ColB) and Table_2 (ColA, ColC, ColD, …, ColZ).
      - After: Table_1’ (ColA, ColB, ColC) and Table_2 (ColA, ColC, ColD, …, ColZ); ColC is copied into Table_1’.
  • 21. Adding Redundant Columns…
      - Columns can also be moved, instead of making them redundant. Very similar to pre-joining as discussed earlier.
      - Example: frequent referencing of a code in one table and the corresponding description in another table. A join is required.
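
The code/description example can be sketched as follows. The tables `orders` and `status_ref`, the column `status_description`, and the data are hypothetical; the sketch copies the description into the referencing table so that the common query no longer needs the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reference data: code and its description.
    CREATE TABLE status_ref (code TEXT PRIMARY KEY, description TEXT);
    -- Referencing table stores only the code.
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        status_code TEXT REFERENCES status_ref(code)
    );
    INSERT INTO status_ref VALUES ('S', 'Shipped'), ('P', 'Pending');
    INSERT INTO orders VALUES (1, 'S'), (2, 'P');

    -- Add the redundant column and fill it from the reference table.
    ALTER TABLE orders ADD COLUMN status_description TEXT;
    UPDATE orders
    SET status_description = (
        SELECT description FROM status_ref
        WHERE status_ref.code = orders.status_code
    );
""")

# The description is now available without joining to status_ref.
order_rows = conn.execute(
    "SELECT order_id, status_description FROM orders ORDER BY order_id"
).fetchall()
print(order_rows)
```

The redundant column has to be kept in step with `status_ref` whenever a description changes, which is the maintenance cost the guidelines slide warns about.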
  • 22. Derived Attributes: Example
      - Age is also a derived attribute, calculated as Current_Date - DoB (calculated periodically). DoB: Date of Birth.
      - The GP (Grade Point) column in the data warehouse data model is included as a derived value. The formula for calculating this field is Grade*Credits.
      - Business Data Model: #SID, DoB, Degree, Course, Grade, Credits.
      - DWH Data Model: #SID, DoB, Degree, Course, Grade, Credits, GP, Age.
      - Derived attributes are calculated once and used frequently.
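
The two derived attributes can be computed as in the sketch below. The field names follow the slide (SID, DoB, Degree, Course, Grade, Credits, GP, Age); the helper functions, the sample row, and the fixed reference date are assumptions. Age is taken here as whole years completed, which is one possible reading of "Current_Date - DoB".

```python
from datetime import date


def age_in_years(dob, today):
    """Whole years between dob and today (one reading of Current_Date - DoB)."""
    years = today.year - dob.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (dob.month, dob.day):
        years -= 1
    return years


def add_derived(row, today):
    """Return a copy of a business-model row with GP and Age added."""
    derived = dict(row)
    derived["GP"] = row["Grade"] * row["Credits"]  # formula from the slide
    derived["Age"] = age_in_years(row["DoB"], today)
    return derived


# Hypothetical row from the business data model.
business_row = {
    "SID": 1, "DoB": date(2000, 6, 15), "Degree": "BS",
    "Course": "DWH", "Grade": 3.5, "Credits": 3,
}

# Computed once at load time and stored in the DWH data model.
dwh_row = add_derived(business_row, today=date(2024, 6, 14))
print(dwh_row["GP"], dwh_row["Age"])
```

GP never changes once Grade and Credits are fixed, but Age goes stale, which is why the slide notes that it is recalculated periodically.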