Upcoming SlideShare
×

03 rdbms session03

714 views

Published on

Published in: Technology
2 Likes
Statistics
Notes
• Full Name
Comment goes here.

Are you sure you want to Yes No
• Be the first to comment

Views
Total views
714
On SlideShare
0
From Embeds
0
Number of Embeds
76
Actions
Shares
0
0
0
Likes
2
Embeds 0
No embeds

No notes for slide
• Students have learnt the structure of different types of dimensions and the importance of surrogate keys in Module I. In this session, students will learn to load the data into the dimension tables after the data has been transformed in the transformation phase. In addition, students will also learn to update data into these dimension tables. Students already know about different types of dimension tables. Therefore, you can start the session by recapitulating the concepts. Initiate the class by asking the following questions: 1. What are the different types of dimensions? 2. Define flat dimension. 3. What are conformed dimension? 4. Define large dimension. 5. Define small dimension. 6. What is the importance of surrogate key in a dimension table? Students will learn the loading and update strategies theoretically in this session. The demonstration to load and update the data in the dimension table will be covered in next session.
• Students know the importance of surrogate keys. In this session students will learn the strategy to generate the surrogate key. Give an example to explain the strategy to generate the surrogate keys by concatenating the primary key of the source table with the date stamp. For example, data from a Product table has to be loaded into the Product_Dim dimension table on Feb 09, 2006. The product_code is the primary key column in the Product table. To insert the surrogate key values before loading the data into the dimension table, you can combine the primary key value with the date on which the data has to be loaded. In this case the surrogate key value can be product_code+09022006.
• Students have learnt the structure of different types of dimensions and the importance of surrogate keys in Module I. In this session, students will learn to load the data into the dimension tables after the data has been transformed in the transformation phase. In addition, students will also learn to update data into these dimension tables. Students already know about different types of dimension tables. Therefore, you can start the session by recapitulating the concepts. Initiate the class by asking the following questions: 1. What are the different types of dimensions? 2. Define flat dimension. 3. What are conformed dimension? 4. Define large dimension. 5. Define small dimension. 6. What is the importance of surrogate key in a dimension table? Students will learn the loading and update strategies theoretically in this session. The demonstration to load and update the data in the dimension table will be covered in next session.
• Explain the strategy to load the data into conformed dimension with help of an example to load the data in the date dimension given in SG. Explain the strategy to load the data into the Snowflaked dimension with the help of an example given in SG.
• Explain the strategy to load the data into conformed dimension with help of an example to load the data in the date dimension given in SG. Explain the strategy to load the data into the Snowflaked dimension with the help of an example given in SG.
• Explain the strategy to load the data into conformed dimension with help of an example to load the data in the date dimension given in SG. Explain the strategy to load the data into the Snowflaked dimension with the help of an example given in SG.
• Explain the strategy to load the data into conformed dimension with help of an example to load the data in the date dimension given in SG. Explain the strategy to load the data into the Snowflaked dimension with the help of an example given in SG.
• Explain the strategy to load the data into conformed dimension with help of an example to load the data in the date dimension given in SG. Explain the strategy to load the data into the Snowflaked dimension with the help of an example given in SG.
• Student already have learnt about SCDs in Module I. Therefore, you can start this topic by asking the following questions to students: What are type 1 SCDs? Given an example to explain type 1 SCDs. This will recapitulate what they have learnt about type 1 SCD in Module 1. Now explain the strategy to load the data into these dimension tables with help of the given diagram. Relate this diagram to the example given in SG.
• You can summarize the session by running through the summary given in SG. In addition, you can also ask students summarize what they have learnt in this session.
• You can summarize the session by running through the summary given in SG. In addition, you can also ask students summarize what they have learnt in this session.
• 03 rdbms session03

1. 1. Relational Database Management SystemsObjectives • In this session, you will learn to: Describe data redundancy Describe the first, second, and third normal forms Describe the Boyce-Codd Normal Form Appreciate the need for denormalization Ver. 1.0 Session 3 Slide 1 of 15
2. 2. Relational Database Management SystemsUnderstanding Data Redundancy Redundancy means repetition of data. Redundancy increases the time involved in updating, adding, and deleting data. It also increases the utilization of disk space and hence, disk I/O increases. Redundancy can lead to the following problems: Inserting, modifying, and deleting data may cause inconsistencies Errors are more likely to occur when facts are repeated Unnecessary utilization of extra disk space Ver. 1.0 Session 3 Slide 2 of 15
3. 3. Relational Database Management SystemsDefinition of Normalization Normalization is a scientific method of breaking down complex table structures into simple table structures by using certain rules. It allows you to reduce redundancy in a table and eliminate the problems of inconsistency and disk space usage. Normalization results in the formation of tables that satisfy certain specified rules and represent certain normal forms. The most important and widely used normal forms are: First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form (3NF) Boyce-Codd Normal Form (BCNF) Ver. 1.0 Session 3 Slide 3 of 15
4. 4. Relational Database Management SystemsFirst Normal Form (1 NF) A table is said to be in the 1NF when each cell of the table contains precisely one value. Functional Dependency: The normalization theory is based on the fundamental notion of functional dependency. Given a relation R, attribute A is functionally dependent on attribute B if each value of A in R is associated with precisely one value of B. Ver. 1.0 Session 3 Slide 4 of 15
5. 5. Relational Database Management SystemsSecond Normal Form (2 NF) A table is said to be in 2NF when it is in 1NF and every attribute in the row is functionally dependent upon the whole key, and not just part of the key. To ensure that a table is in 2NF, you should: Find and remove attributes that are functionally dependent on only a part of the key and not on the whole key and place them in a different table. Group the remaining attributes. Ver. 1.0 Session 3 Slide 5 of 15
6. 6. Relational Database Management SystemsThird Normal Form (3 NF) A relation is said to be in 3NF when it is in 2NF and every non-key attribute is functionally dependent only on the primary key. To ensure that a table is in 3NF, you should: Find and remove non-key attributes that are functionally dependent on attributes that are not the primary key and place them in a different table. Group the remaining attributes. Ver. 1.0 Session 3 Slide 6 of 15
7. 7. Relational Database Management SystemsBoyce-Codd Normal Form (BCNF) The original definition of 3NF was inadequate in some situations. It was not satisfactory for the tables: that had multiple candidate keys where the multiple candidate keys were composite where the multiple candidate keys overlapped Therefore, a new normal form-the BCNF was introduced: A relation is in the BCNF if and only if every determinant is a candidate key. Ver. 1.0 Session 3 Slide 7 of 15
8. 8. Relational Database Management SystemsBoyce-Codd Normal Form (BCNF) (Contd.) In order to ensure that a table is in BCNF, you should: Find and remove the overlapping candidate keys. Place the part of the candidate key and the attribute it is functionally dependent on, in a different table. Group the remaining items into a table. Ver. 1.0 Session 3 Slide 8 of 15
9. 9. Relational Database Management SystemsJust a minute Which of the following help in achieving a good database design? 1. A table should store data for all the related entities together. 2. Each table should have an identifier. 3. Columns that contain NULL values should be created. Answer: 2. Each table should have an identifier. Ver. 1.0 Session 3 Slide 9 of 15
10. 10. Relational Database Management SystemsJust a minute For which normal form you need to remove non-key attributes that are functionally dependent on attributes that are not primary key? Solution: Third normal form Ver. 1.0 Session 3 Slide 10 of 15
11. 11. Relational Database Management SystemsUnderstanding Denormalization The end product of normalization is a set of related tables that comprise the database. However, in the interests of speed of response to critical queries, which demand information from more than one table, it is sometimes wiser to introduce a degree of redundancy in tables. The intentional introduction of redundancy in a table to improve performance is called denormalization. Ver. 1.0 Session 3 Slide 11 of 15
12. 12. Relational Database Management SystemsJust a minute In the reporting system, the total amount paid to a contract recruiter is often required. The required result can be calculated using the two tables, ContractRecruiter and Payment. ContractRecruiter Payment cContractRecruiterCode cSourceCode cName mAmount vAddress cChequeNo cCity dDate siPercentageCharge Since the size of the Payment table is large, the join between the two tables takes time to generate the required output. It is, therefore, necessary to improve the performance of this query. Ver. 1.0 Session 3 Slide 12 of 15
13. 13. Relational Database Management SystemsJust a minute Answer: Denormalize the tables. Add a column called mTotalPaid to the ContractRecruiter table. ContractRecruiter cContractRecruiterCode cName vAddress cCity siPercentageCharge mTotalPaid Ver. 1.0 Session 3 Slide 13 of 15
14. 14. Relational Database Management SystemsSummary In this session, you learned that: Normalization is used to simplify table structures. Normalization results in the formation of tables that satisfy certain specified constraints, and represent certain normal forms. The normal forms are used to ensure that various types of anomalies and inconsistencies are not introduced in the database. A table structure is always in a certain normal form. Several normal forms have been identified. The most important and widely used of these are: First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form (3NF) Boyce Codd Normal Form (BCNF) Ver. 1.0 Session 3 Slide 14 of 15
15. 15. Relational Database Management SystemsSummary (Contd.) The intentional introduction of redundancy in a table in order to improve performance is called denormalization. The decision to denormalize results in a trade-off between performance and data integrity. Denormalization increases disk space utilization. Ver. 1.0 Session 3 Slide 15 of 15