Relational Database Management System




surendersingh@rediffmail.com
Surender Singh
                      Sr. Programmer
              surendersingh@titsbhiwani.ac.in

Relational Database Management System


                 DATA




               DATABASE




             DBMS/RDBMS




...
File Processing System




File Processing System




Application
 Programs
                     File System
(Programs                               ...
File System

    Database




Disadvantages of FPS

 • Data Redundancy and Inconsistency
 • Difficulty in accessing data
 • Data isolation
 • Integrity Pro...
Data Redundancy and Inconsistency



  Name   Address          AccNo Name   Address
  ABC    Bhiwani          1002 ABC    ...
Difficulty in accessing data

 Manager



Requirement


Application
 Programs
                         File System
(Progra...
Data Isolation and Integrity Problems
  Program in C                            Program in COBOL

#include <stdio.h>
     ...
Atomicity Problems




                   Bank
             Data Transmission




USER      ...
Concurrent-access anomalies




Security Problems




                           Employee
                           Information




Database: the Peace of Mind


Requirements of a DBMS

 • A mechanism for specification of data and its dependencies
   (Integrity Constraints) in an int...
A DBMS has two major components, namely

    • Structure of Database is called Database Schema.
      Instance, which is a ...
View of DATA

             View Level (External Level)


View 1    View 2                           View n




           ...
Data Independence
The ability to modify a schema definition in one level without affecting a
schema definition in the next...
Data Models
A Data Model is a mechanism for describing the data, their interrelationships
and the constraints.

    Object...
The E-R Model

Entities : An entity is a distinct clearly identifiable object of the database e.g Book
  Attribute : Each ...
The Relational Model
    Relational Model uses a collection of tables to represent both data and
    relationship among th...
Network Model
   Data in the network model are represented by collection of records and
   relationships among data are re...
Hierarchical Model
 This is a special kind of network model in which the relationships
 essentially form a tree-like structure.
...
Physical Data Models


 • Physical data models are used to describe data at the lowest level.
 • In contrast to logical data m...
Database Languages


                       Database Languages
     Data-Definition       Data-Manipulation      Data-Cont...
Database Management System Structure
Naïve Users                Application                  Sophisticated                ...
Oracle Storage System Structure




Database Administrator

   Roles of DBA

       • Schema Definition
       • Storage structure and access-method definitio...
Terms
       Simple and Composite Attributes
       Single-valued and Multivalued Attributes
       Null Attributes
      ...
Weak Entity Set




Attributes




Keys



                            Keys


          Candidate Key     Secondary Key   Foreign Key


  Primary Key   Alter...
Candidate Keys

Primary               Alternate Keys
key
Roll_No      Name           Branch        City

01           Deep...
Primary           Secondary Key
 Key
Roll_No   Name        Branch        City
01        Deepak      Computers     Bhiwani
...
Composite Primary Key

 Name      Branch        City

 Deepak    Computers     Bhiwani

 Mukesh    Electronics   Rohtak

 ...
P#
  Part   P_Name   Colour   Quantity
  P1     Nut      Red      200
  P2     Bolt     Green    250
  P3     Screw    Blu...
S#
 Supplier S_Name City      Quantity
 S1      John    Delhi     200
 S2      Smith   Kolkata   250
 S3      James   Delh...
SP#
  P#     S#      Quantity
  P1     S1      200
  P2     S1      300
  P3     S1      400
  P1     S2      250
  P2    ...
Mapping Cardinalities

 Mapping cardinalities, or cardinality ratios, express the number of entities
 to which another ent...
A      B         A       B




    Many to One     Many to Many
More on E-R Diagrams
                     Company


  Owns      Multiple Relationship between   Leased
                   ...
Ternary E-R Diagram


  Instructors            Teaches              Students




                         Courses




    ...
E-R Diagram Components
          Entity Sets

                          Attributes


                                     ...
Existence Dependencies




Generalization and Specialization




Generalization and Specialization
                           The abstraction mechanisms

                  Emp_No         ...
Aggregation
       The Process of compiling information on an object

                            Teaches



             ...
Represent ER model using tables




Query Languages

A query language is a language in which a user requests information from a database.
These are typically ...
The Relational Algebra
   The relational algebra is a procedural query language.




                The Borrow and Branch...
Fundamental Operations

select (unary)
project (unary)
rename (unary)
cartesian product (binary)
union (binary)
set-differ...
Formal Definition of Relational Algebra




The Select Operation




The Project Operation




The Cartesian Product Operation




Output of Cartesian Product

     Relation A   Relation B       AXB

      A             B          A    B
      1        ...
The Rename Operation




The Union Operation




The Set Difference Operation




Additional Operations


       The Set Intersection Operation




The Natural Join Operation




The Division Operation




Example of Division Operation

Relation R         Relation S   ÷S
                                R

A   B          B     ...
The Assignment Operation




Relational Calculus

Relational Calculus is a nonprocedural query language.
   • Tuple Relational Calculus
            Uses ...
Tuple Relational Calculus




Example Queries




Some More Examples




Domain Relational Calculus




SQL




Integrity Constraints

Integrity and consistency are of primary concern in any database design.
At any instance a database m...
Domain Constraints
 • Includes

    • Type
    • Width
    • Null or Not Null
    • Checks/Conditions
             Specify at the ti...
Referential Integrity
 • Foreign Key
    Referential integrity states that all values of the foreign key of one
    relation...
Assertions and Triggers
  An assertion is a general predicate, expressed in relational algebra
  or calculus or any langua...
Functional Dependencies
  Functional dependencies provide a formal mechanism to express
  constraints between attributes.
...
Formal Notation of FD
        In general if there are two attributes A and B and the FD

                                 ...
Closure of a Set of Functional
           Dependencies




Armstrong’s Axioms




Closure of a Set of F+




Closure of Attribute Sets




Canonical Cover

 To minimize the number of functional dependencies that need to be
 tested in case of an update we may re...
Example of Canonical Cover
       Consider a relation r ( X, Y, Z ) with the FDs F.

                              1.   X...
Relational Database Design




Database Decomposition – 1

  Representation of Information




Database Decomposition – 2




Database Decomposition – 3




Database Decomposition – 4




Lossless-join Decomposition




Example of lossy decomposition
                      S_by
                     s_name        s_addr        Item    Price
 ...
Dependency Preservation




Normalization

Normalization is a process of removing redundancy using functional dependencies.

To reduce redundancy it i...
First Normal Form (1NF)

       This normal form says that all attributes are simple.

   An attribute is said to be simpl...
Second Normal Form (2NF)
A relation is said to be in 2NF if it is in 1NF and
all non-prime attributes are fully functional...
Solution
  We can remove this redundancy by splitting the original relation into following two relations

                ...
Third Normal Form (3NF)
 A relation is said to be in 3NF if it is in 2NF and its non-prime attributes
 are not dependent on each other.

 Cons...
Solution
  We decompose the relation
           s_by (s_name, item, price, gift_item )
  Into
           s_by_1 (s_name, i...
Boyce-Codd Normal Form (BCNF)




More on BCNF




Comparison of BCNF and 3NF




Comparison of BCNF and 3NF - 2




Normalization using Multivalued
           Dependencies




Multivalued Dependencies -2




Rules




More Rules




Fourth Normal Form (4NF)




Example




Normalization using Join Dependencies
Let R be a relation schema and R1, R2,….Rn be a decomposition of R. The join depende...
Fifth Normal Form (5NF)
                    Project-Join Normal Form
 Project-join normal form (PJNF) is defined in a mann...
Storage and File Structure
          Hierarchy of Storage




Description




Description - 2




File Organization




Fixed Length Record -1




Fixed Length Record -2




Variable-length Records




Fixed-length representation




Organization of Records in files




Concurrency Control and Recovery
Transactions
   Concurrent execution of user programs is essential for good DBMS performance.
      Because disk accesse...
States of Transactions


             Partially       Committed



Active


                                  Aborted
    ...
Concurrency in a DBMS
   Users submit transactions, and can think of each transaction as executing by itself.
      Conc...
Example

   Consider two transactions (Xacts):



     T1:         BEGIN A=A+100, B=B-100 END
     T2:         BEGIN A=1....
Example (Contd.)

 Consider     a possible interleaving (schedule):
     T1:     A=A+100,                    B=B-100
    ...
Example (Contd.)

   The DBMS must not allow schedules like this!



      T1:          R(A), W(A),                      ...
Scheduling Transactions

   Equivalent schedules: For any database state, the effect (on the set of objects in the databa...
Detection of Serializability
 One of the techniques of concurrency control is to detect whether a schedule is valid or not...
Serializable Concurrency
           T1                                                                    T2

           R...
Deadlock Condition

   T1                              T2

   UPDATE account                  UPDATE account
   SET balanc...
Lock-Based Techniques
 In this technique the system does not participate in detection of inconsistency nor does it take an...
Example
  T1                     T2

  Lock-X(P)
  Read (P,p)
  P=p-1
  Write(P,p)
  Unlock(P)
                         Lo...
Two-Phase locking
  Phase I – Acquiring Phase : During this phase a transaction may lock a data item but not
             ...
Enforcing (Conflict) Serializability

   Two-phase Locking (2PL) Protocol:
      Each Xact must obtain a S (shared) lock...
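The two-phase rule can be stated as a simple check on one transaction's lock and unlock requests: once the transaction has released any lock (the second, releasing phase), it may not acquire another. The sketch below is illustrative only; the function name and the `(kind, item)` tuple encoding are assumptions made for this example, not part of the slides.

```python
def follows_two_phase_locking(actions):
    """Check one transaction's lock/unlock requests against the 2PL rule.

    actions is a list of (kind, item) pairs in execution order, where
    kind is "lock" or "unlock" (a hypothetical encoding for this sketch).
    """
    releasing = False                 # becomes True at the first unlock
    for kind, _item in actions:
        if kind == "unlock":
            releasing = True          # Phase II has started
        elif kind == "lock" and releasing:
            return False              # acquiring after releasing breaks 2PL
    return True


# Locks everything it needs, then releases: two-phase.
two_phase = [("lock", "P"), ("lock", "Q"), ("unlock", "P"), ("unlock", "Q")]
# Releases P and only then locks Q: not two-phase.
not_two_phase = [("lock", "P"), ("unlock", "P"), ("lock", "Q"), ("unlock", "Q")]

print(follows_two_phase_locking(two_phase))      # True
print(follows_two_phase_locking(not_two_phase))  # False
```

The check looks at a single transaction in isolation; it says nothing about whether lock requests from different transactions are compatible.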
Atomicity of Transactions
   A transaction might commit after completing all its actions, or it could abort (or be aborte...
Aborting a Transaction
   If a transaction Ti is aborted, all its actions have to be undone. Not only that, if Tj reads a...
The Log

   The following actions are recorded in the log:
      Ti writes an object: the old value and the new value.

...
The Log - 2
                               Log file e.g. X=1000, Y= 2000
  T:
       Read (X, xi)                   Transa...
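Because each write is logged with both the old and the new value, an aborted transaction can be undone by walking the log backwards and restoring old values. The following is a minimal sketch under stated assumptions: the `write` and `undo` helpers, the tuple layout of a log record, and the updated values 900 and 2100 are hypothetical; only the starting values X=1000 and Y=2000 come from the slide.

```python
def write(db, log, txn, item, new_value):
    """Log (txn, item, old value, new value), then apply the write."""
    log.append((txn, item, db[item], new_value))
    db[item] = new_value


def undo(db, log, txn):
    """Undo txn by restoring old values, scanning the log backwards."""
    for logged_txn, item, old_value, _new_value in reversed(log):
        if logged_txn == txn:
            db[item] = old_value


db = {"X": 1000, "Y": 2000}      # starting values from the slide
log = []
write(db, log, "T", "X", 900)    # hypothetical updates made by T
write(db, log, "T", "Y", 2100)
undo(db, log, "T")               # T aborts before committing
print(db)  # {'X': 1000, 'Y': 2000}
```

Scanning backwards matters when a transaction writes the same item more than once: the last value restored is then the oldest one.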
Checkpoints
 At the time of recovery the entire log needs to be searched to know which transactions need to
 be redone and ...
Recovering From a Crash
     There are 3 phases in the ARIES recovery algorithm:
        Analysis: Scan the log forward ...
Summary

   Concurrency control and recovery are among the most important functions provided by a DBMS.
   Users need no...
Query Processing/Optimization




Rules

 • Optimization using algebraic manipulation
 • Any algebraic manipulation approach to query optimization uses a set of...
Example




Example contd.




Projection Operation




Natural Join Operation




Natural Join Operation - 2





Relational Database Management System


Published on

Presentation is about the Basic RDBMS Concepts

Published in: Education, Technology

Relational Database Management System

  1. Relational Database Management System
  2. Surender Singh, Sr. Programmer, surendersingh@titsbhiwani.ac.in
  3. Relational Database Management System
     DATA, DATABASE, DBMS/RDBMS, Information
  4. File Processing System
  5. File Processing System
     Application Programs (programs written in C, Pascal, etc.)
     File System (data structures, file handling)
     Database (information in file format)
  6. File System, Database
  7. Disadvantages of FPS
     • Data redundancy and inconsistency
     • Difficulty in accessing data
     • Data isolation
     • Integrity problems
     • Atomicity problems
     • Concurrent-access anomalies
     • Security problems
  8. Data Redundancy and Inconsistency
     Customer Information        Saving Account
     Name   Address              AccNo   Name   Address
     ABC    Bhiwani              1002    ABC    Bhiwani
     DEF    Delhi                1005    DEF    Jaipur
  9. Difficulty in accessing data
     Manager, Requirement
     Application Programs (programs written in C, Pascal, etc.)
     File System (data structures, file handling)
     Database (information storage in file format)
  10. Data Isolation and Integrity Problems
      Program in C              Program in COBOL
      #include <stdio.h>        01 Reserve-rec.
      main()                       03 saving
      {                               05 accno PIC A(2)
        -----                   --------
      }
  11. Atomicity Problems
      Bank, Data Transmission, USER, USER
  12. Concurrent-access anomalies
  13. Security Problems
      Employee Information
  14. Database: the Peace of Mind
  15. Requirements of a DBMS
      • A mechanism for specification of data and its dependencies (integrity constraints) in an integrated fashion.
      • Prevention of redundancy and inconsistency.
      • Provision of adequate security and access rights.
      • A mechanism for concurrency control.
      • A mechanism for recovery from failure.
      Additionally, any DBMS must provide
      • Schemes for specification of processing rules or application programs.
      • Efficient techniques for storage and retrieval of data from secondary storage (disk).
  16. A DBMS has two major components, namely
      • The database: its structure is called the database schema, and an instance is a state of the database with the actual data loaded.
      • A set of software tools/programs that access, update and process the database, called the query and update mechanism.
      Diagram: DBMS, File Manager, Secondary Storage
  17. View of DATA
      View Level (External Level): View 1, View 2, ..., View n
      Logical Level: Conceptual View
      Physical Level: Internal View
  18. Data Independence
      The ability to modify a schema definition at one level without affecting the schema definition at the next higher level is called data independence.
      • Physical data independence
      • Logical data independence
      Create table emp (empno number(10), -------------- );
  19. Data Models
      A data model is a mechanism for describing the data, their interrelationships and the constraints.
      • Object-based conceptual models: Entity-Relationship model
      • Record-based models: Relational model, Network model, Hierarchical model
      • Physical data models
  20. The E-R Model
      Entity: a distinct, clearly identifiable object of the database, e.g. Book.
      Attribute: each entity is characterized by a set of attributes, e.g. Acc. No.
      Entity set: the set of all entities having attributes of the same type.
      Relationship: a mapping between entity sets.
      Diagram: BOOK (Acc_No, Title, Author, YearofPub), Borrowed_By (Acc_No, Card_No, DOI), USERS (Card_No, Name, Address)
  21. The Relational Model
      The relational model uses a collection of tables to represent both data and the relationships among those data. Each table has multiple attributes and tuples of a similar kind.
      Diagram: Book table/relation with attributes AccNo, Title, Author, YearofPub; each row is a tuple.
  22. Network Model
      Data in the network model are represented by collections of records, and relationships among data are represented by links, which can be viewed as pointers.
      Diagram: User record (Card_No, Name, Address, Link), Book record (Acc_No, Author, -----, Link), Pointer, Next
  23. Hierarchical Model
      This is a special kind of network model in which the relationships essentially form a tree-like structure.
      Diagram nodes: Hospital, Wards, Units, Patient, Doctors, Nurses, Cardiology, Skin
  24. Physical Data Models
      Physical data models are used to describe data at the lowest level. In contrast to logical data models, there are few physical data models in use. Two of the widely known ones are the unifying model and the frame-memory model.
  25. Database Languages
      Data-Definition:   Create Table Test ( Title Varchar2(20), -------- );
      Data-Manipulation: Update, Insert, Delete, Query
      Data-Control:      GRANT Connect, Resource TO xUser
  26. Database Management System Structure
      Users: naïve users (tellers, agents, etc.), application programmers, sophisticated users, database administrators
      Interfaces: application interfaces, application programs, query, database scheme
      Query processor: embedded DML precompiler, DML compiler, DDL interpreter, application programs object code, query evaluation engine
      Storage manager: transaction manager, buffer manager, file manager
      Disk storage: data files, indices, statistical data, data dictionary
  27.
  28. Oracle Storage System Structure
  29. Database Administrator
      Roles of DBA
      • Schema definition
      • Storage structure and access-method definition
      • Schema and physical-organization modification
      • Granting of authorization for data access
      • Integrity-constraint specification
  30. Terms
      • Simple and composite attributes
      • Single-valued and multivalued attributes
      • Null attributes
      • Derived attributes
      • Existence dependencies
      • Weak entity set and strong entity set
  31. Weak Entity Set
  32. Attributes
  33. Keys
      Candidate Key, Secondary Key, Foreign Key
      Primary Key, Alternate Key, Composite Key
  34. Candidate Keys
      Primary key   Alternate keys
      Roll_No       Name      Branch        City
      01            Deepak    Computers     Bhiwani
      02            Mukesh    Electronics   Rohtak
      03            Teena     Mechanical    Bhiwani
      04            Deepti    Chemical      Rohtak
      05            Monika    Civil         Delhi
  35. Primary Key and Secondary Key
      Primary key   Secondary key
      Roll_No       Name      Branch        City
      01            Deepak    Computers     Bhiwani
      02            Mukesh    Electronics   Rohtak
      03            Teena     Computers     Bhiwani
      04            Deepak    Electronics   Rohtak
      05            Monika    Computers     Delhi
  36. Composite Primary Key
      Name      Branch        City
      Deepak    Computers     Bhiwani
      Mukesh    Electronics   Rohtak
      Teena     Computers     Bhiwani
      Deepak    Electronics   Rohtak
      Monika    Computers     Delhi
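The key tables on slides 35 and 36 can be checked mechanically: an attribute set can serve as a key for the sample rows only if no two rows agree on it. The sketch below is illustrative; the helper name `is_unique` is an assumption, and uniqueness over five sample rows shows only that the set works for this data, not that the schema guarantees it.

```python
# Sample rows from the Primary Key / Secondary Key slide.
ROWS = [
    {"Roll_No": "01", "Name": "Deepak", "Branch": "Computers", "City": "Bhiwani"},
    {"Roll_No": "02", "Name": "Mukesh", "Branch": "Electronics", "City": "Rohtak"},
    {"Roll_No": "03", "Name": "Teena", "Branch": "Computers", "City": "Bhiwani"},
    {"Roll_No": "04", "Name": "Deepak", "Branch": "Electronics", "City": "Rohtak"},
    {"Roll_No": "05", "Name": "Monika", "Branch": "Computers", "City": "Delhi"},
]


def is_unique(rows, attrs):
    """Return True when no two rows share the same values for attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


print(is_unique(ROWS, ["Roll_No"]))         # True: usable as primary key
print(is_unique(ROWS, ["Name"]))            # False: Deepak appears twice
print(is_unique(ROWS, ["Name", "Branch"]))  # True: composite key
```

Name alone repeats, so it can act only as a secondary (non-unique) key, while the pair (Name, Branch) identifies each row, which is the composite-key case.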
  37. Part
      P#     P_Name   Colour   Quantity
      P1     Nut      Red      200
      P2     Bolt     Green    250
      P3     Screw    Blue     300
  38. Supplier
      S#     S_Name   City      Quantity
      S1     John     Delhi     200
      S2     Smith    Kolkata   250
      S3     James    Delhi     300
      S4     David    Chennai   400
      S5     John     Chennai   300
  39. SP
      P#     S#     Quantity
      P1     S1     200
      P2     S1     300
      P3     S1     400
      P1     S2     250
      P2     S3     250
      P3     S4     200
      P2     S4     300
      P3     S5     400
  40. Mapping Cardinalities
      Mapping cardinalities, or cardinality ratios, express the number of entities to which another entity can be associated via a relationship set. For a binary relationship set R between entity sets A and B, the mapping cardinality must be one of the following:
      One to One, One to Many
  41. Many to One, Many to Many
  42. More on E-R Diagrams
      Multiple relationships between the same entity sets: Company, Owns, Leased, Vehicle
      Circular relationship: Staff, Reports to, Manager, Subordinate
  43. Ternary E-R Diagram
      Instructors, Teaches, Students, Courses
      Constraints: Book, Borrowed_By, User (N, 1)
  44. E-R Diagram Components
      • Entity sets
      • Attributes
      • Relationship sets
      • Connectors/Constraints
      • Multivalued attributes
      • Derived attributes
      • Total participation of an entity in a relationship set
  45. Existence Dependencies
  46. Generalization and Specialization
  47. Generalization and Specialization
      The abstraction mechanisms
      Employee (Emp_No, Name, Date_of_hire)
      IS_A: Full_time Employee, Part_time Employee
      IS_A: Faculty, Staff, Teaching, Casual
      Other attributes shown: Type, Salary, Degree, Interest, Stipend, Hour_Rate
  48. Aggregation
      The process of compiling information on an object
      Diagram labels: Teacher, Teaches, Course, Uses, Book; aggregate: Teacher-Teaches
  49. Represent ER model using tables
  50. Query Languages
      A query language is a language in which a user requests information from a database. These are typically higher-level than programming languages. They may be one of:
      • Procedural, where the user instructs the system to perform a sequence of operations on the database. This will compute the desired information.
      • Nonprocedural, where the user specifies the information desired without giving a procedure for obtaining the information.
      A complete query language also contains facilities to insert and delete tuples as well as to modify parts of existing tuples.
  51. The Relational Algebra
      The relational algebra is a procedural query language.
      The Borrow and Branch relations
  52. Fundamental Operations
      • select (unary)
      • project (unary)
      • rename (unary)
      • cartesian product (binary)
      • union (binary)
      • set-difference (binary)
      Several other operations, defined in terms of the fundamental operations:
      • set-intersection
      • natural join
      • division
      • assignment
      Operations produce a new relation as a result.
  53. Formal Definition of Relational Algebra
  54. The Select Operation
  55. The Project Operation
  56. The Cartesian Product Operation
  57. Output of Cartesian Product
      Relation A   Relation B   A X B
      A            B            A    B
      1            X            1    X
      2            Y            1    Y
      3                         2    X
                                2    Y
                                3    X
                                3    Y
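The table on slide 57 pairs every tuple of A with every tuple of B. As a quick illustration (not the author's code), Python's standard `itertools.product` produces the same six pairs in the same order:

```python
from itertools import product

A = [1, 2, 3]        # Relation A from the slide
B = ["X", "Y"]       # Relation B from the slide

a_x_b = list(product(A, B))   # every (a, b) combination, A varying slowest
print(a_x_b)
# [(1, 'X'), (1, 'Y'), (2, 'X'), (2, 'Y'), (3, 'X'), (3, 'Y')]
```

The result has |A| × |B| = 3 × 2 = 6 tuples, which is why Cartesian products of large relations are expensive.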
  58. The Rename Operation
  59. The Union Operation
  60. The Set Difference Operation
  61. Additional Operations
      The Set Intersection Operation
  62. The Natural Join Operation
  63. The Division Operation
  64. Example of Division Operation
      Relation R   Relation S   R ÷ S
      A    B       B            A
      P    A       A            P
      Q    A       B            Q
      P    B
      Q    T
      M    A
      Q    B
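Division returns the A-values of R that are paired with every B-value in S. The sketch below is a minimal illustration; the `divide` helper is an assumption, and the tuples of R and S are one reading of the flattened table on slide 64, so they should be treated as an assumption as well.

```python
def divide(r, s):
    """Return the A-values of r(A, B) that appear with every B-value in s."""
    candidates = {a for a, _b in r}
    return {a for a in candidates if all((a, b) in r for b in s)}


# Relations as read from the flattened slide table (assumed reconstruction).
R = {("P", "A"), ("Q", "A"), ("P", "B"), ("Q", "T"), ("M", "A"), ("Q", "B")}
S = {"A", "B"}

print(sorted(divide(R, S)))  # ['P', 'Q']
```

M is excluded because it is paired with A only; Q qualifies even though it also has an extra pairing with T.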
  65. The Assignment Operation
  66. Relational Calculus
      Relational calculus is a nonprocedural query language.
      • Tuple relational calculus: uses tuple variables, which take an entire tuple as their value.
      • Domain relational calculus: uses domain variables, which take values from an attribute.
  67. Tuple Relational Calculus
  68. Example Queries
  69. Some More Examples
  70. Domain Relational Calculus
  71. SQL
  72. Integrity Constraints
      Integrity and consistency are of primary concern in any database design. At any instant a database must be correct according to a set of rules.
      Rules are checked during any database operation:
      • Insertion
      • Deletion
      • Update
      • Recovery from failure
      • Concurrent operations
      Types of constraints:
      • Domain constraints
      • Referential integrity constraint
      • Functional dependencies
  73. Domain Constraints
      Includes
      • Type
      • Width
      • Null or Not Null
      • Checks/Conditions
      Specified at design time; checked at the time of insertion, deletion or modification, e.g.
      Bname char(20)
      Amount number(7,2)
      DOL date check (date >= 29/09/2004)
      City char(10) not null
      TotalAmt = amount + interest
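The NOT NULL and CHECK conditions on slide 73 can be tried out with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: the table name `loan` and the `accepted` helper are hypothetical, the date is stored as ISO text so that comparison works, and SQLite (unlike the Oracle-style types on the slide) does not enforce column widths such as char(20).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE loan (                            -- table name is hypothetical
        bname  TEXT,                               -- width char(20) not enforced by SQLite
        amount NUMERIC,
        dol    TEXT CHECK (dol >= '2004-09-29'),   -- ISO dates compare correctly as text
        city   TEXT NOT NULL
    )
    """
)


def accepted(row):
    """Try to insert one row; report whether the constraints allowed it."""
    try:
        conn.execute("INSERT INTO loan VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


ok_row = accepted(("Main", 1500.50, "2004-10-01", "Bhiwani"))
early_date = accepted(("Main", 100, "2004-01-01", "Delhi"))   # violates CHECK
missing_city = accepted(("Main", 100, "2004-10-01", None))    # violates NOT NULL
print(ok_row, early_date, missing_city)  # True False False
```

The constraints are declared once, at design time, and the database engine applies them on every insert, which is the point the slide makes.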
  74. Referential Integrity
      Foreign Key
      Referential integrity states that all values of the foreign key of one relation must be present in another relation where the same attribute is declared as the primary key.
      Checks during database modification: Insert, Delete, Update
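The Part, Supplier and SP tables from slides 37 to 39 give a natural test case: every P# and S# in SP must exist in Part and Supplier. The sketch below uses Python's built-in `sqlite3`; the column names `p_no` and `s_no` (standing in for P# and S#) and the `allowed` helper are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE part     (p_no TEXT PRIMARY KEY, p_name TEXT);
    CREATE TABLE supplier (s_no TEXT PRIMARY KEY, s_name TEXT);
    CREATE TABLE sp (
        p_no     TEXT REFERENCES part(p_no),
        s_no     TEXT REFERENCES supplier(s_no),
        quantity INTEGER
    );
    INSERT INTO part     VALUES ('P1', 'Nut');
    INSERT INTO supplier VALUES ('S1', 'John');
    INSERT INTO sp       VALUES ('P1', 'S1', 200);
    """
)


def allowed(sql):
    """Run one statement; report whether referential integrity allowed it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


orphan_insert = allowed("INSERT INTO sp VALUES ('P9', 'S1', 50)")   # no part P9
delete_referenced = allowed("DELETE FROM part WHERE p_no = 'P1'")   # SP still refers to P1
valid_insert = allowed("INSERT INTO sp VALUES ('P1', 'S1', 300)")
print(orphan_insert, delete_referenced, valid_insert)  # False False True
```

Both the insert into the referencing table and the delete from the referenced table are checked, matching the slide's list of modifications.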
  75. Assertions and Triggers
      An assertion is a general predicate, expressed in relational algebra, calculus or a language like SQL, which must always hold in a database.
      Assert salary-constraint on emp salary >= 1000
      A trigger is a statement or a block of statements executed automatically by the system when an event (i.e., insertion, update or deletion) takes place on a table.
      Define trigger insert_record on delete of emp e (insert into emp_history values e.empno, e.name, e.deptno)
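The trigger on slide 75 is written in an informal notation. As a hedged translation into something runnable, the sketch below uses SQLite's `CREATE TRIGGER` through Python's `sqlite3` module. SQLite has no general assertion statement, so the salary rule is approximated here with a column CHECK constraint; that substitution, the `salary` column, and the sample rows are assumptions of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE emp (
        empno  INTEGER PRIMARY KEY,
        name   TEXT,
        deptno INTEGER,
        salary INTEGER CHECK (salary >= 1000)   -- stand-in for the assertion
    );
    CREATE TABLE emp_history (empno INTEGER, name TEXT, deptno INTEGER);

    -- Copy every deleted employee into emp_history.
    CREATE TRIGGER insert_record
    AFTER DELETE ON emp
    BEGIN
        INSERT INTO emp_history VALUES (OLD.empno, OLD.name, OLD.deptno);
    END;

    INSERT INTO emp VALUES (1, 'Deepak', 10, 1500);
    INSERT INTO emp VALUES (2, 'Teena', 20, 2500);
    DELETE FROM emp WHERE empno = 1;
    """
)

history = conn.execute("SELECT empno, name, deptno FROM emp_history").fetchall()
print(history)  # [(1, 'Deepak', 10)]

try:
    conn.execute("INSERT INTO emp VALUES (3, 'Monika', 10, 500)")
    low_salary_rejected = False
except sqlite3.IntegrityError:
    low_salary_rejected = True
print(low_salary_rejected)  # True
```

The application never writes to `emp_history` itself; the row appears because the system fires the trigger on the delete event.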
  76. Functional Dependencies
      Functional dependencies provide a formal mechanism to express constraints between attributes. They are a means of identifying how the values of certain attributes are determined by the values of other attributes. A functional dependency (FD) generalizes the concept of a key.
      Book (acc_no, yr_pub, title), where acc_no is the primary key
      Formal representation of constraints:
      acc_no → yr_pub
      acc_no → title
  77. Formal Notation of FD
      In general, if there are two attributes A and B and the FD A → B holds, then there can be no two tuples that have the same value of attribute A and different values of attribute B.
      If α and β are two sets of attributes, then the FD α → β holds on a relation r(R) if
      1. α, β ⊆ R, i.e. α and β are subsets of R
      2. for all tuples t1 and t2 in r, if t1[α] = t2[α] then t1[β] = t2[β]
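Condition 2 of the definition translates directly into a check over the tuples of a relation instance. The sketch below is illustrative only: the `fd_holds` helper and the three sample Book rows are invented for the example, and a check on one instance can refute an FD but cannot prove that it holds for the schema in general.

```python
def fd_holds(rows, lhs, rhs):
    """True when every pair of rows that agrees on lhs also agrees on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first row with this lhs value fixes the expected rhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical instance of Book (acc_no, yr_pub, title).
BOOK = [
    {"acc_no": 1, "yr_pub": 2001, "title": "Databases"},
    {"acc_no": 2, "yr_pub": 2001, "title": "Networks"},
    {"acc_no": 3, "yr_pub": 1999, "title": "Databases"},
]

print(fd_holds(BOOK, ["acc_no"], ["yr_pub"]))  # True
print(fd_holds(BOOK, ["acc_no"], ["title"]))   # True
print(fd_holds(BOOK, ["yr_pub"], ["title"]))   # False: 2001 maps to two titles
```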
  78. Closure of a Set of Functional Dependencies
  79. Armstrong’s Axioms
  80. Closure of a Set of F+
  81. Closure of Attribute Sets
  82. Canonical Cover
      To minimize the number of functional dependencies that need to be tested in case of an update, we may restrict F to a canonical cover Fc. A canonical cover for F is a set of dependencies such that F logically implies all dependencies in Fc and Fc logically implies all dependencies in F. A canonical cover Fc of a set of FDs F is a minimal cover of F in the sense that no proper subset of Fc also covers F.
  83. Example of Canonical Cover
      Consider a relation r(X, Y, Z) with the FDs F:
      1. X → YZ
      2. Y → Z
      3. X → Y
      4. XY → Z
      Here (4) is redundant because (1) states that X → Y and X → Z hold, so (4) can be derived from (1). Also (3) is redundant because (1) contains (3). Deleting these two we get
      1. X → YZ
      2. Y → Z
      which is a cover of F. Here again, since X → Y and Y → Z hold, X → Z holds by transitivity, so the Z in X → YZ is redundant. Deleting it we get the FDs
      X → Y
      Y → Z
      which is a canonical cover of F.
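The redundancy arguments in this example can be verified with attribute closure: an FD is redundant when the closure of its left side under the remaining FDs already contains its right side. This is a minimal sketch, not a full canonical-cover algorithm; the helper names are assumptions, and each attribute is written as a single letter so that a string such as "YZ" stands for the set {Y, Z}.

```python
def closure(attrs, fds):
    """Attribute closure: every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_redundant(fd, fds):
    """An FD is redundant if the other FDs in fds already imply it."""
    lhs, rhs = fd
    rest = [f for f in fds if f != fd]
    return set(rhs) <= closure(lhs, rest)


F = [("X", "YZ"), ("Y", "Z"), ("X", "Y"), ("XY", "Z")]   # FDs 1-4 from the slide
Fc = [("X", "Y"), ("Y", "Z")]                             # the cover reached on the slide

print(is_redundant(("XY", "Z"), F))   # True: FD 4 follows from the others
print(is_redundant(("X", "Y"), F))    # True: FD 3 is contained in FD 1
print(is_redundant(("Y", "Z"), F))    # False: nothing else determines Z from Y
print(sorted(closure("X", Fc)))       # ['X', 'Y', 'Z']
```

Since the closure of X under Fc still reaches Y and Z, Fc implies every FD in F, and neither FD in Fc is redundant, which agrees with the slide's result.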
  84. Relational Database Design
  85. Database Decomposition – 1
      Representation of Information
  86. Database Decomposition – 2
  87. Database Decomposition – 3
  88. Database Decomposition – 4
  89. Lossless-join Decomposition
  90. Example of lossy decomposition
      S_by
      s_name   s_addr   Item   Price
      A1       B1       C1     D1
      A1       B1       C2     D1
      A2       B2       C1     D2
      A2       B2       C3     D3
      A3       B1       C2     D2

      p1                          p2
      S_addr   Item   price       S_name   Item
      B1       C1     D1          A1       C1
      B1       C2     D1          A1       C2
      B2       C1     D2          A2       C1
      B2       C3     D3          A2       C3
      B1       C2     D2          A3       C2

      Natural join of p1 and p2
      S_name   S_addr   Item   Price
      A1       B1       C1     D1
      A1       B2       C1     D2
      A1       B1       C2     D1
      A1       B1       C2     D2
      A2       B1       C1     D1
      A2       B2       C1     D2
      A2       B2       C3     D3
      A3       B1       C2     D1
      A3       B1       C2     D2
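The loss of information in this example can be reproduced by projecting S_by onto p1 and p2 and joining the projections back on Item. The sketch below is illustrative; the tuples are read from the flattened slide tables, so the exact rows are an assumption of this reconstruction.

```python
# S_by (s_name, s_addr, item, price) as read from the slide.
S_BY = {
    ("A1", "B1", "C1", "D1"),
    ("A1", "B1", "C2", "D1"),
    ("A2", "B2", "C1", "D2"),
    ("A2", "B2", "C3", "D3"),
    ("A3", "B1", "C2", "D2"),
}

# Projections: p1 (s_addr, item, price) and p2 (s_name, item).
p1 = {(addr, item, price) for _name, addr, item, price in S_BY}
p2 = {(name, item) for name, _addr, item, _price in S_BY}

# Natural join of p1 and p2 on the shared attribute item.
joined = {
    (name, addr, item, price)
    for name, p2_item in p2
    for addr, item, price in p1
    if item == p2_item
}

spurious = joined - S_BY
print(len(S_BY), len(joined), len(spurious))  # 5 9 4
print(sorted(spurious))
```

Every original tuple is still in the join, but four extra tuples appear, so the original relation can no longer be told apart from the spurious rows: the decomposition is lossy.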
  91. Dependency Preservation
  92. Normalization
      Normalization is a process of removing redundancy using functional dependencies. To reduce redundancy it is necessary to decompose a relation into a number of smaller relations. There are several normal forms:
      • First Normal Form (1NF)
      • Second Normal Form (2NF)
      • Third Normal Form (3NF)
      • Boyce-Codd Normal Form (BCNF)
  93. First Normal Form (1NF)
      This normal form says that all attributes are simple. An attribute is said to be simple if it does not contain any subparts. An attribute that contains subparts is called a complex attribute.
      Examples: Name (F_name, L_name), C_addr (City, State, Zip)
  94. 94. Second Normal Form (2NF) A relation is said to be in 2NF if it is in 1NF and All non-prime attributes are fully functionally dependent on candidate key Consider a relation savings_deposit having the following structure:- Saving_deposit (name, addr, acc_no, amt ) With the following FDs : name addr name, acc_no amt Here [name, acc_no ] is the candidate key and addr and amt are the non prime attributes. Among the non-prime attributes amt depends on [name, acc_no ] whereas addr depends on name only. Note that due to FD name addr every tuple with the same name will contain the same Address causing redundancy. This redundancy arises because a non-prime attribute like address is dependent on an attribute surendersingh@rediffmail.com Which is not a candidate key.
  95. 95. Solution
       We can remove this redundancy by splitting the original relation into the following two relations:
         Sav_sch1 (name, addr)
         Sav_sch2 (name, acc_no, amt)
       Both relations are now in 2NF. In the first relation, name is the primary key and the only non-prime attribute is addr, which depends on name. In the second relation, the only non-prime attribute, amt, depends on both name and acc_no. Note that this decomposition is also lossless-join and dependency preserving.
       Consider also the relation
         Courses (Course_no, title, loc, time)
       with the FDs:
         Course_no → title
         Course_no, time → loc
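The 2NF test on the Saving_deposit example can be sketched with an attribute-closure routine. This is an illustrative sketch under one simplifying assumption: the relation has a single candidate key, so every attribute outside that key is non-prime. The function names `closure` and `partial_dependencies` are hypothetical helpers, not from the slides.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def partial_dependencies(key, fds):
    """Map each proper subset of `key` to the non-key attributes it determines.

    Assumes `key` is the only candidate key of the relation.
    """
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            dependent = closure(subset, fds) - set(key)
            if dependent:
                found[subset] = dependent
    return found

# Saving_deposit (name, addr, acc_no, amt) with its two FDs.
FDS = [({"name"}, {"addr"}), ({"name", "acc_no"}, {"amt"})]
KEY = {"name", "acc_no"}
print(partial_dependencies(KEY, FDS))

# After the split, neither relation has a partial dependency.
sch1 = partial_dependencies({"name"}, [({"name"}, {"addr"})])
sch2 = partial_dependencies({"name", "acc_no"}, [({"name", "acc_no"}, {"amt"})])
print(sch1, sch2)
```

A non-empty result means some non-prime attribute depends on only part of the key, so the relation is not in 2NF; an empty result for both Sav_sch1 and Sav_sch2 agrees with the slide.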
  96. 96. Third Normal Form (3NF)
       A relation is said to be in 3NF if it is in 2NF and its non-prime attributes are not dependent on each other.
       Consider the relation
         s_by (s_name, item, price, gift_item)
       with the FDs:
         s_name, item → price
         price → gift_item
       Here all non-prime attributes are fully functionally dependent on the candidate key, but the non-prime attribute gift_item is also functionally dependent on the non-prime attribute price. This creates redundancy, because for every price value there is a fixed gift item.
       We therefore impose the additional restriction that no non-prime attribute can be functionally dependent on another non-prime attribute.
  97. 97. Solution
       We decompose the relation
         s_by (s_name, item, price, gift_item)
       into
         s_by_1 (s_name, item, price)
         s_by_2 (price, gift_item)
       Now we have a lossless-join and dependency-preserving decomposition.
       An alternative yet equivalent definition of 3NF is: for every FD α → β on R, at least one of the following conditions holds:
       • β ⊆ α (trivial dependency)
       • α → R (α is a superkey)
       • each attribute in β − α is contained in a candidate key of R
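The claim that this decomposition is lossless-join can be checked with the standard test for a two-way split: the decomposition of R into R1 and R2 is lossless if the common attributes R1 ∩ R2 functionally determine all of R1 or all of R2. The sketch below assumes the only constraints are the FDs listed on the slide; `closure` and `is_lossless_binary` are names introduced here for illustration.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_lossless_binary(r1, r2, fds):
    """True if the shared attributes determine every attribute of r1 or of r2."""
    determined = closure(r1 & r2, fds)
    return r1 <= determined or r2 <= determined

# s_by (s_name, item, price, gift_item) and its FDs.
S_BY_FDS = [({"s_name", "item"}, {"price"}), ({"price"}, {"gift_item"})]
s_by_1 = {"s_name", "item", "price"}
s_by_2 = {"price", "gift_item"}
print(is_lossless_binary(s_by_1, s_by_2, S_BY_FDS))
```

Here the shared attribute is price, and price → gift_item covers all of s_by_2, so the test succeeds.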
  98. 98. Boyce-Codd Normal Form (BCNF)
  99. 99. More on BCNF
  100. 100. Comparison of BCNF and 3NF
  101. 101. Comparison of BCNF and 3NF - 2
  102. 102. Normalization using Multivalued Dependencies
  103. 103. Multivalued Dependencies -2
  104. 104. Rules
  105. 105. More Rules
  106. 106. Fourth Normal Form (4NF)
  107. 107. Example
  108. 108. Normalization using Join Dependencies
       Let R be a relation schema and R1, R2, …, Rn be a decomposition of R. The join dependency *(R1, R2, …, Rn) is used to restrict the set of legal relations to those for which R1, R2, …, Rn is a lossless-join decomposition of R.
       Formally, if R = R1 ∪ R2 ∪ … ∪ Rn, we say that a relation r(R) satisfies the join dependency *(R1, R2, …, Rn) if r is equal to the natural join of its projections on R1, R2, …, Rn.
  109. 109. Fifth Normal Form (5NF): Project-Join Normal Form
       Project-join normal form (PJNF) is defined in a manner similar to BCNF and 4NF, except that join dependencies are used.
       A relation schema R is in PJNF with respect to a set D of functional, multivalued and join dependencies if, for all join dependencies in D+ of the form *(R1, R2, …, Rn), where each Ri ⊆ R and R = R1 ∪ R2 ∪ … ∪ Rn, at least one of the following holds:
       • *(R1, R2, …, Rn) is a trivial join dependency.
       • Every Ri is a superkey for R.
       Every schema in PJNF is also in 4NF. Thus, in general, we may not be able to find a dependency-preserving decomposition into PJNF for a given schema.
  110. 110. Storage and File Structure Hierarchy of Storage
  111. 111. Description
  112. 112. Description - 2
  113. 113. File Organization
  114. 114. Fixed Length Record -1
  115. 115. Fixed Length Record -2
  116. 116. Variable-length Records
  117. 117. Fixed-length representation
  118. 118. Organization of Records in files
  119. 119. Concurrency Control and Recovery
  120. 120. Transactions
       • Concurrent execution of user programs is essential for good DBMS performance.
       • Because disk accesses are frequent and relatively slow, it is important to keep the CPU busy by working on several user programs concurrently.
       • A user's program may carry out many operations on the data retrieved from the database, but the DBMS is only concerned with what data is read from or written to the database.
       • A transaction is the DBMS's abstract view of a user program: a sequence of reads and writes.
       A transaction is a unit of program execution that accesses and possibly updates various data items. A collection of operations that forms a single logical unit of work is called a transaction.
       A database system must ensure proper execution of transactions despite failures. To ensure the integrity of the data, the database system must maintain the following properties of transactions: atomicity, consistency, isolation and durability.
  121. 121. States of Transactions
       (State diagram; states shown: Active, Partially Committed, Failed, Aborted)
  122. 122. Concurrency in a DBMS
       • Users submit transactions, and can think of each transaction as executing by itself.
       • Concurrency is achieved by the DBMS, which interleaves actions (reads/writes of DB objects) of various transactions.
       • Each transaction must leave the database in a consistent state if the DB is consistent when the transaction begins.
       • The DBMS will enforce some integrity constraints (ICs), depending on the ICs declared in CREATE TABLE statements.
       • Beyond this, the DBMS does not really understand the semantics of the data (e.g., it does not understand how the interest on a bank account is computed).
       • Issues: the effect of interleaving transactions, and crashes.
  123. 123. Example
       • Consider two transactions (Xacts):
         T1: BEGIN A=A+100, B=B-100 END
         T2: BEGIN A=1.06*A, B=1.06*B END
       • Intuitively, the first transaction is transferring $100 from B's account to A's account. The second is crediting both accounts with a 6% interest payment.
       • There is no guarantee that T1 will execute before T2 or vice versa, if both are submitted together. However, the net effect must be equivalent to these two transactions running serially in some order.
  124. 124. Example (Contd.)
       • Consider a possible interleaving (schedule), with time running left to right:
         T1: A=A+100              B=B-100
         T2:            A=1.06*A             B=1.06*B
       • This is OK. But what about:
         T1: A=A+100                          B=B-100
         T2:            A=1.06*A, B=1.06*B
       • The DBMS's view of the second schedule:
         T1: R(A), W(A)                            R(B), W(B)
         T2:              R(A), W(A), R(B), W(B)
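A small simulation makes the difference between the two interleavings concrete. This sketch assumes starting balances of 1000 in both accounts and uses integer arithmetic (`* 106 // 100`) so the 6% interest is exact; it also assumes the first interleaving runs as T1(A), T2(A), T1(B), T2(B) and the second as T1(A), T2(A), T2(B), T1(B). The function names are hypothetical.

```python
# Each step is one update by T1 or T2 on the shared database dict.
def t1_credit_a(db):
    db["A"] += 100

def t1_debit_b(db):
    db["B"] -= 100

def t2_interest_a(db):
    db["A"] = db["A"] * 106 // 100

def t2_interest_b(db):
    db["B"] = db["B"] * 106 // 100

def run(schedule, a=1000, b=1000):
    """Apply the steps in order and return the final (A, B) balances."""
    db = {"A": a, "B": b}
    for step in schedule:
        step(db)
    return db["A"], db["B"]

serial_t1_t2 = run([t1_credit_a, t1_debit_b, t2_interest_a, t2_interest_b])
serial_t2_t1 = run([t2_interest_a, t2_interest_b, t1_credit_a, t1_debit_b])
ok_interleaving = run([t1_credit_a, t2_interest_a, t1_debit_b, t2_interest_b])
bad_interleaving = run([t1_credit_a, t2_interest_a, t2_interest_b, t1_debit_b])

print(serial_t1_t2, serial_t2_t1)
print(ok_interleaving, bad_interleaving)
```

The first interleaving ends in the same state as running T1 and then T2. The second matches neither serial order: account A receives interest on the transferred 100 while account B also receives interest on it, so the bank pays interest on that amount twice.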
  125. 125. Example (Contd.)
       • The DBMS must not allow schedules like this (time running left to right):
         T1: R(A), W(A)                            R(B), W(B)
         T2:              R(A), W(A), R(B), W(B)
       • Dependency graph: one node per Xact; an edge from Ti to Tj if Tj reads or writes an object last written by Ti. For this schedule:
         T1 --(A)--> T2
         T2 --(B)--> T1
       • The cycle in the graph reveals the problem. The output of T1 depends on T2, and vice versa.
  126. 126. Scheduling Transactions
       • Equivalent schedules: for any database state, the effect (on the set of objects in the database) of executing the first schedule is identical to the effect of executing the second schedule.
       • Serializable schedule: a schedule that is equivalent to some serial execution of the transactions.
       • If the dependency graph of a schedule is acyclic, the schedule is called conflict serializable. Such a schedule is equivalent to a serial schedule.
       • This is the condition that is typically enforced in a DBMS (although it is not necessary for serializability).
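The acyclicity test can be sketched as follows. Note one assumption: this sketch adds an edge for every pair of conflicting operations (same object, different transactions, at least one write), which is the usual precedence-graph definition and slightly broader than the "last written by" wording used on the slides. The function names and the tuple encoding of a schedule are illustrative choices.

```python
def precedence_graph(schedule):
    """Build edges Ti -> Tj from a schedule of (txn, action, obj) in time order.

    `action` is "R" or "W". An edge is added when an operation of Ti
    precedes a conflicting operation of Tj.
    """
    edges = set()
    for i, (ti, ai, oi) in enumerate(schedule):
        for tj, aj, oj in schedule[i + 1:]:
            if ti != tj and oi == oj and "W" in (ai, aj):
                edges.add((ti, tj))
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph given as edge pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        if any(visit(nxt) for nxt in graph[node]):
            return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(visit(node) for node in graph)

# The schedule the DBMS must not allow: T1 on A, then T2 on A and B, then T1 on B.
bad = [("T1", "R", "A"), ("T1", "W", "A"),
       ("T2", "R", "A"), ("T2", "W", "A"), ("T2", "R", "B"), ("T2", "W", "B"),
       ("T1", "R", "B"), ("T1", "W", "B")]
# A serial schedule: all of T1, then all of T2.
serial = [op for op in bad if op[0] == "T1"] + [op for op in bad if op[0] == "T2"]

print(precedence_graph(bad), has_cycle(precedence_graph(bad)))
print(precedence_graph(serial), has_cycle(precedence_graph(serial)))
```

The bad schedule yields edges in both directions between T1 and T2, so it is not conflict serializable; the serial schedule yields a single edge and no cycle.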
  127. 127. Detection of Serializability
       One technique of concurrency control is to detect, prior to execution, whether a schedule is valid or not. The task of understanding a schedule is simplified by considering only the sequence of read and write operations in each transaction.
       Read-write sequence of a non-serializable schedule (operations of T1 and T2 interleaved):
         Read(X), Read(X), Write(X), Write(X), Read(Y), Write(Y), Read(Y), Write(Y)
  128. 128. Serializable Concurrency
       A serializable concurrent schedule (operations of T1 and T2 interleaved):
         Read(X), Write(X), Read(X), Write(X), Read(Y), Write(Y), Read(Y), Write(Y)
       To generalize the idea of conflict, consider the four possibilities that can arise between two consecutive instructions in a schedule, issued by two different transactions T1 and T2:
       1. T1 : Read(X) followed by T2 : Write(X)
       2. T1 : Read(X) followed by T2 : Read(X)
       3. T1 : Write(X) followed by T2 : Read(X)
       4. T1 : Write(X) followed by T2 : Write(X)
       Two instructions are said to be in conflict if they cannot be swapped without risking a loss of consistency. Of the four cases above, every pair except case 2 (two reads) is in conflict.
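The four cases can be captured in a single predicate. This is a minimal sketch; the tuple layout (transaction, action, item) and the name `in_conflict` are assumptions made for illustration.

```python
def in_conflict(first, second):
    """Two operations conflict if they come from different transactions,
    touch the same data item, and at least one of them is a write."""
    txn1, action1, item1 = first
    txn2, action2, item2 = second
    return txn1 != txn2 and item1 == item2 and "Write" in (action1, action2)

CASES = [
    (("T1", "Read", "X"), ("T2", "Write", "X")),   # case 1
    (("T1", "Read", "X"), ("T2", "Read", "X")),    # case 2
    (("T1", "Write", "X"), ("T2", "Read", "X")),   # case 3
    (("T1", "Write", "X"), ("T2", "Write", "X")),  # case 4
]
results = [in_conflict(a, b) for a, b in CASES]
print(results)  # [True, False, True, True]
```

Operations on different items, or two operations of the same transaction, never conflict under this definition, which is why they can be reordered freely when looking for an equivalent serial schedule.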
  129. 129. Deadlock Condition
       T1:
         UPDATE account
         SET balance = balance * 0.1
         WHERE acc_no = 'FC821'

         UPDATE account
         SET age = 30
         WHERE acc_no = 'FC523'
       T2:
         UPDATE account
         SET balance = balance * 0.1
         WHERE acc_no = 'FC523'

         UPDATE account
         SET age = 38
         WHERE acc_no = 'FC821'
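The deadlock in this example can be detected with a wait-for graph. The sketch below assumes row-level exclusive locks that are held until the transaction ends, and that each transaction has completed its first UPDATE and is now attempting its second. It also simplifies by letting each transaction wait for at most one other. The names `holders`, `requests`, `waits_for` and `find_deadlock` are hypothetical.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle, or None.

    `waits_for` maps a transaction to the one transaction it is waiting on.
    """
    for start in waits_for:
        path = [start]
        current = waits_for.get(start)
        while current is not None and current not in path:
            path.append(current)
            current = waits_for.get(current)
        if current is not None:
            # `current` was reached twice, so the tail of the path is a cycle.
            return path[path.index(current):]
    return None

# Row locks taken by each transaction's first UPDATE (assumed).
holders = {"FC821": "T1", "FC523": "T2"}
# Row each transaction's second UPDATE needs next.
requests = {"T1": "FC523", "T2": "FC821"}

waits_for = {
    txn: holders[row]
    for txn, row in requests.items()
    if row in holders and holders[row] != txn
}
print(waits_for, find_deadlock(waits_for))
```

T1 waits for T2 and T2 waits for T1, so neither can proceed; a DBMS that finds such a cycle typically aborts one of the transactions.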
  130. 130. Lock-Based Techniques
       In this technique the system neither detects inconsistency nor takes any corrective action. Instead, the DBMS provides the user with a set of operations which, when used properly, ensure that concurrent execution will not violate consistency.
       Functions are provided for transactions to lock and unlock data items. In the simplest case a data item X can be locked by a transaction T1 in two modes:
       Shared mode: if T1 locks X in shared mode, then until T1 unlocks X no other transaction T2 can write into X. However, T2 can still read the value of X while T1 holds the shared lock.
       Exclusive mode: if T1 locks X in exclusive mode, then until T1 unlocks X no other transaction T2 can read or write X.
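The two lock modes amount to a small compatibility matrix. The following sketch of a lock table is illustrative only: it refuses an incompatible request by returning False instead of making the caller wait, and the class and method names are invented for this example.

```python
# (mode already held, mode requested) -> can both be held at once?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}

class LockTable:
    def __init__(self):
        self.held = {}  # data item -> list of (transaction, mode)

    def try_lock(self, txn, item, mode):
        """Grant the lock if it is compatible with locks held by other transactions."""
        for other, other_mode in self.held.get(item, []):
            if other != txn and not COMPATIBLE[(other_mode, mode)]:
                return False
        self.held.setdefault(item, []).append((txn, mode))
        return True

    def unlock(self, txn, item):
        """Release every lock `txn` holds on `item`."""
        self.held[item] = [(t, m) for t, m in self.held.get(item, []) if t != txn]

locks = LockTable()
shared_1 = locks.try_lock("T1", "X", "S")     # granted
shared_2 = locks.try_lock("T2", "X", "S")     # granted: shared locks coexist
exclusive = locks.try_lock("T3", "X", "X")    # refused while readers hold X
locks.unlock("T1", "X")
locks.unlock("T2", "X")
exclusive_retry = locks.try_lock("T3", "X", "X")  # granted once X is free
print(shared_1, shared_2, exclusive, exclusive_retry)
```

Only the shared/shared combination is compatible, which matches the rule that readers may share an item but a writer needs it to itself.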
  131. 131. Example
         T1                 T2
         Lock-X(P)
         Read(P,p)
         p = p-1
         Write(P,p)
         Unlock(P)
                            Lock-S(Q)
                            Read(Q,q)
                            Unlock(Q)
                            Lock-S(P)
                            Read(P,p)
                            Unlock(P)
                            display(p)
                            display(p)
         Lock-X(Q)
         Read(Q,q)
         q = q+1
         Write(Q,q)
         Unlock(Q)
  132. 132. Two-Phase Locking
       Phase I (acquiring phase): during this phase a transaction may lock data items but may not unlock any data item.
       Phase II (releasing phase): during this phase a transaction may unlock data items locked earlier, but no new locks may be acquired.
       In two-phase locking, phase I must always precede phase II. This ensures that all schedules are automatically conflict serializable.
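Whether a transaction obeys the two-phase rule can be checked by scanning its operations once. In this sketch the operation strings follow the Lock-X / Lock-S / Unlock notation of the locking example, and the sequence `t1_ops` assumes that the transaction updating P and Q releases P before locking Q; `is_two_phase` is a hypothetical helper.

```python
def is_two_phase(operations):
    """True if no lock is acquired after the first unlock."""
    releasing = False
    for op in operations:
        name = op.lower()
        if name.startswith("unlock"):
            releasing = True          # phase II has begun
        elif name.startswith("lock") and releasing:
            return False              # a new lock after an unlock breaks 2PL
    return True

# Assumed order: P is unlocked before Q is locked.
t1_ops = ["Lock-X(P)", "Read(P,p)", "Write(P,p)", "Unlock(P)",
          "Lock-X(Q)", "Read(Q,q)", "Write(Q,q)", "Unlock(Q)"]

# Same work, rearranged so that all locks are taken before any is released.
t1_two_phase = ["Lock-X(P)", "Read(P,p)", "Write(P,p)",
                "Lock-X(Q)", "Read(Q,q)", "Write(Q,q)",
                "Unlock(P)", "Unlock(Q)"]

print(is_two_phase(t1_ops), is_two_phase(t1_two_phase))
```

Releasing P early is what lets another transaction read P and Q between the two updates; holding both locks until the end of the work closes that window.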
  133. 133. Enforcing (Conflict) Serializability
       • Two-phase Locking (2PL) Protocol:
         - Each Xact must obtain an S (shared) lock on an object before reading, and an X (exclusive) lock on an object before writing.
         - Once an Xact releases any lock, it cannot obtain new locks.
         - If an Xact holds an X lock on an object, no other Xact can get a lock (S or X) on that object.
       • 2PL allows only conflict-serializable schedules.
         - Potential problem of deadlocks: we could have a cycle of Xacts T1, T2, ..., Tn, with each Ti waiting for its predecessor to release some lock that it needs.
         - Deadlocks are dealt with by killing one of the Xacts and releasing its locks.
  134. 134. Atomicity of Transactions
       • A transaction might commit after completing all its actions, or it could abort (or be aborted by the DBMS) after executing some actions.
       • A very important property guaranteed by the DBMS for all transactions is that they are atomic. That is, a user can think of an Xact as always executing all its actions in one step, or not executing any actions at all.
         - The DBMS logs all actions so that it can undo the actions of aborted transactions.
       • This ensures that if each Xact preserves consistency, every serializable schedule preserves consistency.
  135. 135. Aborting a Transaction
       • If a transaction Ti is aborted, all its actions have to be undone. Not only that, if Tj reads an object last written by Ti, Tj must be aborted as well!
       • Most systems avoid such cascading aborts by releasing a transaction's locks only at commit time.
         - If Ti writes an object, Tj can read this only after Ti commits.
       • In order to undo the actions of an aborted transaction, the DBMS maintains a log in which every write is recorded. This mechanism is also used to recover from system crashes: all Xacts active at the time of the crash are aborted when the system comes back up.
  136. 136. The Log
       • The following actions are recorded in the log:
         - Ti writes an object: the old value and the new value. The log record must go to disk before the changed page!
         - Ti commits/aborts: a log record indicating this action.
       • Log records are chained together by Xact id, so it is easy to undo a specific Xact.
       • The log is often duplexed and archived on stable storage.
       • All log-related activities (and in fact, all activities such as lock/unlock, dealing with deadlocks etc.) are handled transparently by the DBMS.
  137. 137. The Log - 2
       e.g. X = 1000, Y = 2000
       T: Read(X, xi)
          xi ← xi - 500
          Write(X, xi)
          Read(Y, yi)
          yi ← yi + 500
          Write(Y, yi)
       Log file:
          <T starts>
          <T, X, 1000, 500>
          <T, Y, 2000, 2500>
          <T commits>
       Each update record has the form <transaction name, data item name, old value, new value>.
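Because each update record carries both the old and the new value, the same log supports undo and redo. The sketch below is a deliberately simplified recovery routine, not the algorithm of any particular DBMS: log records are modelled as Python tuples, and the function `recover` and the record tags "start", "write" and "commit" are assumptions made for illustration.

```python
def recover(log, database):
    """Redo writes of committed transactions, then undo the rest.

    `log` is a list of tuples in the order they were written:
    ("start", T), ("write", T, item, old, new), ("commit", T).
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    # Redo pass: scan forward, reapplying new values of committed transactions.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, item, _, new = rec
            database[item] = new
    # Undo pass: scan backward, restoring old values of uncommitted transactions.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _, item, old, _ = rec
            database[item] = old
    return database

full_log = [("start", "T"),
            ("write", "T", "X", 1000, 500),
            ("write", "T", "Y", 2000, 2500),
            ("commit", "T")]

# Crash after commit, but the new value of Y had not reached the disk yet.
after_commit = recover(full_log, {"X": 500, "Y": 2000})
# Crash before the commit record was written: T must be rolled back.
before_commit = recover(full_log[:-1], {"X": 500, "Y": 2000})
print(after_commit, before_commit)
```

In the first case the transfer is completed from the log; in the second the partial update to X is reversed, so the transaction appears never to have run.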
  138. 138. Checkpoints
       At the time of recovery the entire log needs to be searched to determine which transactions need to be redone and which need to be undone. The problems with this approach are:
       1. The search is time-consuming.
       2. Most of the transactions that would be redone have already written their updates to the database, so redoing them is wasted work.
       To solve this problem, checkpoints are taken at different points. A checkpoint indicates that the data before this point has already been written to the database. Taking a checkpoint consists of the following sequence of actions:
       - Output all log records currently residing in main memory to stable storage.
       - Output all modified buffer blocks to secondary storage.
       - Output a log record <checkpoint>.
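The effect of a checkpoint on recovery can be sketched as a classification of transactions. This is an illustrative sketch only: log records are modelled as tuples, `classify` is a hypothetical helper, and for brevity it reads the whole list, whereas a real system would avoid reading the entire log, for instance by recording the active transactions in the checkpoint record.

```python
def classify(log):
    """Return (redo, undo): sets of transaction names needing each action.

    Transactions that committed before the last checkpoint need neither,
    because their updates are already in the database.
    """
    checkpoint_at = max(
        (i for i, rec in enumerate(log) if rec[0] == "checkpoint"), default=-1
    )
    started, committed_before, committed_after = set(), set(), set()
    for i, rec in enumerate(log):
        if rec[0] == "start":
            started.add(rec[1])
        elif rec[0] == "commit":
            if i < checkpoint_at:
                committed_before.add(rec[1])
            else:
                committed_after.add(rec[1])
    undo = started - committed_before - committed_after
    return committed_after, undo

# Hypothetical log: T1 finishes before the checkpoint, T2 after it,
# and T3 is still running when the crash happens.
example_log = [
    ("start", "T1"), ("write", "T1", "X", 1000, 500), ("commit", "T1"),
    ("checkpoint",),
    ("start", "T2"), ("write", "T2", "Y", 2000, 2500), ("commit", "T2"),
    ("start", "T3"), ("write", "T3", "X", 500, 400),
]
redo, undo = classify(example_log)
print(redo, undo)
```

T1 is skipped entirely, which is the saving the checkpoint buys; without a checkpoint record it would land in the redo set along with T2.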
  139. 139. Recovering From a Crash
       • There are 3 phases in the ARIES recovery algorithm:
         - Analysis: scan the log forward (from the most recent checkpoint) to identify all Xacts that were active, and all dirty pages in the buffer pool, at the time of the crash.
         - Redo: redo all updates to dirty pages in the buffer pool, as needed, to ensure that all logged updates are in fact carried out and written to disk.
         - Undo: the writes of all Xacts that were active at the crash are undone (by restoring the before value of the update, which is in the log record for the update), working backwards in the log. (Some care must be taken to handle the case of a crash occurring during the recovery process!)
       Data can also be lost through failure of nonvolatile storage such as the disk. To protect against disk failure, the entire contents of the database are periodically dumped to backup (or even stable) storage such as magnetic tape. When a failure occurs, the most recent dump is used to restore the database to a previous consistent state. The log is then used to redo all the transactions that have committed since the last dump. To take a dump, the following steps are performed:
       • Output all log records currently residing in main memory onto stable storage.
       • Output all buffer blocks onto the disk.
       • Copy the contents of the database to stable storage.
       • Output a log record <dump>.
  140. 140. Summary
       • Concurrency control and recovery are among the most important functions provided by a DBMS.
       • Users need not worry about concurrency.
         - The system automatically inserts lock/unlock requests and schedules actions of different Xacts in such a way as to ensure that the resulting execution is equivalent to executing the Xacts one after the other in some order.
       • Write-ahead logging (WAL) is used to undo the actions of aborted transactions and to restore the system to a consistent state after a crash.
         - Consistent state: only the effects of committed Xacts are seen.
  141. 141. Query Processing/Optimization
  142. 142. Rules: Optimization using Algebraic Manipulation
       Any algebraic manipulation approach to query optimization uses a set of rules, which may be enumerated as follows:
       • Perform selection as early as possible, in order to reduce the number of tuples to be processed subsequently.
       • Projections of projections should be combined, if possible, in order to avoid repeated scanning of tuples.
       • Projection over indexed attributes should be done earlier, and projection over non-indexed attributes later.
       • Intermediate relations produced in separate processing sequences should be shared whenever possible.
       • If possible, attributes that control a join operation should be sorted beforehand.
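The first rule, performing selection early, can be illustrated by counting intermediate tuples. The data and the names `EMPLOYEES`, `DEPARTMENTS`, `join_then_select` and `select_then_join` below are hypothetical and chosen only to show the effect; relations are modelled as lists of tuples.

```python
# Hypothetical relations: employee(emp_id, name, dept_id), department(dept_id, dept_name, city).
EMPLOYEES = [(1, "ABC", 10), (2, "DEF", 20), (3, "GHI", 10), (4, "JKL", 30)]
DEPARTMENTS = [(10, "Sales", "Bhiwani"), (20, "Accounts", "Delhi"), (30, "Stores", "Rohtak")]

def join_then_select(emps, depts, city):
    """Join everything first, then filter by city."""
    joined = [(e, d) for e in emps for d in depts if e[2] == d[0]]
    result = [(e[1], d[1]) for e, d in joined if d[2] == city]
    return result, len(joined)

def select_then_join(emps, depts, city):
    """Filter departments by city first, then join the smaller relation."""
    local = [d for d in depts if d[2] == city]
    joined = [(e, d) for e in emps for d in local if e[2] == d[0]]
    result = [(e[1], d[1]) for e, d in joined]
    return result, len(joined)

late_result, late_size = join_then_select(EMPLOYEES, DEPARTMENTS, "Bhiwani")
early_result, early_size = select_then_join(EMPLOYEES, DEPARTMENTS, "Bhiwani")
print(late_result, late_size)
print(early_result, early_size)
```

Both plans return the same answer, but the plan that selects first builds a smaller intermediate relation, which is the saving the rule is after.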
  143. 143. Example
  144. 144. Example contd.
  145. 145. Projection Operation
  146. 146. Natural Join Operation
  147. 147. Natural Join Operation - 2