Sql ch 9 - data integrity
 


    • SQL – Ch 10 – Data Integrity

10. DATA INTEGRITY

1. What does the term data integrity mean?

The term data integrity refers to the correctness and completeness of the data in a database. When the contents of a database are modified with the INSERT, DELETE, or UPDATE statements, the integrity of the stored data can be lost in many different ways. For example:

  • Invalid data may be added to the database, e.g., an order for a nonexistent product.
  • Existing data may be modified to an incorrect value, e.g., reassigning a salesperson to a nonexistent office.
  • Changes to the database may be lost due to a system error or power failure.
  • Changes may be partially applied, such as adding an order for a product without adjusting the quantity available for sale.

To preserve the consistency and correctness of its data, an RDBMS imposes one or more data integrity constraints. These constraints restrict the data values that can be inserted into the database or created by a database update. The different types of data integrity constraints are:

  • Required data
  • Validity checking
  • Entity integrity
  • Referential integrity
  • Other data relationships
  • Business rules
  • Consistency

2. Explain the different data integrity constraints.

The term data integrity refers to the correctness and completeness of the data in a database. When the contents of a database are modified with the INSERT, DELETE, or UPDATE statements, the integrity of the stored data can be lost in many different ways. The different types of data integrity constraints are:

Required data. Some columns in a database must contain a valid data value in every row; they are not allowed to contain missing or NULL values. E.g., every order must have an associated customer who placed the order. Therefore, the CUST column in the ORDERS table is a required column. The DBMS can be asked to prevent NULL values in this column.

Validity checking.
Every column in a database has a domain, or a set of data values that are permitted for that column. E.g., order numbers begin at 100,001, so the domain of the ORDER_NUM column is positive integers greater than 100,000. Similarly, employee numbers in the EMPL_NUM column must fall within the numeric range of 101 to 999. The DBMS can be asked to prevent other data values in these columns.

Entity integrity. The primary key of a table must contain a unique value in each row, which is different from the values in all other rows. E.g., each row of the PRODUCTS table has a unique set of values in its MFR_ID and PRODUCT_ID columns. Duplicate values are illegal. The DBMS can be asked to enforce this unique values constraint.

Referential integrity. A foreign key in a relational database links each row in the child table containing the foreign key to the row of the parent table containing the matching primary key value. In the sample database,

Prof. Mukesh N. Tekwani [9869 488 356]
Other data relationships. Other constraints may be enforced on the database. For example, the quota target for each office must not exceed the total of the quota targets for the salespeople in that office. The DBMS can be asked to check modifications to the office and salesperson quota targets to make sure that their values are constrained in this way.

Business rules. Updates to a database may be prevented by business rules governing the real-world transactions that are represented by the updates. E.g., there may be a business rule that forbids accepting an order for which there is inadequate product inventory. The DBMS can be asked to check each new row added to the ORDERS table to make sure that the value in its QTY column does not violate this business rule.

Consistency. Some transactions can cause multiple updates to a database. That is, if data in one table is updated, there should be a corresponding change in other linked tables. For example, accepting a customer order may involve adding a row to the ORDERS table, increasing the SALES column in the SALESREPS table for the person who took the order, and increasing the SALES column in the OFFICES table for the office where that salesperson is assigned. The INSERT and both UPDATEs must all take place in order for the database to remain in a consistent, correct state. The DBMS can be asked to enforce this type of consistency rule or to support applications that implement such rules.

3. What are the techniques of simple validity checking?

SQL provides a data validation capability by allowing us to create a rule that determines what data can be entered into a particular column. SQL checks the rule each time an INSERT or UPDATE statement is attempted for the table that contains the column.
Ex 1: To create a rule for the QUOTA column in the SALESREPS table:

    CREATE RULE QUOTA_LIMIT
        AS @VALUE BETWEEN 0.00 AND 500000.00

VALIDITY CHECKING TECHNIQUES

There are two techniques for simple validity checking: column check constraints and domains.

Column Check Constraints: A check constraint is a search condition, which produces a true/false value. When a check constraint is specified for a column, the DBMS automatically checks the value of that column each time a new row is inserted or a row is updated, to ensure that the search condition is true. If the search condition is not true, the INSERT or UPDATE statement fails. A column check constraint is given as part of the column definition within the CREATE TABLE statement.

Ex:

    CREATE TABLE SALESREPS
        (EMPL_NUM INTEGER NOT NULL
            CHECK (EMPL_NUM BETWEEN 101 AND 199),
         AGE INTEGER
            CHECK (AGE >= 21),
         .
         QUOTA MONEY
            CHECK (QUOTA >= 0.0)
         .

Consider the constraint CHECK (EMPL_NUM BETWEEN 101 AND 199). This constraint requires that valid employee numbers be three-digit numbers between 101 and 199. The constraint on the AGE column, CHECK (AGE >= 21), requires every salesperson to be at least 21 years old. The third constraint, on the QUOTA column, is CHECK (QUOTA >= 0.0), which rejects negative quota values.

mukeshtekwani@hotmail.com
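The column check constraints described above can be tried out with a minimal sketch. It assumes Python's built-in sqlite3 module and SQLite's SQL dialect, which are not what these notes use; in particular, NUMERIC stands in for the MONEY type, and the helper try_insert and the sample values are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the DBMS in the notes.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE SALESREPS (
        EMPL_NUM INTEGER NOT NULL CHECK (EMPL_NUM BETWEEN 101 AND 199),
        AGE      INTEGER CHECK (AGE >= 21),
        QUOTA    NUMERIC CHECK (QUOTA >= 0.0)  -- NUMERIC replaces MONEY here
    )
""")

def try_insert(empl_num, age, quota):
    """Hypothetical helper: True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO SALESREPS (EMPL_NUM, AGE, QUOTA) VALUES (?, ?, ?)",
            (empl_num, age, quota),
        )
        return True
    except sqlite3.IntegrityError:
        # SQLite reports NOT NULL and CHECK violations as IntegrityError.
        return False

accepted = try_insert(105, 37, 350000.00)        # satisfies every constraint
rejected_empl = try_insert(250, 37, 350000.00)   # EMPL_NUM outside 101..199
rejected_age = try_insert(106, 19, 350000.00)    # AGE below 21
rejected_quota = try_insert(107, 30, -1.0)       # negative QUOTA
rejected_null = try_insert(None, 30, 100.0)      # required data: NULL EMPL_NUM

row_count = conn.execute("SELECT COUNT(*) FROM SALESREPS").fetchone()[0]
```

Only the first row should end up stored; each of the other four inserts fails, which is the behaviour the notes describe for an INSERT whose search condition is not true.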
Domains: A domain is a collection of permitted values. These permitted values can be applied to not just one column but many columns. We first create a domain by using the CREATE DOMAIN statement, as follows:

    CREATE DOMAIN VALID_EMPLOYEE_ID INTEGER
        CHECK (VALUE BETWEEN 101 AND 199)

Once the VALID_EMPLOYEE_ID domain has been defined, it may be used to define columns in database tables instead of a data type. Now we can write the CREATE TABLE statement for the SALESREPS table as follows:

    CREATE TABLE SALESREPS
        (EMPL_NUM VALID_EMPLOYEE_ID,
         AGE INTEGER
            CHECK (AGE >= 21),
         .
         .
         QUOTA MONEY
            CHECK (QUOTA >= 0.0)

Advantages of using Domains:

1. If other columns in other tables also contain employee numbers, the domain name can be used repeatedly, thus simplifying the table definitions.
2. The definition of "valid data" (such as valid employee numbers in this example) is stored in one central place within the database. If the definition changes later (for example, if the company grows and employee numbers in the range 200-299 must be allowed), it is much easier to change one domain definition than to change many column constraints scattered throughout the database.

4. Explain what is meant by “entity integrity”.

A table's primary key must have a unique value for each row of the table. For example, two rows of the SALESREPS table cannot have the value 106 in their EMPL_NUM column. Therefore we impose the restriction that the primary key must have a unique value. This is called the entity integrity constraint. When a primary key is specified for a table, the DBMS automatically checks the uniqueness of the primary key value for every INSERT and UPDATE statement performed on the table. If we attempt to insert a row with a duplicate primary key value, or to update a row so that its primary key would be a duplicate, the statement fails and generates an error message.

5. What is referential integrity?
A set of columns in a table that corresponds to the primary key in another table is called a foreign key. For example, consider EmpNumber, the primary key of the Employees table. These values are also used in the Orders table, where the column is called the foreign key. Any value used in the foreign key column of the Orders table must refer to an existing primary key value in the Employees table. Hence this type of integrity is called referential integrity. This rule enforces the integrity of the parent/child relationship which is created by the primary key / foreign key combination.

6. In what ways can referential integrity of a database be affected?

1. Inserting a new child row. When an INSERT statement adds a new row to the child table, its foreign key value must match one of the primary key values in the parent table. If the foreign
key value does not match any primary key, inserting the row will corrupt the database, because there will be a child without a parent (an "orphan"). Inserting a row in the parent table never creates any problem, because the new row simply becomes a parent without any children. This problem is handled by checking the values of the foreign key columns before the INSERT statement is permitted. If they don't match a primary key value, the INSERT statement is rejected with an error message.

2. Updating the foreign key in a child row. If the foreign key is modified by an UPDATE statement, the new value must match a primary key value in the parent table. Otherwise, the updated row will be an orphan. This problem is handled by checking the updated foreign key value. If there is no matching primary key value, the UPDATE statement is rejected with an error message.

3. Deleting a parent row. If a row of the parent table that has one or more children is deleted, the child rows will become orphans. The foreign key values in these rows will no longer match any primary key value in the parent table. Deleting a row from the child table will not create any problem, because the parent of this row simply has one less child after the deletion. This problem requires a different approach. We can do one of the following:
   a) Prevent the deletion of the parent row until all foreign keys are reassigned a new value.
   b) Automatically delete the dependent child rows.
   c) Set the foreign key value of such records to NULL.
   d) Set the foreign key value of such records to some default value.

4. Updating the primary key in a parent row. If the primary key of a row in the parent table is modified, all of the current children of that row become orphans because their foreign keys no longer match a primary key value. This problem has similar complexity.
Again, there are four logical possibilities:
   a) Prevent the primary key from being changed until the foreign keys are reassigned.
   b) Automatically update the foreign key.
   c) Set the foreign key to NULL.
   d) Set the foreign key to some default value.

7. What are the delete rules to enforce database integrity?

Whenever a parent/child relationship is created by a foreign key in a database, we can specify an associated delete rule. The delete rule tells the DBMS what to do when a user tries to delete a row of the parent table. These four delete rules are:

1. The RESTRICT delete rule prevents us from deleting a row from the parent table if the row has any children. A DELETE statement that attempts to delete such a parent row generates an error message. Only those rows can be deleted from the parent that have no child rows.
2. The CASCADE delete rule tells the DBMS that when a parent row is deleted, all of its child rows should also automatically be deleted from the child table.
3. The SET NULL delete rule tells the DBMS that when a parent row is deleted, the foreign key values in all of its child rows should automatically be set to NULL. Therefore when a row is deleted from the parent table it causes a "set to NULL" update on selected columns of the child table.
4. The SET DEFAULT delete rule tells the DBMS that when a parent row is deleted, the foreign key values in all of its child rows should automatically be set to the default value for that particular column. Thus, deletions from the parent table cause a "set to DEFAULT" update on selected columns of the child table.
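The four delete rules can be compared side by side with a minimal sketch. It assumes Python's built-in sqlite3 module and SQLite's dialect rather than the DBMS these notes target. The REP_OFFICE column, the office numbers 11 and 99, and the helpers build and delete_parent are hypothetical names chosen for illustration.

```python
import sqlite3

def build(delete_rule):
    """Hypothetical helper: parent OFFICES, child SALESREPS, with the given delete rule."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys once this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE OFFICES (OFFICE INTEGER PRIMARY KEY)")
    conn.execute(f"""
        CREATE TABLE SALESREPS (
            EMPL_NUM   INTEGER PRIMARY KEY,
            REP_OFFICE INTEGER DEFAULT 99
                REFERENCES OFFICES (OFFICE) ON DELETE {delete_rule}
        )
    """)
    # Office 11 is the parent we delete; office 99 is the default fallback parent.
    conn.executemany("INSERT INTO OFFICES VALUES (?)", [(11,), (99,)])
    conn.execute("INSERT INTO SALESREPS VALUES (105, 11)")
    return conn

def delete_parent(conn):
    """Try to delete parent row 11; report whether it succeeded and what the child looks like."""
    try:
        conn.execute("DELETE FROM OFFICES WHERE OFFICE = 11")
        deleted = True
    except sqlite3.IntegrityError:
        deleted = False
    children = conn.execute("SELECT EMPL_NUM, REP_OFFICE FROM SALESREPS").fetchall()
    return deleted, children

results = {
    rule: delete_parent(build(rule))
    for rule in ("RESTRICT", "CASCADE", "SET NULL", "SET DEFAULT")
}
```

Under these assumptions, RESTRICT rejects the DELETE and leaves the child untouched, CASCADE removes the child row, SET NULL leaves the child with a NULL REP_OFFICE, and SET DEFAULT moves the child to office 99.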
8. What are the update rules to enforce database integrity?

The update rule tells the DBMS what to do when a user tries to update the value of one of the primary key columns in the parent table. There are four possibilities:

1. The RESTRICT update rule prevents you from updating the primary key of a row in the parent table if that row has any children. An UPDATE statement that attempts to modify the primary key of such a parent row is rejected with an error message.
2. The CASCADE update rule tells the DBMS that when a primary key value is changed in a parent row, the corresponding foreign key value in all of its child rows should also automatically be changed in the child table, to match the new primary key.
3. The SET NULL update rule tells the DBMS that when a primary key value in a parent row is updated, the foreign key values in all of its child rows should automatically be set to NULL. Primary key changes in the parent table cause a "set to NULL" update on selected columns of the child table.
4. The SET DEFAULT update rule tells the DBMS that when a primary key value in a parent row is updated, the foreign key values in all of its child rows should automatically be set to the default value for that particular column. Primary key changes in the parent table cause a "set to DEFAULT" update on selected columns of the child table.

9. What is a trigger?

1. Triggers are stored procedures that are executed automatically when a particular event occurs. A trigger can also be defined as a piece of code which is activated by DBMS if a specific operation is executed on the database, and only when a certain condition holds.
2. Triggers are used to enforce data integrity.
3. They are similar to constraints.
4. The following three events can trigger an action: INSERT, DELETE and UPDATE.
5. The action triggered by an event is given as a sequence of SQL statements.
6. Triggers provide an alternative way to enforce referential integrity.
7. 
Triggers are activated automatically. They cannot be called by the user.
8. Triggers are created using the CREATE TRIGGER command. Syntax:

    CREATE TRIGGER trigger-name
    ON table-name
    FOR which-event (INSERT | DELETE | UPDATE)
    AS trigger-code

9. Triggers are removed by using the DROP TRIGGER command. Example:

    DROP TRIGGER triggername

Example 1: Create a trigger to disallow any rows with a budget of over 100 in the Movies table:

    CREATE TRIGGER movies_insert
    ON Movies
    FOR INSERT
    AS
    BEGIN
        -- "inserted" holds the rows added by the triggering statement
        IF EXISTS (SELECT * FROM inserted WHERE budget > 100)
        BEGIN
            ROLLBACK TRANSACTION
            PRINT 'Transaction not permitted for budget over 100'
        END
    END

Now consider the following INSERT query (the column list here assumes a movie_title column, so that it matches the seven values supplied):

    INSERT INTO Movies (movie_id, movie_title, studio_id, director_id, gross, budget, release_date)
    VALUES (15, 'Test Movie', 3, 5, 50, 101, GETDATE())

Note that the budget here is 101. Because this exceeds 100, the trigger rolls back the insertion and prints the message 'Transaction not permitted for budget over 100'. The trigger is fired after the execution of the statement is finished.

10. State the advantages and disadvantages of triggers.

Advantages:
1. The major advantage of triggers is that business rules can be stored in the database and enforced consistently with each update to the database. This can reduce the complexity of application programs that access the database.
2. Triggers can be used to enforce referential integrity.

Disadvantages:
1. Database complexity. When the rules are moved into the database, setting up the database becomes a more complex task.
2. Hidden rules. Since the rules are hidden away inside the database, programs that appear to perform simple updates may in fact generate an enormous amount of database activity. The programmer no longer has total control over what happens to the database. A program-initiated database action may cause other, hidden actions.
3. Hidden performance implications. Since triggers are stored inside the database, the consequences of executing a SQL statement are no longer completely visible to the programmer.
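The Movies trigger from question 9 can be approximated in a runnable form. This is a minimal sketch that assumes Python's built-in sqlite3 module; SQLite's trigger syntax differs from the CREATE TRIGGER ... FOR INSERT form in the notes, so the version here uses AFTER INSERT with a WHEN clause and RAISE(ABORT, ...) in place of ROLLBACK TRANSACTION and PRINT. The column types, the helper insert_movie, and the sample rows are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database; column types are guesses for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movies (
        movie_id     INTEGER PRIMARY KEY,
        studio_id    INTEGER,
        director_id  INTEGER,
        gross        NUMERIC,
        budget       NUMERIC,
        release_date TEXT
    );

    -- Fires after each inserted row; ABORT undoes the offending statement.
    CREATE TRIGGER movies_insert
    AFTER INSERT ON Movies
    WHEN NEW.budget > 100
    BEGIN
        SELECT RAISE(ABORT, 'Transaction not permitted for budget over 100');
    END;
""")

def insert_movie(movie_id, budget):
    """Hypothetical helper: returns 'inserted' or the error text raised by the trigger."""
    try:
        conn.execute(
            "INSERT INTO Movies "
            "(movie_id, studio_id, director_id, gross, budget, release_date) "
            "VALUES (?, 3, 5, 50, ?, DATE('now'))",
            (movie_id, budget),
        )
        return "inserted"
    except sqlite3.DatabaseError as exc:
        return str(exc)

ok = insert_movie(14, 90)        # budget within the limit
blocked = insert_movie(15, 101)  # budget over 100, so the trigger aborts the insert
row_count = conn.execute("SELECT COUNT(*) FROM Movies").fetchone()[0]
```

The sketch also illustrates the "hidden rules" disadvantage: nothing in the INSERT statement itself reveals that a budget of 101 will be rejected, because the rule lives in the database rather than in the program.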