Sql ch 9 - data integrity

SQL – Ch 10 – Data Integrity

10. DATA INTEGRITY
1. What does the term data integrity mean?
The term data integrity refers to the correctness and completeness of the data in a database. When the
contents of a database are modified with the INSERT, DELETE, or UPDATE statements, the integrity of the
stored data can be lost in many different ways.

For example:
• Invalid data may be added to the database, e.g., an order for a nonexistent product.
• Existing data may be modified to an incorrect value, e.g., reassigning a salesperson to a nonexistent
office.
• Changes to the database may be lost due to a system error or power failure.
• Changes may be partially applied, such as adding an order for a product without adjusting the quantity
available for sale.

To preserve the consistency and correctness of its data, an RDBMS imposes one or more data integrity
constraints. These constraints restrict the data values that can be inserted into the database or created by
a database update.

The different types of data integrity constraints are:
• Required data.
• Validity checking.
• Entity integrity.
• Referential integrity.
• Other data relationships.
• Business rules.
• Consistency.

2 Explain the different data integrity constraints
The term data integrity refers to the correctness and completeness of the data in a database. When the
contents of a database are modified with the INSERT, DELETE, or UPDATE statements, the integrity of the
stored data can be lost in many different ways.

The different types of data integrity constraints are:

Required data. Some columns in a database must contain a valid data value in every row; they are not
allowed to contain missing or NULL values. E.g., every order must have an associated customer who
placed the order. Therefore, the CUST column in the ORDERS table is a required column. The DBMS can
be asked to prevent NULL values in this column.

Validity checking. Every column in a database has a domain, or a set of data values that are permitted
for that column. E.g., order numbers begin at 100,001, so the domain of the ORDER_NUM column
is positive integers greater than 100,000. Similarly, employee numbers in the EMPL_NUM column must
fall within the numeric range of 101 to 999. The DBMS can be asked to prevent other data values in
these columns.

Entity integrity. The primary key of a table must contain a unique value in each row, which is different
from the values in all other rows. E.g., each row of the PRODUCTS table has a unique set of values in its
MFR_ID and PRODUCT_ID columns. Duplicate values are illegal. The DBMS can be asked to enforce this
unique values constraint.

Referential integrity. A foreign key in a relational database links each row in the child table containing
the foreign key to the row of the parent table containing the matching primary key value. In the sample
database,

Prof. Mukesh N. Tekwani [9869 488 356] Page 1

Other data relationships. Other constraints may be enforced on the database. For example, the quota
target for each office must not exceed the total of the quota targets for the salespeople in that office.
The DBMS can be asked to check modifications to the office and salesperson quota targets to make
sure that their values are constrained in this way.

Business rules. Updates to a database may be prevented by business rules governing the real-world
transactions that are represented by the updates. E.g., there may be a business rule that forbids
accepting an order for which there is an inadequate product inventory. The DBMS can be asked to
check each new row added to the ORDERS table to make sure that the value in its QTY column does not
violate this business rule.

Consistency. Some transactions can cause multiple updates to a database. That is, if data in one table
is updated, there should be a corresponding change in other linked tables. For example, accepting a
customer order may involve adding a row to the ORDERS table, increasing the SALES column in the
SALESREPS table for the person who took the order, and increasing the SALES column in the OFFICES
table for the office where that salesperson is assigned. The INSERT and both UPDATEs must all take
place in order for the database to remain in a consistent, correct state. The DBMS can be asked to
enforce this type of consistency rule or to support applications that implement such rules.
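
The consistency rule above can be sketched with a transaction. This is a minimal illustration using Python's built-in sqlite3 module, not a definitive implementation: the simplified table layouts, the sample values and the helper name accept_order are assumptions made for the example. The INSERT and both UPDATEs run as one unit, so a failure in any step undoes the others.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE OFFICES   (OFFICE INTEGER PRIMARY KEY, SALES REAL NOT NULL);
    CREATE TABLE SALESREPS (EMPL_NUM INTEGER PRIMARY KEY,
                            REP_OFFICE INTEGER NOT NULL,
                            SALES REAL NOT NULL);
    CREATE TABLE ORDERS    (ORDER_NUM INTEGER PRIMARY KEY,
                            REP INTEGER NOT NULL,
                            AMOUNT REAL NOT NULL);
    INSERT INTO OFFICES   VALUES (11, 0.0);
    INSERT INTO SALESREPS VALUES (106, 11, 0.0);
""")

def accept_order(order_num, rep, amount):
    """Apply both UPDATEs and the INSERT as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE SALESREPS SET SALES = SALES + ? WHERE EMPL_NUM = ?",
                (amount, rep))
            conn.execute(
                "UPDATE OFFICES SET SALES = SALES + ? WHERE OFFICE = "
                "(SELECT REP_OFFICE FROM SALESREPS WHERE EMPL_NUM = ?)",
                (amount, rep))
            conn.execute("INSERT INTO ORDERS VALUES (?, ?, ?)",
                         (order_num, rep, amount))
        return True
    except sqlite3.IntegrityError:
        return False

def rep_sales(rep):
    """Return the current SALES total of one salesperson."""
    return conn.execute("SELECT SALES FROM SALESREPS WHERE EMPL_NUM = ?",
                        (rep,)).fetchone()[0]

first = accept_order(100001, 106, 500.0)   # all three statements succeed
second = accept_order(100001, 106, 200.0)  # duplicate ORDER_NUM: INSERT fails
```

In the second call the two UPDATEs have already run when the INSERT fails, but the rollback discards them, so the SALES totals still reflect only the first order.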

3 What are the techniques of simple validity checking?
Some SQL dialects provide a data validation capability by allowing us to create a rule that determines
what data can be entered into a particular column. The DBMS checks the rule each time an INSERT or
UPDATE statement is attempted for the table that contains the column.

Ex 1: To create a rule for the QUOTA column in the SALESREPS table (Transact-SQL syntax, as used by
Sybase and Microsoft SQL Server):
CREATE RULE QUOTA_LIMIT
AS @VALUE BETWEEN 0.00 AND 500000.00

VALIDITY CHECKING TECHNIQUES
There are two techniques for simple validity checking: column check constraints and domains.

Column Check Constraints:
A check constraint is a search condition, which produces a true/false value. When a check
constraint is specified for a column, the DBMS automatically checks the value of that column each time
a new row is inserted or a row is updated to ensure that the search condition is true. If the search
condition is not true, the INSERT or UPDATE statement fails. A column check constraint is given as
part of the column definition within the CREATE TABLE statement.

Ex:
CREATE TABLE SALESREPS
(EMPL_NUM INTEGER NOT NULL
    CHECK (EMPL_NUM BETWEEN 101 AND 199),
AGE INTEGER
    CHECK (AGE >= 21),
.
QUOTA MONEY
    CHECK (QUOTA >= 0.0)
.
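Consider the constraint CHECK (EMPL_NUM BETWEEN 101 AND 199).
This constraint requires that valid employee numbers be three-digit numbers between 101 and
199.

Now consider the constraint on the AGE column: CHECK (AGE >= 21).
This constraint rejects any row in which the salesperson's age is less than 21.

The third constraint (on the QUOTA column) is CHECK (QUOTA >= 0.0).
This constraint rejects negative quota values. Note that the search condition must name the column
(QUOTA), not its data type (MONEY).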
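
The three column check constraints can be tried out with a short sketch. It uses Python's built-in sqlite3 module; the helper name try_insert, the sample rows and the use of REAL in place of MONEY are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE SALESREPS (
        EMPL_NUM INTEGER NOT NULL CHECK (EMPL_NUM BETWEEN 101 AND 199),
        AGE      INTEGER          CHECK (AGE >= 21),
        QUOTA    REAL             CHECK (QUOTA >= 0.0)  -- REAL stands in for MONEY
    )
""")

def try_insert(empl_num, age, quota):
    """Return True if the row satisfies every constraint, else False."""
    try:
        with conn:
            conn.execute("INSERT INTO SALESREPS VALUES (?, ?, ?)",
                         (empl_num, age, quota))
        return True
    except sqlite3.IntegrityError:
        return False

results = {
    "valid row": try_insert(105, 37, 350000.0),
    "EMPL_NUM out of range": try_insert(250, 37, 350000.0),
    "AGE below 21": try_insert(106, 19, 350000.0),
    "negative QUOTA": try_insert(107, 37, -1.0),
    "NULL AGE": try_insert(108, None, 100000.0),
}
```

Only the first and last rows are accepted. The last case shows a detail worth remembering: a check constraint rejects a row only when its search condition is false, so a NULL age passes the check unless the column is also declared NOT NULL.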

Page 2 mukeshtekwani@hotmail.com


Domains:
A domain is a collection of permitted values. These permitted values can be applied to not just one column
but many columns.

We first create a domain by using the CREATE DOMAIN statement, as follows:

CREATE DOMAIN VALID_EMPLOYEE_ID INTEGER
CHECK (VALUE BETWEEN 101 AND 199)

Once the VALID_EMPLOYEE_ID domain has been defined, it may be used to define columns in
database tables instead of a data type.

Now we can write the CREATE TABLE statement for the SALESREPS table as follows:

CREATE TABLE SALESREPS
(EMPL_NUM VALID_EMPLOYEE_ID,
AGE INTEGER CHECK (AGE >= 21),
.
.
QUOTA MONEY
    CHECK (QUOTA >= 0.0)

Advantages of using Domains:
1. The advantage of using the domain is that if other columns in other tables also contain
employee numbers, the domain name can be used repeatedly, thus simplifying the table
definitions.
2. The definition of "valid data" (such as valid employee numbers in this example) is stored in
one, central place within the database. If the definition changes later (for example, if the
company grows and employee numbers in the range 200-299 must be allowed), it is much
easier to change one domain definition than to change many column constraints scattered
throughout the database.
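
The "one central place" advantage can be illustrated with a rough sketch. SQLite, used here through Python's built-in sqlite3 module, has no CREATE DOMAIN statement, so this example only emulates a domain by keeping the rule in a single Python constant; it is not the CREATE DOMAIN mechanism itself. The names domain and accepted, the ORDERS layout and the sample values are assumptions.

```python
import sqlite3

# One central definition of "valid employee number", reused below.
# {col} is replaced by the name of the column the rule is attached to.
VALID_EMPLOYEE_ID = "INTEGER CHECK ({col} BETWEEN 101 AND 199)"

def domain(col):
    """Return the column definition for an employee-number column."""
    return f"{col} " + VALID_EMPLOYEE_ID.format(col=col)

conn = sqlite3.connect(":memory:")
# Two tables share the same rule without repeating its text.
conn.execute(f"CREATE TABLE SALESREPS ({domain('EMPL_NUM')}, NAME TEXT)")
conn.execute(f"CREATE TABLE ORDERS (ORDER_NUM INTEGER, {domain('REP')})")

def accepted(sql, params):
    """Return True if the statement passes the checks, else False."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

rep_ok = accepted("INSERT INTO SALESREPS VALUES (?, ?)", (150, "Sam"))
rep_bad = accepted("INSERT INTO SALESREPS VALUES (?, ?)", (250, "Pat"))
order_ok = accepted("INSERT INTO ORDERS VALUES (?, ?)", (100001, 150))
order_bad = accepted("INSERT INTO ORDERS VALUES (?, ?)", (100002, 99))
```

If the permitted range later grows to include 200-299, only the VALID_EMPLOYEE_ID constant needs to change before the tables are created, which mirrors the second advantage listed above.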

4 Explain what is meant by “entity integrity”
A table's primary key must have a unique value for each row of the table. For example, two rows
of the SALESREPS table cannot have the value 106 in their EMPL_NUM column. Therefore we
impose the restriction that the primary key must have a unique value. This is called the entity
integrity constraint.

When a primary key is specified for a table, the DBMS automatically checks the uniqueness of the
primary key value for every INSERT and UPDATE statement performed on the table. If we attempt
to insert a row with a duplicate primary key value or to update a row so that its primary key would
be a duplicate, it will fail and generate an error message.
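
Both failure cases can be reproduced in a small sketch using Python's built-in sqlite3 module. The sample names and the helper run are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SALESREPS (EMPL_NUM INTEGER PRIMARY KEY, NAME TEXT NOT NULL)")
conn.execute("INSERT INTO SALESREPS VALUES (106, 'Sam Clark')")
conn.execute("INSERT INTO SALESREPS VALUES (107, 'Nancy Angelli')")
conn.commit()

def run(sql):
    """Execute one statement; report whether the DBMS accepted it."""
    try:
        with conn:
            conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

# Inserting a second row with key 106 violates entity integrity.
dup_insert = run("INSERT INTO SALESREPS VALUES (106, 'Duplicate')")
# Changing key 107 to 106 would also create a duplicate.
dup_update = run("UPDATE SALESREPS SET EMPL_NUM = 106 WHERE EMPL_NUM = 107")

keys = [row[0] for row in
        conn.execute("SELECT EMPL_NUM FROM SALESREPS ORDER BY EMPL_NUM")]
```

Both statements are rejected and the table still holds exactly the keys 106 and 107.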

5 What is referential integrity?
A set of columns in a table that corresponds to the primary key in another table is called a
foreign key. For example, consider the EmpNumber column (the primary key of the Employees table).
These values are also used in the Orders table. In the Orders table, this column is a foreign
key.

Any value used in the foreign key column of the Orders table must refer to an existing
primary key value in the Employees table. Hence this type of integrity is called referential integrity.
This rule enforces the integrity of the parent/child relationship which is created by the primary key /
foreign key combination.
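
A minimal sketch of this parent/child relationship, using Python's built-in sqlite3 module. The column layouts, the sample employee and the helper name insert_order are assumptions; note also that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.executescript("""
    CREATE TABLE Employees (EmpNumber INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Orders (
        OrderNum  INTEGER PRIMARY KEY,
        EmpNumber INTEGER REFERENCES Employees (EmpNumber)  -- foreign key
    );
    INSERT INTO Employees VALUES (106, 'Sam');
""")

def insert_order(order_num, emp_number):
    """Return True if the order row was accepted, else False."""
    try:
        with conn:
            conn.execute("INSERT INTO Orders VALUES (?, ?)",
                         (order_num, emp_number))
        return True
    except sqlite3.IntegrityError:
        return False

valid = insert_order(1, 106)   # refers to an existing employee
orphan = insert_order(2, 999)  # no employee 999 exists
```

The first order is stored; the second is rejected because its foreign key value does not refer to any row of Employees.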

6 In what ways can referential integrity of a database be affected?
1. Inserting a new child row. When an INSERT statement adds a new row to the child table, its
foreign key value must match one of the primary key values in the parent table. If the foreign
key value does not match any primary key, inserting the row will corrupt the database, because
there will be a child without a parent (an "orphan"). Inserting a row in the parent table never
creates any problem, because the new row simply becomes a parent without any children.

This problem is handled by checking the values of the foreign key columns before the INSERT
statement is permitted. If they don't match a primary key value, the INSERT statement is
rejected with an error message.

2. Updating the foreign key in a child row. If the foreign key is modified by an UPDATE
statement, the new value must match a primary key value in the parent table. Otherwise, the
updated row will be an orphan.

This problem is handled by checking the updated foreign key value. If there is no matching
primary key value, the UPDATE statement is rejected with an error message.

3. Deleting a parent row. If a row of the parent table that has one or more children is deleted,
the child rows will become orphans. The foreign key values in these rows will no longer match
any primary key value in the parent table. Deleting a row from the child table will not create any
problem because the parent of this row simply has one less child after the deletion.

This problem requires a different approach. We can do one of the following:
a) Prevent the deletion of parent row until all foreign keys are reassigned a new value.
b) Automatically delete the dependent child rows.
c) Set the foreign key value of such records to NULL.
d) Set the foreign key value of such records to some default value.

4. Updating the primary key in a parent row. If the primary key of a row in the parent table is
modified, all of the current children of that row become orphans because their foreign keys no
longer match a primary key value.

This problem has similar complexity. Again, there are four logical possibilities:
a) Prevent the primary key from being changed until the foreign keys are reassigned.
b) Automatically update the foreign key.
c) Set the foreign key to NULL.
d) Set the foreign key to some default value.
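
The four situations above, together with the two harmless operations, can be checked in one sketch. It uses Python's built-in sqlite3 module with OFFICES as the parent table and SALESREPS as the child; the column layouts, city names and the helper allowed are assumptions. No delete or update rule is specified, so the DBMS simply rejects any change that would create an orphan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.executescript("""
    CREATE TABLE OFFICES (OFFICE INTEGER PRIMARY KEY, CITY TEXT);
    CREATE TABLE SALESREPS (
        EMPL_NUM   INTEGER PRIMARY KEY,
        REP_OFFICE INTEGER REFERENCES OFFICES (OFFICE)
    );
    INSERT INTO OFFICES VALUES (11, 'New York'), (12, 'Chicago');
    INSERT INTO SALESREPS VALUES (106, 11), (107, 11);
""")

def allowed(sql):
    """Return True if the statement is accepted, False if it is rejected."""
    try:
        with conn:
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

outcome = {
    "insert child with unknown parent":
        allowed("INSERT INTO SALESREPS VALUES (108, 99)"),
    "insert parent without children":
        allowed("INSERT INTO OFFICES VALUES (13, 'Atlanta')"),
    "update child to unknown parent":
        allowed("UPDATE SALESREPS SET REP_OFFICE = 99 WHERE EMPL_NUM = 106"),
    "delete parent that has children":
        allowed("DELETE FROM OFFICES WHERE OFFICE = 11"),
    "delete child":
        allowed("DELETE FROM SALESREPS WHERE EMPL_NUM = 107"),
    "update key of parent that has children":
        allowed("UPDATE OFFICES SET OFFICE = 21 WHERE OFFICE = 11"),
}
```

Only the parent insert and the child delete succeed; the other four statements are exactly the cases that would leave a child row without a parent.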

7 What are the delete rules to enforce database integrity?
Whenever a parent/child relationship is created by a foreign key in a database, we can specify an
associated delete rule. The delete rule tells the DBMS what to do when a user tries to delete a row
of the parent table. The four delete rules are:

1. The RESTRICT delete rule prevents us from deleting a row from the parent table if the row has
any children. A DELETE statement that attempts to delete such a parent row generates an
error message. Only those rows can be deleted from the parent that have no child rows.

2. The CASCADE delete rule tells the DBMS that when a parent row is deleted, all of its child
rows should also automatically be deleted from the child table.

3. The SET NULL delete rule tells the DBMS that when a parent row is deleted, the foreign key
values in all of its child rows should automatically be set to NULL. Therefore when a row is
deleted from the parent table it causes a "set to NULL" update on selected columns of the
child table.

4. The SET DEFAULT delete rule tells the DBMS that when a parent row is deleted, the foreign
key values in all of its child rows should automatically be set to the default value for that
particular column. Thus, deletions from the parent table cause a "set to DEFAULT" update on
selected columns of the child table.
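
The effect of each delete rule can be compared side by side. The sketch below uses Python's built-in sqlite3 module and rebuilds a tiny parent/child pair once per rule; the helper name delete_parent, the sample rows and the default office 12 are assumptions made for illustration.

```python
import sqlite3

def delete_parent(rule):
    """Delete office 11 under the given delete rule; return the child rows left."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default
    conn.executescript(f"""
        CREATE TABLE OFFICES (OFFICE INTEGER PRIMARY KEY);
        CREATE TABLE SALESREPS (
            EMPL_NUM   INTEGER PRIMARY KEY,
            REP_OFFICE INTEGER DEFAULT 12
                       REFERENCES OFFICES (OFFICE) ON DELETE {rule}
        );
        INSERT INTO OFFICES VALUES (11), (12);
        INSERT INTO SALESREPS VALUES (106, 11), (107, 12);
    """)
    try:
        with conn:
            conn.execute("DELETE FROM OFFICES WHERE OFFICE = 11")
    except sqlite3.IntegrityError:
        pass  # RESTRICT: the DELETE is rejected and nothing changes
    rows = conn.execute(
        "SELECT EMPL_NUM, REP_OFFICE FROM SALESREPS ORDER BY EMPL_NUM").fetchall()
    conn.close()
    return rows

after = {rule: delete_parent(rule)
         for rule in ("RESTRICT", "CASCADE", "SET NULL", "SET DEFAULT")}
```

Salesperson 106 is the only child of office 11, so the result for that row shows the rule at work: it is unchanged under RESTRICT, gone under CASCADE, has a NULL office under SET NULL, and is moved to office 12 under SET DEFAULT.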



8 What are the update rules to enforce database integrity?
The update rule tells the DBMS what to do when a user tries to update the value of one of the
primary key columns in the parent table. There are four possibilities:

1. The RESTRICT update rule prevents you from updating the primary key of a row in the
parent table if that row has any children. An UPDATE statement that attempts to modify the
primary key of such a parent row is rejected with an error message.

2. The CASCADE update rule tells the DBMS that when a primary key value is changed in a
parent row, the corresponding foreign key value in all of its child rows should also
automatically be changed in the child table, to match the new primary key.

3. The SET NULL update rule tells the DBMS that when a primary key value in a parent row
is updated, the foreign key values in all of its child rows should automatically be set to
NULL. Primary key changes in the parent table cause a "set to NULL" update on selected
columns of the child table.

4. The SET DEFAULT update rule tells the DBMS that when a primary key value in a parent
row is updated, the foreign key values in all of its child rows should automatically be set to
the default value for that particular column. Primary key changes in the parent table cause
a "set to DEFAULT" update on selected columns of the child table.
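
The update rules can be compared in the same way as the delete rules. This sketch, again using Python's built-in sqlite3 module, renumbers office 11 to 21 under each rule; the helper name renumber_parent, the sample rows and the default office 12 are assumptions.

```python
import sqlite3

def renumber_parent(rule):
    """Change office 11 to 21 under the given update rule; return the child rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default
    conn.executescript(f"""
        CREATE TABLE OFFICES (OFFICE INTEGER PRIMARY KEY);
        CREATE TABLE SALESREPS (
            EMPL_NUM   INTEGER PRIMARY KEY,
            REP_OFFICE INTEGER DEFAULT 12
                       REFERENCES OFFICES (OFFICE) ON UPDATE {rule}
        );
        INSERT INTO OFFICES VALUES (11), (12);
        INSERT INTO SALESREPS VALUES (106, 11), (107, 12);
    """)
    try:
        with conn:
            conn.execute("UPDATE OFFICES SET OFFICE = 21 WHERE OFFICE = 11")
    except sqlite3.IntegrityError:
        pass  # RESTRICT: the UPDATE is rejected and nothing changes
    rows = conn.execute(
        "SELECT EMPL_NUM, REP_OFFICE FROM SALESREPS ORDER BY EMPL_NUM").fetchall()
    conn.close()
    return rows

after_update = {rule: renumber_parent(rule)
                for rule in ("RESTRICT", "CASCADE", "SET NULL", "SET DEFAULT")}
```

Under CASCADE the child row follows its parent to the new key 21, which is the main practical difference from the delete rules, where CASCADE removes the child row instead.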

9 What is a trigger?
1. Triggers are stored procedures that are executed automatically when a particular event occurs.
A trigger can also be defined as a piece of code which is activated by the DBMS when a specific
operation is executed on the database, and only when a certain condition holds.
2. Triggers are used to enforce data integrity.
3. They are similar to constraints.
4. The following three events can trigger an action: INSERT, DELETE and UPDATE.
5. The action triggered by an event is given as a sequence of SQL statements.
6. Triggers provide an alternative way to enforce referential integrity.
7. Triggers are activated automatically. They cannot be called by the user.
8. Triggers are created using the CREATE TRIGGER command.
Syntax (Transact-SQL form; other DBMSs use different clauses):
CREATE TRIGGER trigger-name
ON table-name
FOR {INSERT | DELETE | UPDATE}
AS
trigger-code
9. Triggers are removed by using the DROP TRIGGER command.
Example: DROP TRIGGER trigger-name

Example 1: Create a trigger to disallow any rows with a budget of over 100 in the Movies table:
CREATE TRIGGER movies_insert
ON Movies
FOR INSERT
AS
BEGIN
    -- "inserted" holds the rows added by the triggering INSERT
    IF EXISTS (SELECT * FROM inserted WHERE budget > 100)
    BEGIN
        ROLLBACK TRANSACTION
        PRINT 'Transaction not permitted for budget over 100'
    END
END

Now consider the following INSERT query (assuming the title column is named movie_title):
INSERT INTO Movies (movie_id, movie_title, studio_id, director_id, gross, budget, release_date)
VALUES (15, 'Test Movie', 3, 5, 50, 101, GETDATE())

Note that the budget here is 101, which is over the limit of 100. The trigger rolls back the
transaction and prints the message "Transaction not permitted for budget over 100". The trigger is
fired after the execution of the INSERT statement is finished.
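
The same rule can be tried in SQLite through Python's built-in sqlite3 module. This is a sketch under assumptions, not a translation of the Transact-SQL trigger: SQLite uses AFTER INSERT, a WHEN condition and RAISE(ABORT, ...) instead of FOR INSERT and ROLLBACK TRANSACTION, and the reduced Movies layout, the column name movie_title and the helper insert_movie are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movies (
        movie_id    INTEGER PRIMARY KEY,
        movie_title TEXT,
        budget      REAL
    );

    CREATE TRIGGER movies_insert
    AFTER INSERT ON Movies
    WHEN NEW.budget > 100
    BEGIN
        -- Undo the INSERT that fired the trigger and report an error.
        SELECT RAISE(ABORT, 'Transaction not permitted for budget over 100');
    END;
""")

def insert_movie(movie_id, title, budget):
    """Return 'inserted', or the error text if the trigger rejects the row."""
    try:
        with conn:
            conn.execute("INSERT INTO Movies VALUES (?, ?, ?)",
                         (movie_id, title, budget))
        return "inserted"
    except sqlite3.DatabaseError as exc:
        return str(exc)

rejected = insert_movie(15, "Test Movie", 101)  # budget over 100
accepted = insert_movie(16, "Small Film", 50)   # budget within the limit
```

Only the second row ends up in the table; the first INSERT is undone by the trigger.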

10 State the advantages and disadvantages of triggers.
Advantages:
1. The major advantage of triggers is that business rules can be stored in the database and
enforced consistently with each update to the database. This can reduce the complexity of
application programs that access the database.
2. Triggers can be used to enforce referential integrity.

Disadvantages:
1. Database complexity. When the rules are moved into the database, setting up the database
becomes a more complex task.

2. Hidden rules. Since the rules are hidden away inside the database, programs that appear to
perform simple updates may in fact generate an enormous amount of database activity. The
programmer no longer has total control over what happens to the database. A program-initiated
database action may cause other, hidden actions.

3. Hidden performance implications. Since triggers are stored inside the database, the
consequences of executing a SQL statement are no longer completely visible to the
programmer. An apparently simple statement may fire a trigger that does a large amount of
extra work.
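
The "hidden actions" point can be seen in a short sketch using Python's built-in sqlite3 module. The trigger name new_order, the reduced table layouts and the sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SALESREPS (
        EMPL_NUM INTEGER PRIMARY KEY,
        SALES    REAL NOT NULL DEFAULT 0
    );
    CREATE TABLE ORDERS (
        ORDER_NUM INTEGER PRIMARY KEY,
        REP       INTEGER,
        AMOUNT    REAL
    );

    -- Business rule stored in the database: every new order
    -- increases the sales total of the person who took it.
    CREATE TRIGGER new_order
    AFTER INSERT ON ORDERS
    BEGIN
        UPDATE SALESREPS SET SALES = SALES + NEW.AMOUNT
        WHERE EMPL_NUM = NEW.REP;
    END;

    INSERT INTO SALESREPS (EMPL_NUM) VALUES (106);
""")

# The program issues a single INSERT and never mentions SALESREPS ...
conn.execute("INSERT INTO ORDERS VALUES (100001, 106, 2500.0)")
conn.commit()

# ... yet SALESREPS has changed as a side effect of the trigger.
sales = conn.execute(
    "SELECT SALES FROM SALESREPS WHERE EMPL_NUM = 106").fetchone()[0]
```

This is the advantage and the disadvantage at once: the rule is enforced consistently for every program, but nothing in the program's own INSERT statement reveals that a second table is being updated.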

