2. 2
RDBMS Training Assumptions
Note:
During the training sessions we will use Oracle as the
RDBMS. All examples, assignments and tests in this
course material are based on Oracle SQL*Plus.
3. 3
Course Topics
Introduction
What is a database management
system?
Users of a DBMS
Data Models
Keys
ERD
Functional dependencies
Normal forms
Normalization
Design (Logical)
10. 10
Data Models
Hierarchical Model –
The oldest of the data models listed here; used mainly
for space programs. The first hierarchical DBMS,
IBM's “IMS”, was released in 1968
Network Model –
Represents data as a network of records (nodes)
connected by links, so a record can have more than one parent
Relational Model
11. 11
Relational Data Model
Based on relational algebra, i.e. the mathematical
theory of relations.
First described by E.F. Codd of IBM
in 1970.
Presents data in the form of tables.
12. 12
Relational Data Model
Later Dr. Codd clarified his model by
defining twelve rules (Codd’s Rules)
that a database management system
(DBMS) must meet in order to be
considered a relational database
• In practice, many database products are
considered 'relational' even if they do not
strictly adhere to all 12 rules
13. 13
Basic Terms and Definitions
The term 'database' has many interpretations; one definition is a 'collection of
persistent data'
A relational database is one in which the data consists of a 'collection of
tables related to each other through common values'
The two most prominent characteristics of a relational database are 1) data
stored in tables and 2) relationships between tables
A table (a.k.a. an entity or relation) is a collection of rows and columns
A row (a.k.a. a record or tuple) represents a collection of information
about a separate item (e.g., a customer)
A column (a.k.a. a field or attribute) represents the characteristics of an
item (e.g., the customer's name or phone number)
A relationship (a.k.a. a join) is a logical link between two tables
A relational database management system (RDBMS) uses matching
values in multiple tables to relate the information in one table with the
information in the other table
The presentation of data as tables is a logical construct; it is independent of
the way the data is physically stored on disk
14. 14
Codd’s 12 rules
Codd's Rule #1. Data is Presented in Tables
A set of related tables forms a database, and all data is represented as tables; the
data can be viewed in no other way
Codd's Rule #2. Data is Logically Accessible
A relational database does not reference data by physical location; there is no
such thing as the 'fifth row in the customers table'
Each piece of data must be logically accessible by referencing 1) a table 2) a primary
or unique key value and 3) a column
Codd's Rule #3. Nulls are Treated Uniformly As Unknown
Null must always be interpreted as an unknown value
Codd's Rule #4. Database is Self-Describing
In addition to user data, a relational database contains data about itself
There are two types of tables in an RDBMS: user tables that contain the 'working' data
and system tables that contain data about the database structure
Metadata is data that describes the structure of the database itself and includes
object definitions (tables, indexes, stored procedures, etc.) and how they relate to
each other
The collection of system tables is also referred to as the system catalog or data
dictionary
15. 15
Codd’s 12 rules
Codd's Rule #5. A Single Language is Used to Communicate with the Database
Management System
There must be a single language that handles all communication with the database
management system
The language must support relational operations with respect to: data manipulation
(e.g., SELECT, INSERT, UPDATE, DELETE), data definition (e.g., CREATE, ALTER,
DROP) and administration (e.g., GRANT, REVOKE, DENY, BACKUP, RESTORE)
Structured Query Language (SQL) is the de facto standard for a relational database
language
SQL is a ‘nonprocedural’ or ‘declarative’ language; it allows users to express what
they want from the RDBMS without specifying the details about where it's located or
how to get it
Codd's Rule #6. Provides Alternatives for Viewing Data
A relational database must not be limited to source tables when presenting data to
the user
Views are virtual tables or abstractions of the source tables
Views allow the creation of ‘custom tables’ that are tailored to special needs
16. 16
Codd’s 12 rules
Codd's Rule #7. Supports Set-Based or Relational Operations
Rows are treated as sets for data manipulation operations (SELECT, INSERT,
UPDATE, DELETE)
A relational database must support basic relational algebra operations (selection,
projection, and join) and set operations (union, intersection, division, and
difference)
Set operations and relational algebra are used to operate on 'relations' (tables) to
produce other relations
A database that supports only row-at-a-time (navigational) operations does not
meet this requirement and is not considered 'relational'
Codd's Rule #8. Physical Data Independence
Applications that access data in a relational database must be unaffected by
changes in the way the data is physically stored (i.e., the physical structure)
Codd's Rule #9. Logical Data Independence
The database schema or structure of tables and relationships (logical) can change
without having to re-create the database or the applications that use it
17. 17
Codd’s 12 rules
Codd's Rule #10. Data Integrity Is a Function of the DBMS
In order to be considered relational, data integrity must be an internal function
of the DBMS; not the application program
Data integrity means the consistency and accuracy of the data in the database
(i.e., keeping the garbage out of the database)
Codd's Rule #11. Supports Distributed Operations
Data in a relational database can be stored centrally or distributed
Users can join data from tables on different servers (distributed queries) and
from other relational databases (heterogeneous queries)
Data integrity must be maintained regardless of the number of copies of data
and where it resides
Codd's Rule #12. Data Integrity Cannot be Subverted
There cannot be other paths into the database that subvert data integrity; in
other words, you can't get in the 'back door' and change the data in such a
manner as data integrity is violated
18. 18
Keys
Candidate key: A single attribute or minimal set
of attributes which uniquely identifies a
row in the table.
Primary key: The candidate key chosen
to identify rows in the database design.
19. 19
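Keys
Non-key attribute: An attribute that does not
participate in any candidate key
Overlapping candidate keys: Two candidate
keys overlap if each involves two or more
attributes (composite candidate keys) and
they have one or more attributes in common
Foreign key: An attribute (or set of attributes) in one table
whose values must match a candidate key, usually the
primary key, of another table
Self-referencing key: A foreign key that refers to the
primary key of its own table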
20. 20
Entity Relationship Diagram (ERD)
Based on a perception of the real world; represents
the basic components of the real world
Real-world     E-R-D          Definition
Object         Entity         Anything capable of independent existence and of interest to us
Property       Attribute      Characteristic of the object
Relationship   Relationship   Association between objects
[Symbol column: diagram symbols not shown]
21. 21
Relationship
A relationship:
is an association among several entities;
has a constraint for mapping cardinalities
[Diagram: entity Customer (CustNo, Name, City) linked to entity Account (AccNo, Balance) by the relationship Depositor]
22. 22
Mapping cardinalities
Mapping cardinalities, or cardinality ratios,
express the number of entities to which
another entity can be associated via a
relationship set:
One to one
One to many
Many to one
Many to many
23. 23
Schematic ER models
Each DEPARTMENT has many LECTURERS, one of
whom is head of the DEPARTMENT;
A LECTURER belongs to only one DEPARTMENT;
Each DEPARTMENT offers many different COURSES;
Each COURSE is taught by a single LECTURER;
A STUDENT may enroll for many COURSES offered
by different DEPARTMENTS.
24. 24
Draw Relationships of college DB
1:1 relationship
between LECTURER & DEPARTMENT;
N:1 relationship
between LECTURER & DEPARTMENT
1:N relationship
between DEPARTMENT & COURSE;
M:N relationship
between STUDENT & COURSE
1:N relationship
between LECTURER & COURSE.
25. 25
Relationships of college DB
1:1 relationship HEAD_OF
between LECTURER & DEPARTMENT;
N:1 relationship IS_IN
between LECTURER & DEPARTMENT
1:N relationship OFFERS
between DEPARTMENT & COURSE;
M:N relationship ENROLLS
between STUDENT & COURSE
1:N relationship TEACHES
between LECTURER & COURSE.
26. 26
ER diagram of college DB
[Diagram: entities DEPARTMENT (DName, Location), LECTURER (LName, Room), COURSE (C#, Title) and STUDENT (S#, SName), connected by the relationships HeadOf, Is_In, Offers, Teaches and Enrolls]
27. 27
Transforming ER model
The following is a set of guidelines for
converting an ER model into a relational model:
Transform the entity types
Transform the binary relationships
Transform the N:M relationships
28. 28
Advantages of ER Models
Intuitive
Helps identify entities & relationships
Raises pertinent questions
Can be understood by non-specialist
Reflects the natural structure of DB
Flexible & extendible
31. 31
Normalization
Functional Dependency - example
•Each airport name is unique
and each airport can be in
only one city. Therefore, City
is functionally dependent on
Airport.
•The value in the Airport field
determines what the value
will be in the City field and
there can be only one value
in the City field.
•This does not need to work in the reverse. As shown in the
table, a city can have more than one airport, so Airport is not
functionally dependent on City; the value in City does not
necessarily determine what the value in Airport will be.
Airport Name   City
National       Washington, DC
JFK            New York
LaGuardia      New York
Logan          Boston
Dulles         Washington, DC
32. 32
Normalization
Functional Dependency - Definition
A functional dependency is a relationship
between fields so that the value in Field A
determines the value in Field B, and there can be
only one value in Field B. In that case, Field B is
functionally dependent on Field A.
For any given value for attribute A, there is just
one corresponding value of attribute B.
Many distinct values of the attribute A can have
the same corresponding value for attribute B.
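The definition can be checked mechanically against sample data. The following is a minimal Python sketch, not part of the course material: the helper name `determines` is hypothetical, and the rows are the airport sample from the earlier example. A functional dependency is a rule about every legal state of the table, so a check like this can only disprove a dependency from a sample; it cannot prove one.

```python
def determines(rows, a, b):
    """Return True if, in these rows, each value of column a maps to one value of column b."""
    seen = {}
    for row in rows:
        # setdefault stores the first b-value seen for this a-value and returns it
        if seen.setdefault(row[a], row[b]) != row[b]:
            return False  # the same a-value appears with two different b-values
    return True


# Sample rows taken from the airport example
airports = [
    {"Airport": "National", "City": "Washington, DC"},
    {"Airport": "JFK", "City": "New York"},
    {"Airport": "LaGuardia", "City": "New York"},
    {"Airport": "Logan", "City": "Boston"},
    {"Airport": "Dulles", "City": "Washington, DC"},
]

print(determines(airports, "Airport", "City"))  # True: each airport has one city
print(determines(airports, "City", "Airport"))  # False: a city can have several airports
```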
33. 33
Normalization
Full Functional Dependence Vs. Partial Functional Dependence
Full Functional Dependence – Definition
Given a relation R, attribute B of R is fully functionally dependent on
attribute A of R, if it is functionally dependent on A and not functionally
dependent on any subset of A (A must be composite).
•Cname is fully functionally dependent on C#
•Ccity and Cphone are also fully functionally dependent on C#
•Qnt is fully functionally dependent on (C#, P#, Date – composite key)
•Cname is not fully functionally dependent on (C#, P#, Date), it is only
partially dependent on it (and similarly for Ccity and Cphone).
34. 34
Normalization
Transitive Dependency
A transitive dependency is a type of functional dependency
in which the value in a non-key field is determined by the value
in another non-key field and that field is not a candidate key.
•The phone number is
dependent on the
manager, which is
dependent on the
project number (a
transitive dependency).
•The ProjectMgr field is
not a candidate key
because the same
person manages more
than one project.
36. 36
Normalization
Normal Forms - Definitions
A relation is said to be in first normal form
(1NF) if and only if all underlying domains
contain atomic values only i.e. there are no
repeating groups.
A relation is said to be in second normal form
(2NF) if and only if it is in 1NF and every non
key attribute is fully dependent on the entire
primary key.
A relation is said to be in third normal form
(3NF) if and only if it is in 2NF and every non
key attribute is non-transitively dependent on
the primary key.
37. 37
Normalization
Example
A Timesheet Application
A person can work on three different projects at the same time.
The attributes of the relation are
Emp (empno, ename, deptno, deptname, projno1, projname1,
proj-comp-date1, hours-in-proj1, projno2, projname2, proj-comp-date2,
hours-in-proj2, projno3, projname3, proj-comp-date3, hours-in-proj3)
The sample data -
EmpNo  EName    DeptNo  DeptName     ProjNo1_3  ProjName1_3      ProjCompDate1_3  HoursInProj1_3
26     Jack N.  D46     Data Mining  P1         Value Customers  12-Oct-2004      45
                                     P2         Sales Trends     24-Feb-2005      73
35     Tom C.   D72     MIS          P2         Sales Trends     24-Feb-2005      87
                                     P3         B2B              13-Mar-2005      65
                                     P4         MRP              08-Jun-2005      25
40     H. Ford  D67     Robotics     P4         MRP              08-Jun-2005      91
38. 38
Normalization
Example – Continued…
1NF – Atomic Values – No repeating groups
The EMP table will therefore look like this
EMP(Empno, Projno, ename, deptno, deptname,
projname, comp-date, hours)
EmpNo  EName    DeptNo  DeptName     ProjNo  ProjName         ProjCompDate  HoursInProj
26     Jack N.  D46     Data Mining  P1      Value Customers  12-Oct-2004   45
26     Jack N.  D46     Data Mining  P2      Sales Trends     24-Feb-2005   73
35     Tom C.   D72     MIS          P2      Sales Trends     24-Feb-2005   87
35     Tom C.   D72     MIS          P3      B2B              13-Mar-2005   65
35     Tom C.   D72     MIS          P4      MRP              08-Jun-2005   25
40     H. Ford  D67     Robotics     P4      MRP              08-Jun-2005   91
39. 39
Normalization
Example – Continued…
2NF - Every non-key attribute is fully dependent on the
entire primary key.
Primary Key – (EmpNo, ProjNo)
Is EName fully dependent
on (EmpNo, ProjNo)?
Is DeptNo fully dependent
on (EmpNo, ProjNo)?
Is DeptName fully
dependent on (EmpNo, ProjNo)?
What about ProjName, ProjCompDate
and HoursInProj?
Only HoursInProj depends on the whole key, so the relation
is split into three relations (* marks a key attribute):
(EmpNo*, EmpName, DeptNo, DeptName)
(ProjNo*, ProjName, ProjCompDate)
(ProjNo*, EmpNo*, HoursInProj)
40. 40
Normalization
Example – Continued…
3NF - Every non key attribute is non-
transitively dependent on the primary key
In the relation EMP, we have
EMP (EmpNo, EName, DeptNo, DeptName)
EmpNo -> DeptNo
DeptNo -> DeptName
DeptName depends on EmpNo only transitively, through DeptNo, so EMP is not in 3NF.
After removing the transitive dependency, the relations become
EMP (EmpNo*, EName, DeptNo)
DEPT (DeptNo*, DeptName)
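To see that nothing is lost by this decomposition, the two relations can be joined back together. The sketch below is an illustration only: it uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption made so the example runs anywhere), with the employee and department values from the timesheet sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPT (DeptNo TEXT PRIMARY KEY, DeptName TEXT);
    CREATE TABLE EMP  (EmpNo INTEGER PRIMARY KEY, EName TEXT,
                       DeptNo TEXT REFERENCES DEPT(DeptNo));
    INSERT INTO DEPT VALUES ('D46', 'Data Mining');
    INSERT INTO DEPT VALUES ('D72', 'MIS');
    INSERT INTO DEPT VALUES ('D67', 'Robotics');
    INSERT INTO EMP VALUES (26, 'Jack N.', 'D46');
    INSERT INTO EMP VALUES (35, 'Tom C.', 'D72');
    INSERT INTO EMP VALUES (40, 'H. Ford', 'D67');
""")

# Joining on DeptNo rebuilds the original (EmpNo, EName, DeptNo, DeptName) rows,
# while each department name is now stored only once.
rebuilt = conn.execute("""
    SELECT E.EmpNo, E.EName, E.DeptNo, D.DeptName
    FROM EMP E JOIN DEPT D ON E.DeptNo = D.DeptNo
    ORDER BY E.EmpNo
""").fetchall()

for row in rebuilt:
    print(row)
```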
41. 41
Normalization
Exercise – Bills Database
BILLNO
BILLDATE
CUSTOMERNO
CNAME
ADDRESS_1
ADDRESS_2
ADDRESS_3
CITY
ITEM_DETAILS *(1-7)
ITEMNO
ITEMDESC
ITEM_UNIT_PRICE
QTY_SOLD
ITEM_VALUE
BILL_AMOUNT
AMT_PAID
AMT_PENDING
REMARKS
Assumptions-
ADDRESS_1, 2 and 3 are 3
lines of the one single address.
There can be at the most 7
items per bill.
AMT_PENDING is the balance
amount payable.
ITEM_VALUE =
ITEM_UNIT_PRICE * QTY_SOLD
Derive 1NF, 2NF and 3NF for
the bills database.
42. 42
SQL – Structured Query Language
SQL is the language that the RDBMS
understands.
It helps us to:
Create tables
Make changes in the tables
Impose relationships between tables
Enter, delete, update & retrieve data.
43. 43
SQL - Background
Conceived in mid-1970’s as a database
language for the relational model
Developed by IBM
First standardized in 1986 by ANSI
Enhanced in 1989
Revised again in ‘92
It is a non procedural language
Used in a number of commercial
products
52. 52
Supplier table - S
SNO  SNAME  STATUS  CITY
S1   Smith  20      London
S2   Jones  10      Paris
S3   Blake  30      Paris
S4   Clark  20      London
S5   Adams  30      Athens
53. 53
Product table - P
PNO  PNAME  COLOR  WEIGHT  CITY
P1   Nut    Red    12      London
P2   Bolt   Green  17      Paris
P3   Screw  Blue   17      Rome
P4   Screw  Red    14      London
P5   Cam    Blue   12      Paris
P6   Cog    Red    19      London
55. 55
SQL - INSERT INTO
Single-row insert
Multi-row insert
INSERT INTO S
VALUES ('S3', 'SUP3', 10, 'BLORE');
56. 56
SQL - INSERT INTO
INSERT INTO S (SNO, SNAME)
VALUES ('S6', 'SUP6');
Inserting one row with values for only
the listed columns.
57. 57
SQL - INSERT INTO
Ex: Copy list of Suppliers to New_S
INSERT INTO NEW_S (SNO, SNAME)
SELECT SNO, SNAME
FROM S
WHERE CITY IN ('BLORE', 'MADRAS');
Inserting many rows, all/some
columns at a time.
58. 58
SQL - UPDATE
With or without WHERE clause
UPDATE S
SET CITY = 'KANPUR'
WHERE SNO = 'S3';
UPDATE EMP
SET SAL = 1.10 * SAL;
59. 59
SQL - DELETE FROM
Used to delete data from tables
Even if all data is deleted, the table
itself is not deleted.
60. 60
SQL - DELETE FROM
With or without WHERE clause
DELETE FROM SP
WHERE PNO = 'P1';
DELETE FROM SP;
61. 61
Retrieving a column from the table
Get the names of all the suppliers
SELECT SNAME
FROM S;
64. 64
SQL - ALL, DISTINCT
Get all part names
SELECT PName
FROM P;
Get all distinct part names
SELECT DISTINCT PName
FROM P;
65. 65
Find the difference in output
SELECT SNO FROM S
SELECT DISTINCT SNO FROM S
Assumption: SNO is Primary Key
66. 66
Retrieving a subset of rows
For retrieval of rows based on some
condition, the syntax is
SELECT col1, col2, ...
FROM table_name
WHERE <search condition>;
67. 67
Relational operators
= , < , > , <= , >= , != or <>
Get SNO for all suppliers in
Paris
SELECT SNO
FROM S
WHERE CITY = 'Paris';
String comparisons are case-sensitive, so the literal must
match the stored value 'Paris' exactly.
68. 68
Logical operators
Logical operators: AND, OR, and NOT
Get SNO for all suppliers in Paris
with status greater than 10
SELECT SNO
FROM S
WHERE CITY = 'Paris' AND STATUS > 10;
69. 69
Using logical operators
Get PNO for parts whose weight is
one of 12, 16 or 17
SELECT PNO
FROM P
WHERE WEIGHT = 16
OR WEIGHT = 12
OR WEIGHT = 17;
70. 70
Retrieval using Comparison operators
Get parts whose weight is in the range
16 to 19 (inclusive)
SELECT * FROM P
WHERE WEIGHT BETWEEN 16 AND 19;
• A BETWEEN condition can be rewritten using AND:
WEIGHT >= 16 AND WEIGHT <= 19
71. 71
Retrieval using IN
Get list of PNO for parts whose
weight is one of 12, 16 or 17
SELECT PNO
FROM P
WHERE WEIGHT IN (12, 16, 17);
• An IN condition can be rewritten using OR:
WEIGHT = 12 OR WEIGHT = 16 OR WEIGHT = 17
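BETWEEN and IN are shorthand for conditions that can also be written with AND and OR. The sketch below checks this against the Product table P shown earlier. It uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption for runnability); the helper `pnos` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE P (PNO TEXT PRIMARY KEY, PNAME TEXT, COLOR TEXT, WEIGHT INTEGER, CITY TEXT)"
)
conn.executemany("INSERT INTO P VALUES (?, ?, ?, ?, ?)", [
    ("P1", "Nut", "Red", 12, "London"),
    ("P2", "Bolt", "Green", 17, "Paris"),
    ("P3", "Screw", "Blue", 17, "Rome"),
    ("P4", "Screw", "Red", 14, "London"),
    ("P5", "Cam", "Blue", 12, "Paris"),
    ("P6", "Cog", "Red", 19, "London"),
])


def pnos(condition):
    """Return the part numbers matching a WHERE condition, sorted by PNO."""
    sql = "SELECT PNO FROM P WHERE " + condition + " ORDER BY PNO"
    return [row[0] for row in conn.execute(sql)]


between_rows = pnos("WEIGHT BETWEEN 16 AND 19")
and_rows = pnos("WEIGHT >= 16 AND WEIGHT <= 19")
in_rows = pnos("WEIGHT IN (12, 16, 17)")
or_rows = pnos("WEIGHT = 12 OR WEIGHT = 16 OR WEIGHT = 17")

print(between_rows, between_rows == and_rows)
print(in_rows, in_rows == or_rows)
```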
72. 72
Retrieval using IN
Get the list of supplier numbers in
the cities Rome and Paris
SELECT SNO
FROM S
WHERE CITY IN ('Paris', 'Rome');
73. 73
Use of LIKE
Get all details of parts whose
name begins with the character C
SELECT *
FROM P
WHERE PNAME LIKE 'C%';
Get all details of parts whose number is
P followed by exactly one character
SELECT *
FROM P
WHERE PNO LIKE 'P_';
74. 74
Get PNO for parts whose weight is
unknown (Blank or not added)
SELECT PNO FROM P
WHERE WEIGHT IS NULL;
SQL- Retrieval using IS NULL
75. 75
SQL - Retrieval using IS NOT NULL
Get all details of shipments whose
quantity is known
SELECT * FROM SP
WHERE QTY IS NOT NULL;
76. 76
Column titles using AS
SELECT SNO AS "Supplier Number",
CITY AS "Supplier City"
FROM S;
78. 78
SQL- Retrieval using ORDER BY
Get SNO and STATUS for suppliers in
Paris in descending order of status
SELECT SNO, STATUS
FROM S
WHERE CITY = 'Paris'
ORDER BY STATUS DESC;
79. 79
SQL- Retrieval using ORDER BY
Sort columns can be given by name:
SELECT CITY, COLOR, WEIGHT
FROM P
WHERE WEIGHT IN (12, 17)
ORDER BY CITY, COLOR DESC;
or by position in the SELECT list:
SELECT CITY, COLOR, WEIGHT
FROM P
WHERE WEIGHT IN (12, 17)
ORDER BY 1 DESC, 2;
80. 80
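SQL - TOP
For parts that are supplied in quantity < 1100,
get top 5 part numbers and quantity supplied,
in descending order
Oracle assigns ROWNUM before ORDER BY is applied, so the
rows are sorted in an inline view first:
SELECT PNO, Qty
FROM (SELECT PNO, Qty
FROM SP
WHERE Qty < 1100
ORDER BY Qty DESC)
WHERE ROWNUM < 6;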
82. 82
Queries involving expressions
Get PNO and WEIGHT in grams for
all parts (WEIGHT is stored in kg)
SELECT PNO AS "Part",
WEIGHT*1000 AS "Weight in grams"
FROM P;
83. 83
Queries involving expressions
Get PNO and WEIGHT in grams
for all parts for which 50% of the weight
is more than 6 kg
SELECT PNO,
WEIGHT*1000 AS "WEIGHT IN GMS"
FROM P
WHERE (WEIGHT/2 > 6);
84. 84
SQL - Aggregate functions
Used when information you want to
extract from a table does not relate to
what is contained in the individual rows,
but has to do with the data in the entire
table taken as a set.
SUM( ) , AVG( ) , MAX( ) ,
MIN( ), COUNT( )
85. 85
SQL - Aggregate functions
Each of these functions performs an
action that draws data from a set of rows
rather than only from a single row.
Aggregate functions are used in place of
column names in the SELECT statement.
86. 86
Aggregate function - SUM
Adds up the values in the specified column
Column must be numeric data type
Value of the sum must be within the range of that
data type
Get total qty of P2 supplied
SELECT SUM(QTY)
FROM SP
WHERE PNO = 'P2';
87. 87
Aggregate function - AVG
Returns the average of all the values in the
specified column
Column must be numeric data type
Get average qty of shipment
supplied by S1
SELECT AVG(QTY)
FROM SP
WHERE SNO = 'S1';
88. 88
Aggregate function - MAX
Returns the largest value that occurs in the
specified column
Column need not be numeric data type
Get maximum qty of shipment
supplied by S1
SELECT MAX(QTY)
FROM SP
WHERE SNO = 'S1';
89. 89
Aggregate function - MIN
Returns the smallest value that occurs in the
specified column
Column need not be numeric data type
Get minimum qty of
shipment supplied by S1
SELECT MIN(QTY)
FROM SP
WHERE SNO = 'S1';
90. 90
Get total number of suppliers
SELECT COUNT(*)
FROM S;
Aggregate function - COUNT
Returns the number of rows in the table
91. 91
Aggregate function - COUNT
Get number of shipments for P2
SELECT COUNT(QTY)
FROM SP
WHERE PNO = 'P2';
COUNT(*) = number of rows
COUNT(QTY) = number of rows in which QTY
is not NULL
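The difference between the two forms of COUNT only shows up when the column contains NULLs. The sketch below illustrates it with Python's built-in sqlite3 module standing in for Oracle (an assumption for runnability). The shipment rows are hypothetical, since the SP table contents are not listed in this material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SP (SNO TEXT, PNO TEXT, QTY INTEGER)")
# Hypothetical shipments; the S3 shipment of P2 has an unknown (NULL) quantity.
conn.executemany("INSERT INTO SP VALUES (?, ?, ?)", [
    ("S1", "P2", 200),
    ("S2", "P2", 400),
    ("S3", "P2", None),
    ("S1", "P1", 300),
])

count_rows, count_qty = conn.execute(
    "SELECT COUNT(*), COUNT(QTY) FROM SP WHERE PNO = 'P2'"
).fetchone()

print(count_rows)  # 3: every P2 row is counted
print(count_qty)   # 2: the row with a NULL quantity is skipped
```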
92. 92
Using two or more aggregate functions
SELECT MIN(QTY), MAX(QTY)
FROM SP
WHERE PNO = 'P2';
93. 93
SQL - Retrieval using GROUP BY
Related rows can be grouped together by
GROUP BY clause by specifying a column as
a grouping column.
In the output table all the rows with an
identical value in the grouping column will
be grouped together.
GROUP BY is normally used together with aggregate
functions
94. 94
For each part supplied get the part
number and the total shipment quantity
SELECT PNO, SUM(QTY)
FROM SP
GROUP BY PNO;
Retrieval using GROUP BY
95. 95
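Retrieval using GROUP BY
Get SNO, PNO, total qty for each part supplied
-- Invalid: SNO is neither in the GROUP BY clause nor inside an aggregate function
SELECT SNO, PNO, SUM(QTY)
FROM SP
GROUP BY PNO;
-- Valid: every non-aggregated column in the SELECT list is a grouping column
SELECT SNO, PNO, SUM(QTY)
FROM SP
GROUP BY SNO, PNO;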
96. 96
Get PNO for parts which have more than two
shipments
SELECT PNO,COUNT(*)
FROM SP
GROUP BY PNO
HAVING COUNT(*)>2;
Retrieval using HAVING
Used to specify a condition on group
97. 97
Retrieval using HAVING
Get supplier numbers who have at least two
shipments
SELECT SNO , COUNT(*)
FROM SP
GROUP BY SNO
HAVING COUNT(*)>=2;
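The GROUP BY and HAVING queries above can be tried on a small data set. The sketch below uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption for runnability). The shipment rows are hypothetical, since the SP table contents are not listed in this material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SP (SNO TEXT, PNO TEXT, QTY INTEGER)")
# Hypothetical shipment data
conn.executemany("INSERT INTO SP VALUES (?, ?, ?)", [
    ("S1", "P1", 300), ("S1", "P2", 200), ("S1", "P3", 400),
    ("S2", "P1", 300), ("S2", "P2", 400),
    ("S3", "P2", 200), ("S4", "P2", 200),
])

# One output row per part, with the total quantity shipped
totals = conn.execute(
    "SELECT PNO, SUM(QTY) FROM SP GROUP BY PNO ORDER BY PNO"
).fetchall()

# HAVING filters whole groups: parts with more than two shipments
busy_parts = conn.execute(
    "SELECT PNO, COUNT(*) FROM SP GROUP BY PNO HAVING COUNT(*) > 2"
).fetchall()

# Suppliers with at least two shipments
busy_suppliers = conn.execute(
    "SELECT SNO, COUNT(*) FROM SP GROUP BY SNO HAVING COUNT(*) >= 2 ORDER BY SNO"
).fetchall()

print(totals)
print(busy_parts)
print(busy_suppliers)
```

WHERE filters individual rows before grouping, while HAVING filters the groups afterwards, which is why the COUNT(*) conditions have to go in HAVING.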
98. 98
Retrieval using UNION
Get a list of all the part cities and supplier cities
SELECT CITY
FROM P
UNION
SELECT CITY
FROM S;
A UNION query combines the rows returned by two or
more queries into one result. Duplicates are removed
unless UNION ALL is used.
99. 99
Retrieval using INTERSECT
Get a list of the cities common to parts and suppliers
SELECT CITY
FROM P
INTERSECT
SELECT CITY FROM S;
An INTERSECT query returns only the rows that
appear in the results of both queries;
duplicates are removed.
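The set operators can be compared side by side on the city columns of the Supplier and Product tables shown earlier. This is a sketch using Python's built-in sqlite3 module as a stand-in for Oracle (an assumption for runnability); only the key and CITY columns are loaded, and the helper `cities` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE S (SNO TEXT PRIMARY KEY, CITY TEXT)")
conn.execute("CREATE TABLE P (PNO TEXT PRIMARY KEY, CITY TEXT)")
conn.executemany("INSERT INTO S VALUES (?, ?)", [
    ("S1", "London"), ("S2", "Paris"), ("S3", "Paris"),
    ("S4", "London"), ("S5", "Athens"),
])
conn.executemany("INSERT INTO P VALUES (?, ?)", [
    ("P1", "London"), ("P2", "Paris"), ("P3", "Rome"),
    ("P4", "London"), ("P5", "Paris"), ("P6", "London"),
])


def cities(sql):
    """Run a one-column query and return its values as a list."""
    return [row[0] for row in conn.execute(sql)]


union = cities("SELECT CITY FROM P UNION SELECT CITY FROM S ORDER BY CITY")
union_all = cities("SELECT CITY FROM P UNION ALL SELECT CITY FROM S")
common = cities("SELECT CITY FROM P INTERSECT SELECT CITY FROM S ORDER BY CITY")

print(union)           # each city once
print(len(union_all))  # 11: all 6 part rows plus all 5 supplier rows
print(common)          # cities that have both a part and a supplier
```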
100. 100
Independent sub-queries
Inner query is independent of outer
query.
Inner query is executed first and the
results are stored.
Outer query then runs on the stored
results.
101. 101
Retrieval using SUB QUERIES
Get supplier names for all
suppliers who supply part P2
SELECT SNAME
FROM S
WHERE SNO IN
(SELECT SNO
FROM SP
WHERE PNO = 'P2');
102. 102
Retrieval using SUB QUERIES
Get SNO for suppliers who are
located in the same city as S1
SELECT SNO
FROM S
WHERE CITY =
(SELECT CITY
FROM S
WHERE SNO = 'S1');
103. 103
Retrieval using SUB QUERIES
Get SNO for suppliers who supply
at least one part supplied by S2
SELECT SNO
FROM SP
WHERE PNO IN
(SELECT PNO
FROM SP
WHERE SNO = 'S2');
104. 104
Retrieval using SUB QUERIES
Get SNO for suppliers who have status less than
the status of 'S1'
SELECT SNO
FROM S
WHERE STATUS <
(SELECT STATUS
FROM S
WHERE SNO = 'S1');
105. 105
Get supplier numbers for suppliers with
status less than the current maximum in the
supplier table
SELECT SNO
FROM S
WHERE STATUS <
(SELECT MAX(STATUS)
FROM S);
Retrieval using SUB QUERIES
106. 106
Retrieval using SUB QUERIES
Get SNO for suppliers who do not
supply any part supplied by S2
SELECT SNO FROM SP
WHERE SNO NOT IN
(SELECT SNO FROM SP
WHERE PNO IN
(SELECT PNO FROM SP
WHERE SNO = 'S2'));
107. 107
Correlated Sub Queries
In a correlated sub-query, the inner
query refers to a table named in the
FROM clause of the outer query.
The inner query is therefore executed
once for each row of
the outer query.
108. 108
Retrieval - Correlated Sub Queries
Get PNO for all parts supplied by
more than one supplier
SELECT DISTINCT PNO
FROM SP X
WHERE PNO IN
(SELECT PNO
FROM SP Y
WHERE Y.SNO <> X.SNO);
109. 109
Retrieval - Correlated Sub Queries
Get SNO for suppliers supplying some
project with P1 in a quantity greater than the
average qty of P1 supplied to that project
(assuming a shipment table SPJ that also records the project number JNO)
SELECT DISTINCT SNO
FROM SPJ X
WHERE PNO = 'P1'
AND QTY > (SELECT AVG(QTY)
FROM SPJ Y
WHERE PNO = 'P1'
AND X.JNO = Y.JNO);
111. 111
Inner Joins
These are the most common type
of join. An INNER JOIN is written
in the FROM clause. It
combines records from two tables
whenever there are matching
values in a field common to both
tables.
112. 112
Get all combinations of supplier and
part information such that the
supplier and part are co-located.
SELECT S.*, P.*
FROM S, P
WHERE S.CITY=P.CITY;
Retrieval from Multiple tables
113. 113
Get SNO,PNO combinations where
the part’s city follows the supplier’s
alphabetically
SELECT SNO, PNO
FROM S, P
WHERE S.CITY<P.CITY;
Retrieval from Multiple tables
114. 114
Get SNO and PNO for co-located suppliers
and parts omitting suppliers with status<20
SELECT P.PNO, S.SNO
FROM P, S
WHERE S.CITY = P.CITY
AND S.STATUS >= 20;
Retrieval from Multiple tables
115. 115
Outer join
An ordinary (inner) join returns only
the rows that satisfy the join
condition.
It is often worthwhile to also retrieve
the rows of one table that have no
matching row in the other table.
An outer join returns these unmatched
rows as well, with NULL in the columns
that come from the other table.
116. 116
Example of left-join
List all SNO with QTY supplied or
SNO which have not yet supplied
any QTY
SELECT S.SNO, SP.QTY
FROM S LEFT OUTER JOIN SP
ON S.SNO = SP.SNO;
All unmatched rows of S are also selected
In older Oracle syntax the same join is written
WHERE S.SNO = SP.SNO (+)
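The effect of a left outer join is easiest to see on data where some suppliers have no shipments. The sketch below uses Python's built-in sqlite3 module as a stand-in for Oracle, with the ANSI LEFT OUTER JOIN syntax (both are assumptions made for runnability). The supplier numbers come from the Supplier table shown earlier; the two shipment rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE S (SNO TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE SP (SNO TEXT, PNO TEXT, QTY INTEGER)")
conn.executemany("INSERT INTO S VALUES (?)",
                 [("S1",), ("S2",), ("S3",), ("S4",), ("S5",)])
# Hypothetical shipments: only S1 and S2 have supplied anything
conn.executemany("INSERT INTO SP VALUES (?, ?, ?)",
                 [("S1", "P1", 300), ("S2", "P2", 400)])

rows = conn.execute("""
    SELECT S.SNO, SP.QTY
    FROM S LEFT OUTER JOIN SP ON S.SNO = SP.SNO
    ORDER BY S.SNO
""").fetchall()

# Unmatched suppliers are kept, with None (NULL) as their quantity
for sno, qty in rows:
    print(sno, qty)
```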
117. 117
Self join
Joining a table with itself is a self-join.
List names of sales people and their Managers
SELECT A.Name, B.Name
FROM SalesReps A, SalesReps B
WHERE A.Mgr = B.Empl_Num;
118. 118
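Example of self join
Get all pairs of SNO who are co-located
SELECT FIRST.SNO, SECOND.SNO
FROM S FIRST,
S SECOND
WHERE FIRST.CITY = SECOND.CITY
AND FIRST.SNO < SECOND.SNO;
The last condition drops pairs of a supplier with itself and
keeps only one ordering of each pair.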
119. 119
Get PNO’s supplied in the same QTY by
at least one different supplier
SELECT PNO
FROM SP A
WHERE EXISTS
(SELECT *
FROM SP B
WHERE A.PNO=B.PNO
AND A.QTY=B.QTY AND A.SNO<>B.SNO);
Retrieval using EXISTS
120. 120
Get all part names from parts table
which have been shipped
SELECT PNAME
FROM P
WHERE EXISTS
(SELECT *
FROM SP
WHERE SP.PNO=P.PNO);
Retrieval using EXISTS
121. 121
Get the list of all prospective suppliers, i. e.,
suppliers for whom no shipments exist yet
SELECT SNAME
FROM S
WHERE NOT EXISTS
(SELECT *
FROM SP
WHERE SP.SNO=S.SNO);
Retrieval using NOT EXISTS
122. 122
Views
The tables that we have been dealing with up to now
are called base tables. These tables contain data.
Views are tables whose contents are taken or derived
from other tables.
Views are like windows through which you view
information that is stored in base tables.
Views are used in queries & DML statements
just as base tables are. A view may be updatable (simple
queries) or not (e.g. views defined with GROUP BY).
A view is actually a query that is executed whenever
the view is the subject of a command.
123. 123
Creating a VIEW
Create a view from Supplier table
CREATE VIEW ViewSupplier
AS SELECT * FROM Supplier;
Create a view from Supplier table
for City = 'BHU'
CREATE VIEW ViewSupplier
AS SELECT S.SNO, S.SName, S.City
FROM Supplier S WHERE City = 'BHU';
124. 124
Naming columns in a View
CREATE VIEW ViewSupplier (Supp#, Name)
AS SELECT S.SNO, S.SName
FROM Supplier S
WHERE City = 'BHU';
Often we do not specify new field names, but if
we do, we will have to do so for every field in
the view.
To work with a view we write:
SELECT * FROM ViewSupplier;
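Because a view is a stored query rather than a copy of the data, it reflects later changes to its base table. The sketch below shows this with Python's built-in sqlite3 module standing in for Oracle (an assumption for runnability). The Supplier rows are hypothetical, and the view column is named SuppNo here because SQLite does not accept # in an unquoted identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Supplier (SNO TEXT PRIMARY KEY, SName TEXT, City TEXT);
    INSERT INTO Supplier VALUES ('S1', 'Smith', 'BHU');
    INSERT INTO Supplier VALUES ('S2', 'Jones', 'BOM');
    CREATE VIEW ViewSupplier (SuppNo, Name) AS
        SELECT S.SNO, S.SName FROM Supplier S WHERE City = 'BHU';
""")

before = conn.execute("SELECT * FROM ViewSupplier ORDER BY SuppNo").fetchall()

# Insert into the base table; the view picks the new row up automatically
conn.execute("INSERT INTO Supplier VALUES ('S3', 'Blake', 'BHU')")
after = conn.execute("SELECT * FROM ViewSupplier ORDER BY SuppNo").fetchall()

print(before)
print(after)
```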
125. 125
INDEX
An index can be created and dropped.
CREATE UNIQUE INDEX Sup_Index
ON Supplier (SName);
DROP INDEX Sup_Index;
By default the RDBMS creates an index on the
Primary Key
126. 126
Data Control Language
Users who create tables have control
over those tables.
Privileges determine whether
or not a particular user can perform a
command.
They are given and taken away with two
SQL commands:
GRANT ….. TO …
REVOKE ….. FROM ...
127. 127
Three forms of GRANT syntax
Privileges on a specified database
Privileges on specified tables or views
System privileges
128. 128
GRANT …. TO ….
Used to grant access to new users;
Permission can be granted for all DML
commands or for SELECT, UPDATE,
INSERT, DELETE individually or in some
combination;
Permission is granted on a database or
table or a view;
Permission may also be granted to
allow the new user to further grant
permissions.
129. 129
GRANT SELECT ON
Suppose Ram owns a Customer table &
wants to let Ashutosh perform queries
on it. Ram would enter the following
command:
GRANT SELECT
ON Customer TO Ashutosh;
Ashutosh can perform only queries;
he cannot perform any action that
affects the values in Customer table.
130. 130
GRANT INSERT ON
GRANT INSERT ON Customer to Shyam ;
Would let Shyam perform a new row insertion in Customer table.
GRANT SELECT, INSERT ON Customer to Shyam;
Would let Shyam perform Queries & Insertion on Customer table.
GRANT INSERT ON Customer to Shyam, Tom;
Would let Shyam & Tom perform a new row insertion in Customer
table.
131. 131
Example of REVOKE
REVOKE INSERT
ON Customer
FROM Ashutosh;
REVOKE SELECT, INSERT
ON Customer
FROM Ashutosh, Tom;
133. 133
Concurrency : The term concurrency
refers to the fact that the DBMS allows
many transactions to access the same
database at the same time.
Transaction Management
Concurrency - Definition
134. 134
Lost Update
Dirty Read / Uncommitted Dependency
Incorrect Summary / Inconsistent Analysis
Phantom Record
Transaction Management
Concurrency Problems
135. 135
Transaction Management
Concurrency Problems
The Lost Update Problem -
Trans-A retrieves record P at time
T1.
Trans-B retrieves the same record at
time T2.
At T3, Trans-A updates the record P
on the basis of the values seen at
time T1.
At T4, Trans-B updates the same
record P on the basis of the values
seen at time T2, which are the same
as those seen at time T1.
Trans-A's update is lost at time T4,
because Trans-B overwrites it
without even looking at it.
Trans-A      Time  Trans-B
Retrieve P   T1
             T2    Retrieve P
Update P     T3
             T4    Update P
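The interleaving in the table can be replayed step by step. The following is a deliberately simplified Python sketch with no real DBMS involved: the starting value 100 and the increments 10 and 20 are hypothetical, and the four statements stand for the four time steps.

```python
record = {"P": 100}         # shared record P (hypothetical starting value)

a_seen = record["P"]        # T1: Trans-A retrieves P
b_seen = record["P"]        # T2: Trans-B retrieves P
record["P"] = a_seen + 10   # T3: Trans-A updates P from the value it saw at T1
record["P"] = b_seen + 20   # T4: Trans-B updates P from the value it saw at T2

# If the two transactions had run one after the other, both changes would apply
serial_result = 100 + 10 + 20

print(record["P"])    # 120: Trans-A's +10 has been overwritten
print(serial_result)  # 130
```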
136. 136
Transaction Management
Concurrency Problems
The Uncommitted Dependency Problem – Also Called 'Dirty Read' -
First example:
Trans-A      Time  Trans-B
             T1    Update P
Retrieve P   T2
             T3    Rollback
Second example:
Trans-A      Time  Trans-B
             T1    Update P
Update P     T2
             T3    Rollback
In the first example, Trans-B updates record P at
time T1.
Trans-A sees the uncommitted
update at time T2.
This update is undone at T3 by
Trans-B.
Trans-A is therefore operating
on some data that no longer
exists.
In the second example, Trans-A
not only depends on an uncommitted
change at time T2, but it also
loses its own update at time T3,
because the rollback at T3
causes record P to be restored
to its value prior to T1.
137. 137
Transaction Management
Concurrency Problems
The Inconsistent Analysis Problem / Incorrect Summary -
ACC1 = 40   ACC2 = 50   ACC3 = 30
Trans-A : Summing
balances of all accounts
Trans-B : Transfer an
amount 10, from ACC3 to
ACC1
Trans-A                 Time  Trans-B
Retrieve ACC1, Sum=40   T1
Retrieve ACC2, Sum=90   T2
                        T3    Retrieve ACC3
                        T4    Update ACC3: 30 -> 20
                        T5    Retrieve ACC1
                        T6    Update ACC1: 40 -> 50
                        T7    Commit
Retrieve ACC3,          T8
Sum=110, not 120
Trans-A reads the ACC1 and ACC2
balances and adds them to
the sum.
Then Trans-B moves an
amount of 10 from ACC3 to ACC1.
Then Trans-A reads the ACC3
balance and updates the sum.
The result, 110, is incorrect.
Transaction A has seen
the database in an
inconsistent state and has
therefore performed an
INCONSISTENT ANALYSIS.
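The schedule can be replayed in a few lines to see where the missing 10 goes. This is a simplified Python sketch with no real DBMS involved; the balances and the transfer amount are the ones from the example, and each statement is labelled with the time step it stands for.

```python
accounts = {"ACC1": 40, "ACC2": 50, "ACC3": 30}

total = 0
total += accounts["ACC1"]   # T1: Trans-A reads ACC1, sum = 40
total += accounts["ACC2"]   # T2: Trans-A reads ACC2, sum = 90
accounts["ACC3"] -= 10      # T3-T4: Trans-B reads and updates ACC3, 30 -> 20
accounts["ACC1"] += 10      # T5-T6: Trans-B reads and updates ACC1, 40 -> 50
                            # T7: Trans-B commits
total += accounts["ACC3"]   # T8: Trans-A reads ACC3, now 20

true_total = sum(accounts.values())

print(total)       # 110: Trans-A saw ACC1 before the transfer and ACC3 after it
print(true_total)  # 120: the transfer itself did not change the overall balance
```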
138. 138
Transaction Management
Concurrency Problems
The Phantom Record Problem -
A new record inserted into the
database while a transaction that needs it is in
progress may cause an
incorrect result.
This may happen because the new
record may be included in one query
but not the other.
Trans-A first counts the total number
of employees (e.g. 15)
Then Trans-B inserts a new
employee record
Then Trans-A calculates the total of
the salaries of all employees, which
includes the salary of the new
employee
Then Trans-A divides the total salary
by the count of employees, which is
still 15 in Trans-A. This gives an incorrect
average salary.
Trans-A                              Time  Trans-B
Count no. of employees, Count=15     T1
                                     T2    Insert new employee
Calculate total salary               T3
(includes new employee)
Calculate Average =                  T4
Total Salary / Count (15)
139. 139
Transaction Management
Locking
Locking – A Concurrency Control
Technique that lets a transaction acquire
a lock on the object of interest (typically,
a database record) so as to prevent other
transactions from changing it.
Types of Locks –
Shared Locks (S locks / Read locks)
Exclusive Locks (X locks / Write locks)
140. 140
Transaction Management
How does Locking work?
If transaction A holds an exclusive (X) lock
on record P, then a request from some other
distinct transaction B for a lock of either type
on P will be denied.
If transaction A holds a shared (S) lock on
record P, then:
A request from some distinct transaction
B for an X lock on P will be denied;
A request from some distinct transaction
B for an S lock on P will be granted (i.e.
B will now also hold an S lock on P)
Compatibility Matrix
(rows: lock held by Transaction A; columns: lock requested by Transaction B;
 "-" means no lock)
       X   S   -
X      N   N   Y
S      N   Y   Y
-      Y   Y   Y
A transaction that wishes to retrieve a record must first acquire an S lock on
that record.
A transaction that wishes to update a record must first acquire an X lock on
that record. Alternatively, if it already holds an S lock on the record, then it
must promote that S lock to X level.
If a lock request from transaction B is denied, then it goes into a wait state. B
will wait until A’s lock is released.
X locks and S locks are held until end-of-transaction (COMMIT or ROLLBACK).
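The rules above can be sketched as a small lock table. This is an illustrative model only, not how Oracle or any particular DBMS implements locking; the class name `LockTable` and its methods are hypothetical, and a denied request simply returns False instead of putting the caller into a wait state.

```python
# Compatibility of (lock already held by another transaction, lock requested).
COMPATIBLE = {
    ("X", "X"): False, ("X", "S"): False,
    ("S", "X"): False, ("S", "S"): True,
}


class LockTable:
    def __init__(self):
        self.held = {}  # record -> {transaction: lock mode}

    def request(self, txn, record, mode):
        """Return True if the lock is granted, False if txn must wait."""
        holders = self.held.setdefault(record, {})
        for other, other_mode in holders.items():
            if other != txn and not COMPATIBLE[(other_mode, mode)]:
                return False
        # Granted: record the lock, promoting S to X if needed
        # (an X lock already held is never downgraded).
        if holders.get(txn) != "X":
            holders[txn] = mode
        return True

    def release_all(self, txn):
        """Called at end-of-transaction (COMMIT or ROLLBACK)."""
        for holders in self.held.values():
            holders.pop(txn, None)


locks = LockTable()
a_reads = locks.request("A", "P", "S")     # granted
b_reads = locks.request("B", "P", "S")     # granted, S is compatible with S
b_updates = locks.request("B", "P", "X")   # denied while A holds S
locks.release_all("A")                     # A commits
b_retries = locks.request("B", "P", "X")   # granted, S promoted to X
```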
141. 141
Transaction Management
Concurrency Problems Revisited
The Lost Update Problem -
Trans-A                     Time   Trans-B
Retrieve P                  T1
(acquires S lock on P)
                            T2     Retrieve P
                                   (acquires S lock on P)
Update P                    T3
(requests X lock on P)
Wait
Wait                        T4     Update P
Wait                               (requests X lock on P)
Wait                               Wait
Wait                               Wait
Locking solves the
Lost update
problem but
introduces
another problem -
Deadlock
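The schedule above can be traced with a few lines of code. This is a sketch under the S/X rules of the previous slide, using a hypothetical `request` helper that reports a denied lock instead of waiting.

```python
held = {"P": {}}  # record -> {transaction: lock mode}


def request(txn, record, mode):
    """Grant the lock unless another transaction holds an incompatible one."""
    others = {t: m for t, m in held[record].items() if t != txn}
    if mode == "S":
        ok = all(m == "S" for m in others.values())
    else:
        # An X lock is incompatible with any lock held by someone else.
        ok = not others
    if ok:
        held[record][txn] = mode
    return ok


t1 = request("A", "P", "S")  # T1: granted
t2 = request("B", "P", "S")  # T2: granted
t3 = request("A", "P", "X")  # T3: denied, A waits for B's S lock
t4 = request("B", "P", "X")  # T4: denied, B waits for A's S lock

# Each transaction is waiting for a lock the other will not release.
deadlocked = (not t3) and (not t4)
print(deadlocked)
```

Neither update is lost, because neither update happens: each transaction holds the S lock that blocks the other's promotion to X.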
142. 142
Transaction Management
Concurrency Problems Revisited
The Uncommitted Dependency Problem
Trans-A                     Time   Trans-B
                            T1     Update P
                                   (acquires X lock on P)
Update P                    T2
(requests X lock on P)
Wait
Wait                        T3     Commit/Rollback
                                   (releases X lock on P)
Resume: Update P
(acquires X lock on P)
Trans-A waits till
transaction B
completes its
transaction.
Hence A sees
committed values
of P.
Locking eliminates the
Uncommitted Dependency
Problem.
143. 143
Transaction Management
Concurrency Problems Revisited
The Inconsistent Analysis Problem -
ACC1 = 40 ACC2 = 50 ACC3=30
Trans-A                        Time   Trans-B
Retrieve ACC1 : Sum=40         T1
(acquire an S lock on ACC1)
Retrieve ACC2 : Sum=90         T2
(acquire an S lock on ACC2)
                               T3     Retrieve ACC3
                                      (acquire an S lock on ACC3)
                               T4     Update ACC3 : 30 -> 20
                                      (acquire an X lock on ACC3)
                               T5     Retrieve ACC1
                                      (acquire an S lock on ACC1)
                               T6     Update ACC1 : 40 -> 50
                                      (request an X lock on ACC1)
                                      Wait
Retrieve ACC3                  T7     Wait
(request S lock on ACC3)
Wait....                              Wait....
Locking solves the
Inconsistent
Analysis Problem,
but only by forcing a
deadlock.
144. 144
Transaction Management
Concurrency Problems Revisited
The Phantom Record Problem -
Use index locking to keep new records from
being inserted.
The transaction for calculating the average salary
should do the following –
Lock(key_emp_index)
Do salary calculation
Count employees
Unlock(key_emp_index)
Calculate Average salary
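The lock/unlock sequence above can be sketched with an ordinary mutex. This is only an analogy: `threading.Lock` is a stand-in for the index lock `key_emp_index`, the salary figures are invented, and a real DBMS would lock index entries rather than a Python object.

```python
import threading

# Hypothetical stand-in for the index lock named on the slide.
key_emp_index = threading.Lock()
salaries = [1000] * 15  # invented data


def insert_employee(salary):
    # An insert must also take the index lock, so it cannot slip in
    # between the total and the count below.
    with key_emp_index:
        salaries.append(salary)


def average_salary():
    with key_emp_index:        # Lock(key_emp_index)
        total = sum(salaries)  # Do salary calculation
        count = len(salaries)  # Count employees
    # Unlock(key_emp_index) happens on leaving the "with" block
    return total / count       # Calculate Average salary


inserter = threading.Thread(target=insert_employee, args=(2500,))
inserter.start()
avg = average_salary()
inserter.join()
print(avg)
```

Whichever thread gets the lock first, the total and the count are taken from the same set of rows, so the average is consistent either with or without the new employee.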
145. 145
Transaction Management
Intent Locking
Also called ‘Preemptive Locking’
Locking Granularity – It refers to the size
of the objects that can be locked.
Field/Column
Row or Tuple
Table
Database
Granularity Tradeoffs – Concurrency Vs.
Overheads
Why Intent Locks ?
146. 146
Transaction Management
Types of Intent Locking
X and S locks are applicable to tables as well as records.
S - A transaction T can tolerate concurrent readers, but not
concurrent updaters
X – A transaction T cannot tolerate any concurrent access to table R
at all. T itself might or might not update individual records in R.
Additionally, there are 3 intent locks –
Intent Shared (IS) Lock – A transaction T intends to set S locks on
individual records in table R
Intent Exclusive (IX) Lock – A transaction T might update individual
records in table R and will therefore set X locks on those records
Shared Intent Exclusive (SIX) Lock – Combines S and IX; i.e. T can
tolerate concurrent readers, but not concurrent updaters, in table R,
plus T might update individual records in R and will therefore set X
locks on those records.
147. 147
Transaction Management
Intent Locking Protocol
Compatibility Matrix
       X    SIX  IX   S    IS   -
X      N    N    N    N    N    Y
SIX    N    N    N    N    Y    Y
IX     N    N    Y    N    Y    Y
S      N    N    N    Y    Y    Y
IS     N    Y    Y    Y    Y    Y
-      Y    Y    Y    Y    Y    Y
Acquiring an X lock on a given
table implicitly acquires an X lock
on all records of that table.
Acquiring an S or SIX lock on a
given table implicitly acquires an
S lock on all records of that table.
Before a transaction can acquire
an S lock on a given record, it must
first acquire an IS (or stronger)
lock on the table containing that
record.
Before a transaction can acquire an X lock on a given record, it must first
acquire an IX (or stronger) lock on the table containing that record.
Before a transaction can release a lock on a given table, it must first release
all locks it holds on all records in that table.
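The matrix and the protocol rules above can be encoded directly. This is a minimal sketch; the names `compatible` and `may_lock_record` are hypothetical, and "or stronger" is read here as the usual ordering in which X is stronger than SIX, SIX is stronger than both S and IX, and both of those are stronger than IS.

```python
MODES = ["X", "SIX", "IX", "S", "IS"]

# One row per lock held; columns follow the order of MODES.
MATRIX = {
    "X":   "NNNNN",
    "SIX": "NNNNY",
    "IX":  "NNYNY",
    "S":   "NNNYY",
    "IS":  "NYYYY",
}


def compatible(held, requested):
    """True if a table lock can be granted alongside one already held."""
    return MATRIX[held][MODES.index(requested)] == "Y"


# Table locks that permit a record lock of the given type (assumed ordering).
REQUIRED_TABLE_LOCK = {
    "S": {"IS", "IX", "S", "SIX", "X"},  # IS or stronger
    "X": {"IX", "SIX", "X"},             # IX or stronger
}


def may_lock_record(table_lock, record_lock):
    return table_lock in REQUIRED_TABLE_LOCK[record_lock]


print(compatible("IX", "IX"))      # two updaters of different records
print(compatible("S", "IX"))       # a table reader blocks an updater
print(may_lock_record("IS", "X"))  # IS is too weak for a record X lock
```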
148. 150
Time stamping
Mechanism for serialization of
a set of transactions in the
chronological order of start
time of these transactions.
149. 151
Time stamping
A transaction will read only those
rows that were last updated by
an older transaction; otherwise it will
roll back.
A transaction will update only
those rows that were last read
and updated by older transactions;
otherwise it will roll back.
150. 152
Time stamping
A conflict occurs when an older transaction
tries to read a value that was written
by a younger transaction,
or when an older transaction tries
to modify (write) a value already
read or written by a younger
transaction.
Both of these attempts signify that
the older transaction was “too late”
in performing the required
operation.
151. 153
Time stamps
A data item X is associated with
two timestamps
Wx the largest timestamp value of
any transaction that was allowed
to write a value of X
Rx the largest timestamp value of
any transaction that was allowed
to read the current value of X
152. 154
Read request
A transaction Ta with the
timestamp value ta issues a
read operation for the data item X,
which has the timestamp values Rx and Wx
Request succeeds if ta >= Wx
Request fails if ta < Wx
153. 155
Write request
A transaction Ta with the
timestamp value ta issues a
write operation for the data item X,
which has the timestamp values Rx and Wx
Request succeeds if ta >= Wx and
ta >= Rx
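The read and write tests can be sketched as follows. The success conditions are the ones stated on the slides; updating Rx and Wx after a successful request is an assumption that follows from their definitions as the largest timestamps allowed to read or write X. The class and function names are illustrative.

```python
class DataItem:
    """A data item X with its two timestamps."""

    def __init__(self):
        self.rx = 0  # largest timestamp of a transaction allowed to read X
        self.wx = 0  # largest timestamp of a transaction allowed to write X


def read(item, ta):
    """Return True if the read succeeds, False if Ta must roll back."""
    if ta < item.wx:
        return False  # X was written by a younger transaction
    item.rx = max(item.rx, ta)
    return True


def write(item, ta):
    """Return True if the write succeeds, False if Ta must roll back."""
    if ta >= item.wx and ta >= item.rx:
        item.wx = ta
        return True
    return False  # X was read or written by a younger transaction


x = DataItem()
results = [
    write(x, 5),  # succeeds, Wx becomes 5
    read(x, 3),   # fails: 3 < Wx
    read(x, 7),   # succeeds, Rx becomes 7
    write(x, 6),  # fails: 6 < Rx
    write(x, 8),  # succeeds, Wx becomes 8
]
print(results)
```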
155. 157
Recovery in Database Systems
A computer system like any other
mechanical or electrical device is
subject to failure.
How does a DBMS handle failures?
157. 159
Failure Types
Transaction local failures that are
detected by the application code itself
Transaction local failures that are not
explicitly handled by the application
code
System failures that affect all
transactions currently in progress
Media failures that damage the
database
159. 161
Storage Structures
Volatile Storage
does not survive system crash
Non-volatile storage
may survive system crash, but may
not survive media failure like disk
head crash
Stable Storage
Information is never lost; this is
approximated by replicating the information
160. 162
Data Access
The database system resides in non-
volatile memory and is made of fixed
length storage units called “BLOCKS”
Data transfer between the main memory
and the disk is done through blocks
blocks on disk - Physical Blocks
blocks in main memory - Buffer Blocks
161. 163
Recovery - Two phases
Actions taken during normal transaction
processing to ensure that enough
information exists to allow for recovery
from failure
Actions taken following a failure to
ensure database consistency and
transaction atomicity.
162. 164
The Log
Most important structure used
to recover databases.
Contains the before image and
after image of the data item
modified
163. 165
Log based Recovery
Maintain a log of incremental updates
Update the log before updating the
database (called the Write-Ahead Log
Rule)
The log has to be purged periodically
Use either of the following schemes
Deferred Update Scheme
Immediate Update Scheme
164. 166
Log record
Each update log record has the
following information
Transaction Identifier (Unique id of the
transaction)
Data Item Identifier (Unique id of data
item written)
Old Value (of data item prior to write)
New Value (of data item after write)
165. 167
Immediate Update Scheme
Allows the modifications to be output to
the database while the transaction is in
the ACTIVE state. (Uncommitted
modification)
Any WRITE(X) operation must be
preceded by writing the appropriate new
record in the log.
When transaction Ti commits, the record
<Ti COMMIT> is written to the log.
166. 168
Immediate Update Scheme
All log records must be written to
stable storage before the WRITE
operation to the database.
The scheme uses two recovery procedures:
UNDO and REDO
The UNDO and REDO must be
idempotent
167. 169
Idempotency Principle
Executing the UNDO / REDO operation
several times must be equivalent to
executing it once. This characteristic is
required to guarantee correct behaviour
even if the failure occurs during the
recovery process
(Korth pg. 519)
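A short sketch shows why logging old and new values makes UNDO and REDO idempotent, and why replaying the operation itself would not be. The tuple layout mirrors the log record fields listed earlier; the helper names are hypothetical.

```python
# Log record in the form <transaction, data item, old value, new value>.
record = ("To", "A", 1000, 950)


def redo(db, rec):
    """Set the data item to its new value."""
    _, item, _old, new = rec
    db[item] = new


def undo(db, rec):
    """Restore the data item to its old value."""
    _, item, old, _new = rec
    db[item] = old


db = {"A": 1000}
redo(db, record)
redo(db, record)  # a crash during recovery may cause a second REDO
after_redo = db["A"]

undo(db, record)
undo(db, record)
after_undo = db["A"]

# Counter-example: re-executing "A := A - 50" is not idempotent.
naive = {"A": 1000}
for _ in range(2):
    naive["A"] -= 50
after_naive = naive["A"]
print(after_redo, after_undo, after_naive)
```

Assigning a logged value gives the same result however many times it is repeated, whereas repeating the arithmetic applies the change twice.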
168. 170
Immediate Update Scheme
The transaction is undone if the log
contains <Ti START> but does not contain
<Ti COMMIT>
The transaction needs to be redone if the
log contains both the records <Ti
START> and <Ti COMMIT>
169. 171
Immediate Update Scheme
<To START>
<To, A, 1000, 950>
<To, B, 2000, 2050>
C R A S H
On restart, <To COMMIT> is not
found; therefore UNDO(To) is performed,
restoring A and B to their old values
170. 172
Immediate Update Scheme
<To START>
<To, A, 1000, 950>
<To, B, 2000, 2050>
<To COMMIT>
<T1 START>
<T1, C, 2500, 3000> C R A S H
On restart, <To COMMIT> is found,
and <T1 COMMIT> is not found,
therefore REDO(To) and UNDO(T1)
are performed
171. 173
Immediate Update Scheme
<To START>
<To, A, 1000, 950>
<To, B, 2000, 2050>
<To COMMIT>
<T1 START>
<T1, C, 2500, 3000>
<T1 COMMIT> C R A S H
REDO(To) and REDO(T1) are
performed
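The three restart cases above follow one rule, which can be sketched as a small recovery routine. The tuple encoding of the log and the function name `recover` are assumptions made for illustration; the database state at the time of the crash is also invented, since under immediate update any subset of the writes may have reached disk.

```python
def recover(log, db):
    """Apply UNDO then REDO to db, following the immediate update rules."""
    started = {rec[1] for rec in log if rec[0] == "START"}
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}
    to_undo = started - committed

    # UNDO: scan the log backwards, restoring old values.
    for rec in reversed(log):
        if rec[0] == "WRITE" and rec[1] in to_undo:
            _, _txn, item, old, _new = rec
            db[item] = old

    # REDO: scan the log forwards, reapplying new values.
    for rec in log:
        if rec[0] == "WRITE" and rec[1] in committed:
            _, _txn, item, _old, new = rec
            db[item] = new

    return sorted(to_undo), sorted(committed)


# The second example: To committed, T1 was active at the crash.
log = [
    ("START", "To"),
    ("WRITE", "To", "A", 1000, 950),
    ("WRITE", "To", "B", 2000, 2050),
    ("COMMIT", "To"),
    ("START", "T1"),
    ("WRITE", "T1", "C", 2500, 3000),
]
db = {"A": 1000, "B": 2050, "C": 3000}  # hypothetical state found on disk
undone, redone = recover(log, db)
print(undone, redone, db)
```

Because both procedures only assign logged values, running `recover` again on the same log leaves the database unchanged.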
172. 174
Deferred Update Scheme
Database modifications are recorded in the
log, but the execution of all the
WRITE operations is deferred until the transaction
partially COMMITs
When the transaction partially COMMITs,
the information associated with it in the log is used
to execute the deferred WRITEs.
If a transaction aborts or the system crashes
before a transaction completes, the
information on the log is IGNORED
173. 175
Deferred Update Scheme
When the transaction partially COMMITs,
its associated records from the log are
used to execute the deferred WRITE.
All the LOG records must be on a stable
storage. Once they have been written,
the actual updating starts and the
transaction enters the COMMITTED state.
174. 176
Deferred Update Scheme
If the system crashes before the <Ti COMMIT>
entry is made in the log, the entries of Ti are
IGNORED, so there is no need to UNDO
If the system crashes after the <Ti COMMIT> entry
is made in the log, then when the system is
brought up again, the recovery system
consults the log only for those
transactions for which both <Ti START>
and <Ti COMMIT> exist.
Old values are not stored in the log
record.
175. 177
Deferred Update Scheme
SQL STATEMENTS   RELATED LOG ENTRIES   DATABASE VALUES
READ(A)          <To START>            A=1000
A := A - 50
WRITE(A)         <To, A, 950>
READ(B)                                B=2000
B := B + 50
WRITE(B)         <To, B, 2050>
COMMIT           <To COMMIT>           A=950, B=2050
176. 178
Deferred Update Scheme
<To START>
<To, A, 950>
<To, B, 2050>
C R A S H
On restart, <To COMMIT> is not
found; the log entries of To are
ignored, so no REDO(To) is
needed
177. 179
Deferred Update Scheme
<To START>
<To, A, 950>
<To, B, 2050>
<To COMMIT>
<T1 START>
<T1, C, 3000> C R A S H
On restart, <To COMMIT> is found,
and <T1 COMMIT> is not found,
therefore REDO(To) is performed
and the entries of T1 are ignored
178. 180
Deferred Update Scheme
<To START>
<To, A, 950>
<To, B, 2050>
<To COMMIT>
<T1 START>
<T1, C, 3000>
<T1 COMMIT> C R A S H
As the crash occurs after <To
COMMIT> and <T1 COMMIT>, both
REDO(To) and REDO(T1) are
performed
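Recovery under the deferred update scheme needs only REDO, which can be sketched as follows. The log records carry the new value only, as the scheme specifies; the tuple encoding, the function name `recover_deferred` and the database state at the crash are assumptions made for illustration.

```python
def recover_deferred(log, db):
    """REDO committed transactions; ignore everything else."""
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}
    for rec in log:
        if rec[0] == "WRITE" and rec[1] in committed:
            _, _txn, item, new = rec
            db[item] = new
    return sorted(committed)


# Crash after <To COMMIT> but before <T1 COMMIT>.
log = [
    ("START", "To"),
    ("WRITE", "To", "A", 950),
    ("WRITE", "To", "B", 2050),
    ("COMMIT", "To"),
    ("START", "T1"),
    ("WRITE", "T1", "C", 3000),
]
# Hypothetical disk state: only one of To's deferred writes was applied,
# and T1 never reached the database because it had not committed.
db = {"A": 950, "B": 2000, "C": 2500}
redone = recover_deferred(log, db)
print(redone, db)
```

No UNDO pass is needed because an uncommitted transaction never writes to the database under this scheme.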
179. 181
Checkpoints
When a system failure occurs, it is necessary
to consult the log, to identify which
transactions need to be undone and which
transactions need to be redone. There are two
difficulties in this:
The searching process is time consuming
Most of the transactions that have to be redone
have already written their updates to the
database. Redoing these transactions
causes recovery to take longer
180. 182
Checkpoints
To reduce these overheads,
CHECKPOINTS are introduced.
The system regularly performs
checkpoints, which require the following
sequence :
Output onto stable storage all the log
records currently residing in the main
memory
Output to the disk all the modified
buffer blocks
Output onto stable storage a <checkpoint>
log record
181. 183
Checkpoints
The transactions are not allowed to
perform any updates while the
checkpoints are in progress.
The presence of a checkpoint allows the
system to streamline its recovery
process.
182. 184
Checkpoints
A transaction Ti has COMMITTED before
the checkpoint. <Ti COMMIT> appears
in the log before the checkpoint. Any
database modifications made by Ti
must have been written either prior to
or as a part of the checkpoint. Thus at
recovery time there is no need to REDO
the operation on Ti
The checkpoint helps us to define the
start of the recovery process.
183. 185
Checkpoints
At restart time, the system goes
through the following procedure :
Start with UNDO and REDO lists. Set the UNDO
list to the list of all
transactions given in the most recent
checkpoint record, and the REDO list to empty
Search forward from the checkpoint
onwards
If BEGIN Ti is found, add Ti to the UNDO list
If COMMIT Tj is found, move Tj from the
UNDO list to the REDO list
184. 186
Checkpoints
When the end of the log is reached,
the UNDO list identifies the transactions
of the type T3 and T5, and the REDO
list identifies transactions of the type T2
and T4
185. 187
Checkpoints
The system now works backwards
through the log, undoing the
transactions in the UNDO list, and then
works forwards, redoing the
transactions in the REDO list (Forward
Recovery)
The UNDO operation need not be
applied when the Deferred Update
scheme is used
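The restart procedure for building the two lists can be sketched as follows. The log contents are an assumption chosen to match the transaction types named on the slides: T2 and T3 are active at the checkpoint, T4 and T5 begin after it, and only T2 and T4 commit before the failure.

```python
# Hypothetical log from the most recent checkpoint to the point of failure.
log_after_checkpoint = [
    ("CHECKPOINT", ["T2", "T3"]),  # transactions active at the checkpoint
    ("BEGIN", "T4"),
    ("COMMIT", "T2"),
    ("BEGIN", "T5"),
    ("COMMIT", "T4"),
]


def build_lists(log):
    """Return (undo_list, redo_list) by scanning forward from the checkpoint."""
    kind, active = log[0]
    assert kind == "CHECKPOINT"
    undo, redo = list(active), []
    for kind, txn in log[1:]:
        if kind == "BEGIN":
            undo.append(txn)
        elif kind == "COMMIT":
            undo.remove(txn)  # move from the UNDO list ...
            redo.append(txn)  # ... to the REDO list
    return undo, redo


undo_list, redo_list = build_lists(log_after_checkpoint)
print(undo_list, redo_list)
```

A transaction that committed before the checkpoint never appears in either list, which is why it needs no recovery work.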
187. 189
Checkpoints
What happens to each of the
transactions T1 to T5, given a checkpoint at
time Tc and a system failure at time Tf ?
T1 : No effect, transaction complete before Tc
T2 : REDO, as COMMIT before Tf
T3 : UNDO
T4 : REDO
T5 : UNDO