ESOFT Metro Campus - Diploma in Software Engineering - (Module IV) Database Concepts
Contents:
Introduction to Databases
Data
Information
Database
Database System
Database Applications
Evolution of Databases
Traditional Files Based Systems
Limitations in Traditional Files
The Database Approach
Advantages of Database Approach
Disadvantages of Database Approach
Database Management Systems
DBMS Functions
Database Architecture
ANSI-SPARC 3 Level Architecture
The Relational Data Model
What is a Relation?
Primary Key
Cardinality and Degree
Relationships
Foreign Key
Data Integrity
Data Dictionary
Database Design
Requirements Collection and analysis
Conceptual Design
Logical Design
Physical Design
Entity Relationship Model
A mini-world example
Entities
Relationships
ERD Notations
Cardinality
Optional Participation
Entities and Relationships
Attributes
Entity Relationship Diagram
Entities
ERD Showing Weak Entities
Super Type / Sub Type Relationships
Mapping ERD to Relational
Map Regular Entities
Map Weak Entities
Map Binary Relationships
Map Associated Entities
Map Unary Relationships
Map Ternary Relationships
Map Supertype/Subtype Relationships
Normalization
Advantages of Normalization
Disadvantages of Normalization
Normal Forms
Functional Dependency
Purchase Order Relation in 0NF
Purchase Order Relation in 1NF
Purchase Order Relations in 2NF
Purchase Order Relations in 3NF
Normalized Relations
BCNF – Boyce Codd Normal Form
Structured Query Language
What We Can Do with SQL ?
SQL Commands
SQL CREATE DATABASE
SQL CREATE TABLE
SQL DROP
SQL Constraints
SQL NOT NULL
SQL PRIMARY KEY
SQL CHECK
SQL FOREIGN KEY
SQL ALTER TABLE
SQL INSERT INTO
SQL INSERT INTO SELECT
SQL SELECT
SQL SELECT DISTINCT
SQL WHERE
SQL AND & OR
SQL ORDER BY
SQL UPDATE
SQL DELETE
SQL LIKE
SQL IN
SQL BETWEEN
SQL INNER JOIN
SQL LEFT JOIN
SQL RIGHT JOIN
SQL UNION
SQL AS
SQL Aggregate Functions
SQL Scalar functions
SQL GROUP BY
SQL HAVING
Database Administration
SQL Database Administration
1. Diploma in Software Engineering
Module IV: Database Concepts
Rasan Samarasinghe
ESOFT Computer Studies (pvt) Ltd.
No 68/1, Main Street, Pallegama, Embilipitiya.
5. Data
Data are numbers, characters, images or other
outputs from devices, in a form that is suitable
to move or process. Data can be thought of as
distinct pieces of information.
6. Information
Information is the result of processing,
manipulating and organizing data in a way that
adds to the knowledge of the person receiving it.
7. Database
A database is a collection of interrelated data
items that can be processed by one or more
application systems.
13. Limitations in Traditional Files
• Data Redundancy
• Inconsistent Data
• Inflexibility
• Limited Data Sharing
• Poor Data Control
• Security Problems
• Data Isolation
15. Advantages of Database Approach
• Minimal Data Redundancy
• Consistency of Data
• Flexibility
• Sharing of Data
• Data Control
• Proper Security
• Integration of Data
• Ease of Application Development
• Data Manipulation
16. Disadvantages of Database Approach
• Complexity
• Size
• Cost of DBMS
• Additional Hardware Cost
• Higher Impact of a Failure
• Cost of Conversion
17. Database Management Systems (DBMS)
A DBMS is software that enables users to define,
create, maintain and control access to a
database.
18. DBMS Functions
A. Data Definition
B. Data Entry
C. Data Manipulation
D. Data Display
E. Data Security
F. Data Integrity
G. Backup and Recovery
21. ANSI-SPARC 3 Level Architecture
• External Schema
– Defines the external view of data as seen by a particular user or program
• Conceptual Schema
– Defines the logical view of the data as seen by all users and programs
• Internal Schema
– Defines the physical view of data as seen by a DBMS
22. The Relational Data Model
Data elements are stored in different tables
made up of rows and columns. Data in different
tables is related through the use of common
data element(s).
23. What is a Relation?
Data is presented to the user as tables:
• Tables consist of rows and a fixed number of
named columns.
• Columns are attributes describing an entity. Each column
must have a unique name and a data type.
28. Primary Key
Each table has a primary key. The primary key is
a column or combination of columns that
uniquely identifies each row of the table.
A primary key made up of more than one column
is called a composite key.
29. Cardinality and Degree
The cardinality of a table refers to the number of
rows in the table.
The degree of a table refers to the number of
columns.
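As a small illustration of the two terms, the sketch below counts the rows and columns of a table. It uses Python's built-in sqlite3 module rather than MySQL, and the table contents are made up for the example.

```python
import sqlite3

# In-memory database with an illustrative (made-up) student table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblStudent (StudentID INTEGER PRIMARY KEY, FirstName TEXT, Address TEXT)"
)
conn.executemany(
    "INSERT INTO tblStudent VALUES (?, ?, ?)",
    [(1000, "Thilina", "Colombo"), (1001, "Roshan", "Matara")],
)

# Cardinality: the number of rows in the table.
cardinality = conn.execute("SELECT COUNT(*) FROM tblStudent").fetchone()[0]

# Degree: the number of columns, read from the cursor's column metadata.
cursor = conn.execute("SELECT * FROM tblStudent")
degree = len(cursor.description)

print("cardinality:", cardinality, "degree:", degree)
```

Adding a row changes the cardinality; adding a column changes the degree.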
33. Foreign Key
A foreign key is a column or set of columns in
one table that refers to the primary key of
another table.
34. Data Integrity
Data Integrity refers to the validity of data.
Problems that may be encountered:
• Two employees with the same NID or EmpNo?
• An employee who is 10 or 70 years old?
• An employee who does not work for you?
Solutions?
• Entity Integrity
• Domain Integrity
• Referential Integrity
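One way to see the three kinds of integrity at work is to declare them as constraints and try to break them. The sketch below uses Python's sqlite3 module; the Employee and Department tables and the 18 to 60 age range are assumptions made for the example, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.execute("CREATE TABLE Department (DeptNo INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE Employee (
        EmpNo  INTEGER PRIMARY KEY,                      -- entity integrity
        Age    INTEGER CHECK (Age BETWEEN 18 AND 60),    -- domain integrity (assumed range)
        DeptNo INTEGER REFERENCES Department(DeptNo)     -- referential integrity
    )
""")
conn.execute("INSERT INTO Department VALUES (1)")
conn.execute("INSERT INTO Employee VALUES (100, 30, 1)")


def rejected(sql):
    """Return True if the DBMS refuses the statement with an integrity error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


results = {
    "entity": rejected("INSERT INTO Employee VALUES (100, 40, 1)"),       # duplicate EmpNo
    "domain": rejected("INSERT INTO Employee VALUES (101, 10, 1)"),       # 10-year-old employee
    "referential": rejected("INSERT INTO Employee VALUES (102, 30, 99)"), # unknown department
}
print(results)
```

Each of the three problem cases from the slide is refused by the DBMS instead of being stored.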
35. Data Dictionary
A Data Dictionary is a file or a set of files that
contains a database's metadata, for example:
• Names of all tables and their owners.
• Names of all indexes and the tables to which
those indexes relate.
• Constraints defined on tables.
36. Database Design
The database design process can be broken down
into four phases.
Requirements Collection and Analysis
Conceptual Design
Logical Design
Physical Design
37. Requirements Collection and analysis
Prospective database users are interviewed to
understand and document their data requirements.
39. Logical Design
This is the process of mapping the database
structure developed in the previous phase to a
particular database model.
E.g. mapping an E-R model to the relational model.
40. Physical Design
This is the process of defining the structures
that enable the database to be queried in an
efficient manner.
41. Entity Relationship Model
• An Entity Relationship Model is a data model for describing
the data within databases or information systems.
• It’s a graphical representation of entities and their
relationships to each other.
42. A mini-world example
• A company is organized into departments.
• Each department has a number and an
employee who manages the department.
• We keep track of the start date when that
employee started managing the
department.
• A department may have several locations.
• A department controls a number of
projects, each of which has a name, a
number and a single location.
43. A mini-world example cont’d
• We store each employee’s name, national ID
number, address, salary, birth date and sex.
• An employee is assigned to one
department, but may work on several
projects.
• We keep track of the number of hours per
week that an employee works on each
project.
• We also keep track of the direct supervisor
of each employee.
44. A mini-world example cont’d
• We keep track of the dependants of each
employee for insurance purposes.
• We keep each dependant’s name, sex,
birth date and relationship to the
employee.
Such information is gathered from the
mini-world to perform Phase 1 of the
database design process.
79. Normalization
• In relational database design, the process of
organizing data to minimize redundancy.
• Normalization usually involves dividing a
database into two or more tables and defining
relationships between the tables.
80. Advantages of Normalization
Reduction of data redundancy within tables:
Reduce data storage space.
Reduce inconsistency of data.
Remove insert, update and delete anomalies.
Improve flexibility of the system.
81. Disadvantages of Normalization
Reduction in efficiency of certain data retrievals,
as relations may need to be joined during retrieval.
• More joins
• Increased use of indexes, which need storage (keys)
• Increased complexity of the system
82. Normal Forms
1NF: any multi-valued attributes have been removed
2NF: any partial functional dependencies have been removed
3NF: any transitive dependencies have been removed
BCNF: any remaining anomalies that result from functional dependencies have been removed
83. Functional Dependency
Functional Dependency is a constraint between two
attributes or two sets of attributes
The functional dependency of B on A is represented
by an arrow: A → B
e.g.
NID → Name, Address, Birth date
VID → Model, Color
ISBN → Title, Author, Publisher
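A functional dependency A → B means that rows agreeing on A must also agree on B. The sketch below checks this on sample rows in Python; the function name `holds` and the vehicle data are made up for the example. A check like this can only show that a dependency is not violated by the rows at hand; whether it really holds is a business rule.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for this lhs value; any later
        # row with the same lhs must carry the same rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Illustrative (made-up) vehicle rows.
vehicles = [
    {"VID": "V1", "Model": "Corolla", "Color": "Red"},
    {"VID": "V2", "Model": "Corolla", "Color": "Blue"},
    {"VID": "V3", "Model": "Civic", "Color": "Red"},
]

vid_determines_rest = holds(vehicles, ["VID"], ["Model", "Color"])
model_determines_color = holds(vehicles, ["Model"], ["Color"])
print(vid_determines_rest, model_determines_color)
```

Here VID → Model, Color is consistent with the data, while Model → Color is not, because two Corollas have different colours.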
85. First Normal Form - 1NF
• No multi-valued columns exist.
• All the key attributes are defined.
• All non-key attributes are dependent on the
primary key.
88. 1NF - Actions Required
1. Examine the relation for repeating groups of data
2. Remove the repeating groups from the relation
3. Create new relation(s) to hold the repeated
data
4. Include the key of the 0NF relation in the new relation(s)
5. Determine the key of the new relation(s)
93. Problems - 1NF
INSERT PROBLEM
Cannot record an available part until an order is placed
for it (e.g. that P4 is a bush)
DELETE PROBLEM
Lose the information about part P7 if we cancel purchase order
115 (e.g. by deleting the PO-PART row for Part No P7)
UPDATE PROBLEM
To change the description of Part P3 we need to change every
record in PO-PART containing Part No P3
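The update problem can be made concrete by counting how many rows one change touches. The sketch below uses Python's sqlite3 module; the PO_PART rows, part descriptions and the separate PART table are made up for the example and are not the tables from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 1NF relation: the part description is repeated on every order line (made-up data).
conn.execute("""
    CREATE TABLE PO_PART (
        PONo INTEGER, PartNo TEXT, PartDescription TEXT, Quantity INTEGER,
        PRIMARY KEY (PONo, PartNo)
    )
""")
conn.executemany(
    "INSERT INTO PO_PART VALUES (?, ?, ?, ?)",
    [(115, "P7", "Washer", 10), (116, "P3", "Bolt", 5), (117, "P3", "Bolt", 8)],
)

# Changing one description means touching every order line for that part.
cur = conn.execute("UPDATE PO_PART SET PartDescription = 'Hex bolt' WHERE PartNo = 'P3'")
rows_touched_1nf = cur.rowcount

# With the description moved to its own relation, one row is enough.
conn.execute("CREATE TABLE PART AS SELECT DISTINCT PartNo, PartDescription FROM PO_PART")
cur = conn.execute("UPDATE PART SET PartDescription = 'Bolt' WHERE PartNo = 'P3'")
rows_touched_split = cur.rowcount

print(rows_touched_1nf, rows_touched_split)
```

The same split also removes the insert and delete problems, because a part can exist in PART without any order line.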
94. Second Normal Form - 2NF
• Relations should not contain any partial
functional dependencies.
• I.e. no attribute is dependent on only a
part of the primary key.
95. PO-PART Relation (Parts Ordered) in 1NF
Part Description depends only on Part No,
which is just part of the key of PO-PART.
97. 2NF - Actions Required
If the relation has a concatenated (composite) key:
1. Check each attribute against the whole key
2. Move any attribute that depends on only part of
the key, together with that part of the key, to a new
relation
3. Optimize relations - consider combining
tables that have identical primary keys
102. Problems - 2NF
INSERT PROBLEM
Cannot record an available supplier until an order is placed
with it (e.g. that 200 is a hardware store)
DELETE PROBLEM
Lose the information about supplier 100 if we cancel purchase
order 116 (e.g. by deleting the PO row for Supplier No 100)
UPDATE PROBLEM
To change the name of Supplier 222 we need to change every
record in PO containing Supplier No 222
103. Third Normal Form - 3NF
• No transitive dependencies exist.
• A transitive dependency is a functional
dependency between two or more non-key
attributes.
104. PO Relation in 2NF
Supplier name is a non-key field that depends on
another non-key field (supplier no), in addition
to depending on the key, purchase order no.
106. 3NF - Actions Required
1. Check each non-key attribute for dependency
on other non-key fields
2. Remove any attribute that depends on another
non-key attribute from the relation
3. Create a new relation comprising the attribute
and the non-key attribute on which it depends
4. Determine the key of the new relation
112. BCNF – Boyce Codd Normal Form
• Boyce-Codd Normal Form is a stricter version of the
Third Normal Form.
• In BCNF, every determinant in a table is a candidate
key.
The following example shows a table that is in 3NF but not in BCNF.
113. 3NF without BCNF
STU_ID STAFF_ID CLASS_CODE ENROLL_GRADE
125 25 21344 A
125 20 32456 C
135 20 28458 B
135 25 27563 C
144 20 32456 B
• Each Class_Code identifies a class uniquely.
• A student can take many classes.
• A staff member can teach many classes, but each
class is taught by only one staff member.
114. 3NF without BCNF
PROBLEMS
• If a different staff member is assigned to teach class 32456,
two rows must be updated.
• Also, if student 135 drops out, we lose the data on who teaches
classes 28458 and 27563.
115. BCNF – Boyce Codd Normal Form
[Figure: dependency diagrams over STU_ID, STAFF_ID, CLASS_CODE and ENROLL_ID. Panel 1: 3NF but not in BCNF. Panel 2: 3NF and BCNF.]
116. BCNF – Boyce Codd Normal Form
STU_ID CLASS_CODE ENROLL_GRADE
125 21344 A
125 32456 C
135 28458 B
135 27563 C
144 32456 B
CLASS_CODE STAFF_ID
21344 25
32456 20
28458 20
27563 25
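The decomposition above can be checked mechanically: joining the two smaller tables on CLASS_CODE should give back exactly the original rows. The sketch below does this with Python's sqlite3 module, using the data from the slides; the table names Enrollment, Grade and Teaching are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Enrollment (STU_ID, STAFF_ID, CLASS_CODE, ENROLL_GRADE)")
rows = [
    (125, 25, 21344, "A"),
    (125, 20, 32456, "C"),
    (135, 20, 28458, "B"),
    (135, 25, 27563, "C"),
    (144, 20, 32456, "B"),
]
conn.executemany("INSERT INTO Enrollment VALUES (?, ?, ?, ?)", rows)

# Decompose: grades per student and class, and one teacher per class.
conn.execute("CREATE TABLE Grade AS SELECT STU_ID, CLASS_CODE, ENROLL_GRADE FROM Enrollment")
conn.execute("CREATE TABLE Teaching AS SELECT DISTINCT CLASS_CODE, STAFF_ID FROM Enrollment")

# Joining the two parts back on CLASS_CODE should reproduce the original rows.
rejoined = conn.execute("""
    SELECT g.STU_ID, t.STAFF_ID, g.CLASS_CODE, g.ENROLL_GRADE
    FROM Grade AS g INNER JOIN Teaching AS t ON g.CLASS_CODE = t.CLASS_CODE
""").fetchall()
lossless = sorted(rejoined) == sorted(rows)

# Each class now has exactly one row naming its teacher.
teaching_rows = conn.execute("SELECT COUNT(*) FROM Teaching").fetchone()[0]
print(lossless, teaching_rows)
```

After the split, reassigning class 32456 to another staff member is a single-row update in Teaching, and a student dropping out no longer removes the record of who teaches a class.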
117. Structured Query Language (SQL)
• SQL is used for storing, manipulating and
retrieving data stored in relational databases.
• Relational DBMSs such as MySQL, MS Access,
Oracle, Sybase, Informix, PostgreSQL and SQL
Server use SQL as their standard database
language.
118. What We Can Do with SQL ?
Access data in a relational DBMS.
Define the data in a database.
Manipulate the data in a database.
Create and drop databases and tables.
Create views and stored procedures in a database.
Set permissions on tables, procedures, and
views.
128. SQL FOREIGN KEY
CREATE TABLE tblPayment
(
PaymentID int NOT NULL,
Amount decimal(10,2) NOT NULL, -- numeric type, so that SUM(Amount) adds money values
PayedDate datetime,
StudentID int,
PRIMARY KEY (PaymentID),
FOREIGN KEY (StudentID) REFERENCES tblStudent(StudentID)
);
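To see what the foreign key buys, the sketch below tries two things it should prevent: deleting a student who still has payments, and recording a payment for a student who does not exist. It runs on SQLite through Python's sqlite3 module, not MySQL; the tables are cut-down versions modelled on the slides and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce foreign keys

conn.execute("""
    CREATE TABLE tblStudent (
        StudentID int NOT NULL,
        FirstName varchar(255),
        PRIMARY KEY (StudentID)
    )
""")
conn.execute("""
    CREATE TABLE tblPayment (
        PaymentID int NOT NULL,
        Amount decimal(10,2) NOT NULL,
        PayedDate datetime,
        StudentID int,
        PRIMARY KEY (PaymentID),
        FOREIGN KEY (StudentID) REFERENCES tblStudent(StudentID)
    )
""")
conn.execute("INSERT INTO tblStudent VALUES (1000, 'Thilina')")
conn.execute("INSERT INTO tblPayment VALUES (1, 500, '2015-01-10', 1000)")

# Deleting a student that a payment still refers to is refused.
try:
    conn.execute("DELETE FROM tblStudent WHERE StudentID = 1000")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

# So is a payment that points at a student who does not exist.
try:
    conn.execute("INSERT INTO tblPayment VALUES (2, 100, '2015-01-11', 9999)")
    orphan_blocked = False
except sqlite3.IntegrityError:
    orphan_blocked = True

print(delete_blocked, orphan_blocked)
```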
129. SQL ALTER TABLE
ALTER TABLE tblStudent ADD DateOfBirth date;
ALTER TABLE tblStudent MODIFY COLUMN DateOfBirth year;
ALTER TABLE tblStudent DROP COLUMN DateOfBirth;
130. SQL INSERT INTO
INSERT INTO tblStudent (StudentID, FirstName,
LastName, Address, Phone) VALUES (1000,
'Thilina', 'Perera', 'Colombo, Sri Lanka',
'0777475323');
131. SQL INSERT INTO SELECT
INSERT INTO tblStudent (StudentID, FirstName,
LastName, Address, Phone)
SELECT EmpID, FirstName, LastName, Address,
Phone FROM tblEmployer;
132. SQL SELECT
SELECT * FROM tblStudent;
SELECT StudentId, FirstName FROM tblStudent;
135. SQL AND & OR
SELECT * FROM tblStudent WHERE
Address='Matara' OR Address='Colombo';
SELECT * FROM tblStudent WHERE
FirstName='Roshan' AND Address='Colombo';
136. SQL ORDER BY
SELECT * FROM tblStudent ORDER BY FirstName
DESC;
SELECT * FROM tblStudent ORDER BY FirstName
ASC;
142. SQL INNER JOIN
SELECT tblStudent.FirstName,
tblPayment.PaymentID FROM tblStudent INNER
JOIN tblPayment ON
tblStudent.StudentID=tblPayment.StudentID;
143. SQL LEFT JOIN
SELECT tblStudent.FirstName,
tblPayment.PaymentID FROM tblStudent LEFT
JOIN tblPayment ON
tblStudent.StudentID=tblPayment.StudentID;
144. SQL RIGHT JOIN
SELECT tblPayment.PaymentID,
tblStudent.FirstName FROM tblStudent RIGHT
JOIN tblPayment ON
tblStudent.StudentID=tblPayment.StudentID;
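The difference between the three joins shows up only when some rows have no partner. The sketch below uses Python's sqlite3 module with made-up rows: one student without a payment and one payment without a student. Because older SQLite versions lack RIGHT JOIN, the right join is written as a LEFT JOIN with the tables swapped, which returns the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblStudent (StudentID INTEGER PRIMARY KEY, FirstName TEXT)")
conn.execute("CREATE TABLE tblPayment (PaymentID INTEGER PRIMARY KEY, StudentID INTEGER)")
conn.executemany("INSERT INTO tblStudent VALUES (?, ?)", [(1000, "Thilina"), (1001, "Roshan")])
conn.executemany("INSERT INTO tblPayment VALUES (?, ?)", [(1, 1000), (2, None)])

# INNER JOIN: only rows that have a match on both sides.
inner_rows = conn.execute("""
    SELECT tblStudent.FirstName, tblPayment.PaymentID
    FROM tblStudent INNER JOIN tblPayment
    ON tblStudent.StudentID = tblPayment.StudentID
""").fetchall()

# LEFT JOIN: every student, with NULL (None) where there is no payment.
left_rows = conn.execute("""
    SELECT tblStudent.FirstName, tblPayment.PaymentID
    FROM tblStudent LEFT JOIN tblPayment
    ON tblStudent.StudentID = tblPayment.StudentID
    ORDER BY tblStudent.StudentID
""").fetchall()

# RIGHT JOIN equivalent: every payment, with NULL where there is no student.
right_rows = conn.execute("""
    SELECT tblPayment.PaymentID, tblStudent.FirstName
    FROM tblPayment LEFT JOIN tblStudent
    ON tblStudent.StudentID = tblPayment.StudentID
    ORDER BY tblPayment.PaymentID
""").fetchall()

print(inner_rows, left_rows, right_rows)
```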
145. SQL UNION
SELECT FirstName FROM tblStudent UNION
SELECT FirstName FROM tblStudent2;
SELECT FirstName FROM tblStudent UNION ALL
SELECT FirstName FROM tblStudent2;
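UNION removes duplicate rows from the combined result, while UNION ALL keeps them. A minimal sketch with Python's sqlite3 module and made-up names, where "Roshan" appears in both tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblStudent (FirstName TEXT)")
conn.execute("CREATE TABLE tblStudent2 (FirstName TEXT)")
conn.executemany("INSERT INTO tblStudent VALUES (?)", [("Thilina",), ("Roshan",)])
conn.executemany("INSERT INTO tblStudent2 VALUES (?)", [("Roshan",), ("Kamal",)])

# UNION: each distinct name once.
union_names = sorted(
    row[0]
    for row in conn.execute(
        "SELECT FirstName FROM tblStudent UNION SELECT FirstName FROM tblStudent2"
    )
)

# UNION ALL: all rows from both queries, duplicates included.
union_all_names = sorted(
    row[0]
    for row in conn.execute(
        "SELECT FirstName FROM tblStudent UNION ALL SELECT FirstName FROM tblStudent2"
    )
)

print(union_names)
print(union_all_names)
```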
146. SQL AS (Aliases)
SELECT StudentID AS 'Student ID' FROM
tblStudent;
SELECT S.FirstName, P.PaymentID FROM
tblStudent AS S INNER JOIN tblPayment AS P ON
S.StudentID=P.StudentID;
147. SQL Aggregate Functions
SQL aggregate functions return a single value,
calculated from values in a column.
AVG() - Returns the average value
COUNT() - Returns the number of rows
FIRST() - Returns the first value (MS Access; not available in MySQL)
LAST() - Returns the last value (MS Access; not available in MySQL)
MAX() - Returns the largest value
MIN() - Returns the smallest value
SUM() - Returns the sum
148. SQL Scalar functions
SQL scalar functions return a single value, based
on the input value.
UCASE() - Converts a field to upper case
LCASE() - Converts a field to lower case
INITCAP() - Converts the first letter of a field to
upper case.
149. SQL GROUP BY
SELECT StudentID, SUM(Amount) FROM
tblPayment GROUP BY StudentID;
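The sketch below runs the GROUP BY query on made-up payments and then adds a HAVING clause to keep only the groups whose total passes a threshold. It uses Python's sqlite3 module; the amounts and the threshold of 200 are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblPayment (PaymentID INTEGER PRIMARY KEY, Amount INTEGER, StudentID INTEGER)"
)
conn.executemany(
    "INSERT INTO tblPayment VALUES (?, ?, ?)",
    [(1, 500, 1000), (2, 250, 1000), (3, 100, 1001)],
)

# One result row per student, with that student's payments added up.
totals = conn.execute("""
    SELECT StudentID, SUM(Amount)
    FROM tblPayment
    GROUP BY StudentID
    ORDER BY StudentID
""").fetchall()

# HAVING filters the groups after aggregation (WHERE cannot refer to SUM).
large_totals = conn.execute("""
    SELECT StudentID, SUM(Amount)
    FROM tblPayment
    GROUP BY StudentID
    HAVING SUM(Amount) > 200
    ORDER BY StudentID
""").fetchall()

print(totals)
print(large_totals)
```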
151. Database Administration
Installing and Upgrading
Database Security
Enrolling Users
Monitoring Activities
Optimizing the Performance
Producing Reports
Backup and Recovery
152. SQL Database Administration
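-- MySQL 5.x syntax. MySQL 8.0 no longer accepts IDENTIFIED BY in GRANT;
-- there the user must be created first with CREATE USER.
GRANT SELECT, INSERT ON dbstudent.* TO 'silva'@'localhost' IDENTIFIED BY 'silva123';
REVOKE INSERT ON dbstudent.* FROM 'silva'@'localhost';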
Record/Tuple – A row in a Relation
Field /Attribute – A column in a Relation
Domain – Set of values of an Attribute
Degree – The number of Fields in a Relation
Cardinality – the number of Records in a Relation
Null – the value not given or unknown for a field.
Can record data about a Department
even if there are no Employees
assigned to it
An entity instance can exist on its own,
i.e. independently of other instances
Department data are not repeated for
all their employees
Avoids inconsistency problems,
e.g. on a change of manager
Entity Integrity: There are no duplicate rows in a table.
Domain Integrity: Enforces valid entries for a given column by restricting the type, the format, or the range of values.
Referential Integrity: Rows that are referenced by other records cannot be deleted.
User-Defined Integrity: Enforces specific business rules that do not fall into entity, domain, or referential integrity.
Refers to the validity of data. Data integrity can be compromised in a number of ways:
Human errors when data is entered
Errors that occur when data is transmitted from one computer to another
Software bugs or viruses
Hardware malfunctions, such as disk crashes
Natural disasters, such as fires and floods
For instance, suppose that in a commercial bank's database, the administrator wants to determine which table holds information about loans. Making an educated guess that the table most likely has the word "LOAN" in it, he would issue the following query on the data dictionary (the first query is for an Oracle DB, while the second is for an SQL Server DB):
SELECT * FROM DBA_TABLES WHERE TABLE_NAME LIKE '%LOAN%';
SELECT * FROM SYSOBJECTS WHERE TYPE='U' AND NAME LIKE '%LOAN%';
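The same kind of lookup can be tried without an Oracle or SQL Server installation. The sketch below uses SQLite through Python's sqlite3 module, where the catalogue of tables is exposed as the sqlite_master table; the table names CUSTOMER_LOAN and ACCOUNT are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two illustrative (made-up) tables in a bank database.
conn.execute("CREATE TABLE CUSTOMER_LOAN (LoanNo INTEGER PRIMARY KEY, Amount INTEGER)")
conn.execute("CREATE TABLE ACCOUNT (AccountNo INTEGER PRIMARY KEY, Balance INTEGER)")

# sqlite_master is SQLite's built-in catalogue of tables, indexes and views.
loan_tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name LIKE '%LOAN%'"
    )
]
print(loan_tables)
```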
Normalisation is a set of data design standards.
It is a process of decomposing unsatisfactory relations into smaller relations.
Like entity-relationship modelling, it was developed as part of database theory.
Boyce and Codd Normal Form is a higher version of the Third Normal form.
The ORDER BY keyword sorts the records in ascending order by default.
The IN operator allows you to specify multiple values in a WHERE clause.
The INNER JOIN keyword selects the rows from both tables for which there is a match between the joined columns.
The SQL UNION operator combines the results of two or more SELECT statements.
(All the statements must return the same number of columns, with compatible data types.)
INITCAP() does not work in MySQL.
The GROUP BY statement is used in conjunction with the aggregate functions to group the result-set by one or more columns.
The HAVING clause is used together with GROUP BY. It was added to SQL because the WHERE keyword could not be used with aggregate functions.
Database administration refers to the set of activities performed to ensure that a database is always available as needed.
Go to:
C:\Program Files\MySQL\MySQL Server 5.6\bin
Enter:
mysql -u root -p