DATABASE MANAGEMENT SYSTEM
(FOR BOTH CSE-4TH
Database Management System (DBMS)
DBMS contains information about a particular enterprise
• Collection of interrelated data
• Set of programs to access the data
• An environment that is both convenient and efficient to use
Database applications:
• Banking: all transactions
• Airlines: reservations, schedules
• Universities: registration, grades
• Sales: customers, products, purchases
• Online retailers: order tracking, customized recommendations
• Manufacturing: production, inventory, orders, supply chain
• Human resources: employee records, salaries, tax deductions
Purpose of Database Systems
In the early days, database applications were built directly on top of file systems.
Drawbacks of using file systems to store data:
• Data redundancy and inconsistency
• Multiple file formats, duplication of information in different files
• Difficulty in accessing data
• Need to write a new program to carry out each new task
• Data isolation — multiple files and formats
• Integrity problems
• Integrity constraints (e.g. account balance > 0) become “buried” in
program code rather than being stated explicitly
• Hard to add new constraints or change existing ones
Databases touch all aspects of our lives.
Different abstract levels
- a widely accepted general architecture for a database
- database described by three abstract levels
- internal schema (physical database)
- conceptual schema (conceptual database)
- external schema (view)
Objectives
- insulation of application programs and data
- support of multiple user views
- use of schema to store the DB description (meta-data)
The Three Schema Architecture:
External schema
- describes the subset of the database that a particular user group is interested
in, in the format that user group wants, and hides the rest
- may contain virtual data that is derived from the files, but is not explicitly
stored
Conceptual schema
- hides the details of physical storage structures and concentrates on
describing entities, data types, relationships, operations, and constraints
Internal schema
- describes the physical storage structure of the DB
- uses a low-level (physical) data model to describe the complete details of
data storage and access paths
EXTERNAL LEVEL (highest level)
- The user’s view of the database.
- Consists of a number of different external views of the DB.
- Describes part of the DB for a particular group of users.
- Provides a powerful and flexible security mechanism by hiding parts of the
DB from certain users. The user is not aware of the existence of any attributes
that are missing from the view.
- It permits users to access data in a way that is customized to their needs, so
that the same data can be seen by different users in different ways at the
same time.
CONCEPTUAL LEVEL
- The logical structure of the entire database as seen by the DBA.
- What data is stored in the database.
- The relationships among the data.
- Complete view of the data requirements of the organization, independent of
any storage consideration.
- entities, attributes, relations
- constraints on data
- semantic information on data
- security, integrity information
Supports each external view: any data available to a user must be contained
in or derivable from the conceptual level.
INTERNAL LEVEL (lowest level)
- Physical representation of the DB on the computer.
- How the data is stored in the database.
- Physical implementation of the DB to achieve optimal run-time
performance and storage space utilization.
- Storage space allocation for data and indexes
- Record description for storage
- Record placement
- Data compression, encryption
Managed by the OS under the direction of the DBMS.
SCHEMAS, MAPPINGS, INSTANCES
DB schema: overall description of the DB.
Three different schemas according to the level of abstraction.
DBMS: responsible for the mapping between schemas and for the consistency of schemas.
Conceptual/Internal Mapping: used to find the actual record (or combination of
records) in physical storage that constitutes a logical record in the conceptual schema.
External/Conceptual Mapping: maps names in the user’s view
onto the relevant part of the conceptual schema.
Instances and Schemas
Similar to types and variables in programming languages
Schema – the logical structure of the database
- Example: the database consists of information about a set of customers and
accounts and the relationship between them
- Analogous to type information of a variable in a program
- Physical schema: database design at the physical level
- Logical schema: database design at the logical level
Instance – the actual content of the database at a particular point in time
- Analogous to the value of a variable
Database instance: the data in the DB at any particular point in time.
Data Independence
The ability to modify a schema definition at one level without affecting a
schema definition at a higher level is called data independence.
There are two kinds:
Logical data independence
- The ability to modify the conceptual schema without causing application
programs to be rewritten.
- Immunity of external schemas to changes in the conceptual schema.
- Usually needed when the logical structure of the database is altered.
Physical data independence
- The ability to modify the internal schema without having to change the
conceptual or external schemas.
- Modifications at this level are usually made to improve performance.
Three Schema Architecture
Data and meta-data:
- the three schemas are only meta-data (descriptions of data)
- data actually exists only at the physical level
Mapping
- the DBMS must transform a request specified on an external schema into a
request against the conceptual schema, and then into a request on the internal schema
- requires information in the meta-data on how to accomplish the mapping
among the various levels
- overhead (time-consuming), leading to inefficiencies
- few DBMSs have implemented the full three-schema architecture
Benefits of Three Schema Architecture:
Logical data independence
- the capacity to change the conceptual schema without having to change
external schemas or application programs
- example: given Employee (E#, Name, Address, Salary), a view that includes
only E# and Name is not affected by changes to any other attribute
Physical data independence
- the capacity to change the internal schema without having to change the
conceptual (or external) schema
- the internal schema may change to improve performance (e.g., creating an
additional access structure)
- easier to achieve than logical data independence, because application programs
depend on logical structures
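The Employee example can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS; the table, view, and column names (employee, employee_names, e_no for E#, phone) and the sample row are illustrative and do not come from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual schema: the Employee relation (E# is written as e_no here).
conn.execute(
    "CREATE TABLE employee ("
    "e_no INTEGER PRIMARY KEY, name TEXT, address TEXT, salary REAL)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Asha', '12 Park Road', 52000.0)")

# External schema: a view that exposes only E# and Name.
conn.execute("CREATE VIEW employee_names AS SELECT e_no, name FROM employee")
before = conn.execute("SELECT * FROM employee_names").fetchall()

# Change the conceptual schema by adding an attribute the view does not use.
conn.execute("ALTER TABLE employee ADD COLUMN phone TEXT")
after = conn.execute("SELECT * FROM employee_names").fetchall()

# The view, and any program written against it, sees the same rows as before.
print(before == after)
```

A program that reads only employee_names needs no change after the ALTER TABLE, which is the sense in which the external schema is insulated from the conceptual one.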
Data abstraction
- one fundamental characteristic of the database approach
- hides details of data storage that are not needed by most database users and
Data model
- a set of data structures and conceptual tools used to describe the structure of
a database (data types, relationships, and constraints)
- used in the definition of the conceptual, external, and internal schema
- must provide means for DB designers to represent the real-world
information completely and naturally.
High-level (conceptual) data models
- use concepts such as entities, attributes, relationships
- object-based models: ER model, OO model
Representational (implementation) data models
- most frequently used in commercial DBMSs
- record-based models: relational, hierarchical, network
Low-level (physical) data models
- to describe the details of how data is stored
- captures aspects of database system implementation: record structures
(fixed/variable length) and ordering, access paths (key indexing), etc.
Schemas and Instances
In any data model, it is important to distinguish between the description of
the database and the database itself.
Database schema (meta-data)
- overall description of a database, specified by a set of definitions
- specified during database design (does not change frequently)
- similar to the notion of type definition in programs
Database instance
- current contents of the database (actual data): DB state
- may change frequently
Distinction between database schema and database state
- a database just specified (or defined) is in the empty state
- the initial state is reached when the data is first loaded
- the DBMS is responsible for ensuring that every database state is valid
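The difference between schema and state can be observed directly. The sketch below assumes Python's built-in sqlite3 module; the account table, its columns, and the sample rows are illustrative. sqlite_master is SQLite's own catalog table, used here to read back the stored schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT PRIMARY KEY, balance REAL)")

def schema(conn):
    # The stored description (meta-data) of the account table.
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'account'"
    ).fetchone()[0]

def state(conn):
    # The current contents (database state) of the account table.
    return conn.execute(
        "SELECT account_number, balance FROM account ORDER BY account_number"
    ).fetchall()

schema_at_definition = schema(conn)
empty_state = state(conn)          # just defined: empty state

conn.execute("INSERT INTO account VALUES ('A-101', 500.0)")
conn.execute("INSERT INTO account VALUES ('A-102', 750.0)")
initial_state = state(conn)        # state after the first load

conn.execute("UPDATE account SET balance = balance - 100 WHERE account_number = 'A-101'")
later_state = state(conn)          # the state changes again

# The schema is the same throughout, although the state changed twice.
schema_unchanged = schema(conn) == schema_at_definition
```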
Data Definition and Manipulation Languages
Data definition language (DDL)
- not a procedural language
- notations for describing the types of entities and the relationships among
them
- DDL statements --> data dictionary (the definitions they produce are stored there)
Data manipulation language (DML)
- for accessing and modifying data
- non-procedural: specifying "what" to access
- procedural: specifying "what" and "how" to get
- non-procedural languages could be easy to use but may not be efficient
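A rough sketch of the DDL/DML split and of the two DML styles, assuming Python's built-in sqlite3 module. The student table and its rows are illustrative. The "procedural" half is only an approximation: it uses an ordinary Python loop to spell out how the records are scanned, since SQL itself is non-procedural.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: describes structure only; the definition is recorded in the catalog.
conn.execute("CREATE TABLE student (roll_number INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
catalog = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()

# DML: modifies the data.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Ramesh", "CSE"), (2, "Meena", "ECE"), (3, "Vikram", "CSE")],
)

# Non-procedural DML: state "what" is wanted; the DBMS decides how to get it.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM student WHERE dept = 'CSE' ORDER BY roll_number"
    )
]

# Procedural style: state "what" and "how" by visiting every record and filtering.
procedural = []
for roll_number, name, dept in conn.execute(
    "SELECT roll_number, name, dept FROM student ORDER BY roll_number"
):
    if dept == "CSE":
        procedural.append(name)
```

Both styles return the same names; the non-procedural form leaves the choice of access path to the DBMS.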
Database Administrator
Coordinates all the activities of the database system; the database
administrator has a good understanding of the enterprise’s information
resources and needs.
• Database administrator's duties include:
• Schema definition
• Storage structure and access method definition
• Schema and physical organization modification
• Granting user authority to access the database
• Specifying integrity constraints
• Acting as liaison with users
• Monitoring performance and responding to changes in requirements
Data models are a collection of conceptual tools for describing data, data
relationships, data semantics, and data constraints.
Components:
- structural part
- manipulative part
- integrity rules
There are three different groups:
- Object-based Data Models
- Record-based Data Models
- Physical Data Models
Object-based and record-based data models describe data at the conceptual and
external levels; physical data models describe data at the internal level.
Object-based Data Models
- Entity-relationship model.
- Object-oriented model.
- Semantic data model.
- Functional data model
Record-based Data Models
- Named so because the database is structured in fixed-format records of
several types.
- Each record type defines a fixed number of fields, or attributes.
- Each field is usually of a fixed length (this simplifies the implementation).
- The three most widely accepted models are the relational, network, and
hierarchical data models.
Physical Data Models
1. Are used to describe data at the lowest level.
2. Very few models, e.g.
• Unifying model.
• Frame memory.
ER model
- popular high-level conceptual model used in DB design
- proposed by P. Chen in 1976 (ACM TODS)
- perceives the real world as a collection of entities and
relationships among them
OO model
- DB is defined in terms of objects, their properties, and their operations
(methods)
Relational model
- represents a DB as a collection of tables
Network model
- represents a DB as record types and 1:N relationships
Hierarchical model
- represents data as hierarchical tree structures
An entity may be defined as a thing which is recognized as being capable of
an independent existence and which can be uniquely identified. An entity is
an abstraction from the complexities of some domain.
E-R Diagram Notation:
[Figure: symbols for entity, weak entity, key attribute, multivalued attribute, and relationship between two entities]
- Rectangles represent entity sets.
- Diamonds represent relationship sets.
- Lines link attributes to entity sets and entity sets to relationship sets.
- Ellipses represent attributes.
- Double ellipses represent multivalued attributes.
- Dashed ellipses denote derived attributes.
- Underline indicates primary key attributes (will study later).
ER Diagram With Composite, Multivalued, and Derived Attributes
Relationship Sets with Attributes:
In a many-to-one relationship, a loan is associated with several (including
0) customers via borrower, and a customer is associated with at most one
loan via borrower.
Entities: Entity - a thing (animate or inanimate) of independent physical or
conceptual existence that is distinguishable. In the University database context,
an individual student, a faculty member, a classroom, and a course are entities.
Entity Set or Entity Type - collection of entities all having the same
properties. Student entity set: collection of all student entities. Course entity
set: collection of all course entities.
Attributes - each entity is described by a set of attributes/properties.
Student entity:
- Stud Name: name of the student.
- Roll Number: the roll number of the student.
- Sex: the gender of the student, etc.
All entities in an entity set/type have the same set of attributes.
Types of Attributes
• Simple attributes - having atomic or indivisible values.
example: Dept (a string), Phone Number (an eight-digit number)
• Composite attributes - having several components in the value.
example: Qualification with components (Degree Name, Year, University)
• Derived attributes - the attribute value depends on some other attribute.
example: Age depends on Date Of Birth, so Age is a derived attribute.
• Single-valued - having only one value rather than a set of values.
for instance, Place Of Birth (a single string value).
• Multi-valued - having a set of values rather than a single value.
for instance, the Courses Enrolled, Email Address, and Previous Degree
attributes of a student.
• Attributes can be simple single-valued, simple multi-valued, composite
single-valued, or composite multi-valued.
Diagrammatic Notation for Entities
- entity: rectangle
- attribute: ellipse connected to rectangle
- multi-valued attribute: double ellipse
- composite attribute: ellipse connected to ellipse
- derived attribute: dashed ellipse
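The attribute kinds can also be pictured as fields of a record. The following is only a sketch using Python dataclasses; the class names, field names, and sample values are illustrative, and age_on is a hypothetical helper that computes the derived Age attribute for a given date instead of storing it.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

@dataclass
class Qualification:
    # Composite attribute: one value made of several components.
    degree_name: str
    year: int
    university: str

@dataclass
class Student:
    roll_number: int                      # simple, single-valued
    name: str                             # simple, single-valued
    date_of_birth: date                   # simple, single-valued
    email_addresses: List[str] = field(default_factory=list)           # simple, multi-valued
    qualifications: List[Qualification] = field(default_factory=list)  # composite, multi-valued

    def age_on(self, today: date) -> int:
        # Derived attribute: computed from date_of_birth, never stored.
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)

ramesh = Student(
    roll_number=1,
    name="Ramesh",
    date_of_birth=date(2004, 8, 15),
    email_addresses=["ramesh@example.org", "r.student@example.org"],
    qualifications=[Qualification("Diploma", 2022, "Example University")],
)
```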
Relationships
• When two or more entities are associated with each other, we have an
instance of a relationship.
• E.g.: student Ramesh enrolls in the Discrete Mathematics course.
• The relationship enrolls has Student and Course as the participating entity sets.
• Tuples in enrolls are relationship instances.
• enrolls is called a relationship type/set.
ER Diagram for a Banking Enterprise:
Mapping Cardinalities
• One-to-one: An E1 entity may be associated with at most one E2 entity, and
similarly an E2 entity may be associated with at most one E1 entity.
• One-to-many: An E1 entity may be associated with many E2 entities, whereas
an E2 entity may be associated with at most one E1 entity.
• Many-to-one: An E1 entity may be associated with at most one E2 entity, whereas
an E2 entity may be associated with many E1 entities.
• Many-to-many: Many E1 entities may be associated with a single E2 entity,
and a single E1 entity may be associated with many E2 entities.
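The four cases can be told apart mechanically for a given set of relationship instances. The sketch below is illustrative: mapping_cardinality is a hypothetical helper, and the entity labels are made up. It classifies only what a particular instance exhibits, whereas a mapping cardinality proper is a constraint declared on the schema.

```python
from collections import Counter

def mapping_cardinality(pairs):
    """Classify a relationship set given as (e1, e2) pairs."""
    pairs = set(pairs)
    # How many E2 entities each E1 entity is linked to, and vice versa.
    e1_counts = Counter(e1 for e1, _ in pairs)
    e2_counts = Counter(e2 for _, e2 in pairs)
    e1_has_many = any(n > 1 for n in e1_counts.values())
    e2_has_many = any(n > 1 for n in e2_counts.values())
    if e1_has_many and e2_has_many:
        return "many-to-many"
    if e1_has_many:
        return "one-to-many"   # some E1 entity has many E2 entities
    if e2_has_many:
        return "many-to-one"   # some E2 entity has many E1 entities
    return "one-to-one"

# E1 = customer, E2 = loan: one customer holding two loans.
print(mapping_cardinality([("c1", "l1"), ("c1", "l2")]))
```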
Primary Key: The primary key of a relational table uniquely identifies each
record in the table. It can either be a normal attribute that is guaranteed to be
unique (such as Social Security Number in a table with no more than one
record per person) or it can be generated by the DBMS.
Examples: Imagine we have a STUDENTS table that contains a record for
each student at a university. The student's unique student ID number would
be a good choice for a primary key in the STUDENTS table. The student's
first and last name would not be a good choice, as there is always the chance
that more than one student might have the same name.
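The STUDENTS example can be checked with a short sketch, assuming Python's built-in sqlite3 module; the column names and sample students are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students ("
    "student_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)

conn.execute("INSERT INTO students VALUES (1001, 'Anil', 'Kumar')")
# Two students may share a name, as long as their IDs differ.
conn.execute("INSERT INTO students VALUES (1002, 'Anil', 'Kumar')")

# Reusing an existing student_id violates the primary key.
try:
    conn.execute("INSERT INTO students VALUES (1001, 'Sunita', 'Rao')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
```

Had first and last name been chosen as the key instead, the second insert would have been the one rejected.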
Foreign Key: A foreign key is a field in a relational table that matches the
primary key column of another table. The foreign key can be used to
cross-reference tables.
Referential integrity is a database concept that ensures that relationships
between tables remain consistent. When one table has a foreign key to
another table, the concept of referential integrity states that you may not add a
record to the table that contains the foreign key unless there is a
corresponding record in the linked table. It also includes the techniques
known as cascading update and cascading delete, which ensure that changes
made to the linked table are reflected in the primary table.
For example, suppose the Employees table has a foreign key attribute, Managed By,
that refers to a record in the Managers table. Referential integrity then
enforces the following three rules:
1. We may not add a record to the Employees table unless the Managed By
attribute points to a valid record in the Managers table.
2. If the primary key for a record in the Managers table changes, all
corresponding records in the Employees table must be modified using a
cascading update.
3. If a record in the Managers table is deleted, all corresponding records in
the Employees table must be deleted using a cascading delete.
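The three rules can be seen in action with a sketch that assumes Python's built-in sqlite3 module. The table and column names (managers, employees, managed_by) and the sample rows are illustrative. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON; other systems typically enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE managers (manager_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name TEXT,
        managed_by INTEGER REFERENCES managers(manager_id)
            ON UPDATE CASCADE ON DELETE CASCADE
    )
    """
)
conn.execute("INSERT INTO managers VALUES (10, 'Priya')")
conn.execute("INSERT INTO employees VALUES (1, 'Arun', 10)")

# Rule 1: an employee may not point to a manager that does not exist.
try:
    conn.execute("INSERT INTO employees VALUES (2, 'Kiran', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Rule 2: changing the manager's key updates the referencing rows (cascading update).
conn.execute("UPDATE managers SET manager_id = 20 WHERE manager_id = 10")
managed_by_after_update = conn.execute(
    "SELECT managed_by FROM employees WHERE employee_id = 1"
).fetchone()[0]

# Rule 3: deleting the manager deletes the referencing rows (cascading delete).
conn.execute("DELETE FROM managers WHERE manager_id = 20")
employees_left = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
```

Cascading is one possible policy. A schema could equally declare that such updates and deletes are refused while referencing rows exist.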
Importance of Referential Integrity
By providing specification of columns within a referencing table that are
foreign keys for columns in some other referenced table, referential integrity
is a reliable mechanism which prevents accidental database corruptions when
doing inserts, updates, and deletes. It states that a row cannot exist in a table
with a non-null value for a referencing column if an equal value does not
exist in a referenced column.
For example, once employee_number is defined as a foreign key in the employee_phone
relation, the system rejects any attempt to insert a row whose employee_number value
does not exist in the employee table.
The following summarizes the benefits of referential integrity:
a) Ensures data integrity and consistency based on primary key and foreign key
relationships.
b) Increases development productivity: it is not necessary to code
SQL statements to enforce referential constraints, because the Teradata RDBMS
automatically enforces referential integrity.
A candidate key is a combination of attributes that can be uniquely used to
identify a database record without any extraneous data. Each table may have
one or more candidate keys. One of these candidate keys is selected as the
table primary key.
A check constraint (also known as table check constraint) is a condition that
defines valid data when adding or updating an entry in a table of a relational
database. A check constraint is applied to each row in the table. The
constraint must be a predicate. It can refer to a single or multiple columns of
the table. The result of the predicate can be TRUE, FALSE, or UNKNOWN,
depending on the presence of NULLs. If the predicate evaluates to
UNKNOWN, then the constraint is not violated and the row can be inserted
or updated in the table.
Each check constraint has to be defined in the CREATE TABLE or ALTER
TABLE statement using the syntax:
CREATE TABLE table_name (..., CONSTRAINT constraint_name CHECK (predicate), ...)
ALTER TABLE table_name ADD CONSTRAINT constraint_name CHECK (predicate)
NOT NULL Constraint:
A NOT NULL constraint is functionally equivalent to the following check constraint
with an IS NOT NULL predicate: CHECK (column IS NOT NULL)
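The TRUE / FALSE / UNKNOWN behaviour described above can be observed with a sketch that assumes Python's built-in sqlite3 module. The account table, the constraint name positive_balance, and the helper try_insert are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE account (
        account_number TEXT NOT NULL,
        balance REAL,
        CONSTRAINT positive_balance CHECK (balance > 0)
    )
    """
)

def try_insert(account_number, balance):
    # Returns True if the row was accepted, False if a constraint rejected it.
    try:
        conn.execute("INSERT INTO account VALUES (?, ?)", (account_number, balance))
        return True
    except sqlite3.IntegrityError:
        return False

accepted_positive = try_insert("A-101", 500.0)     # predicate is TRUE
accepted_negative = try_insert("A-102", -50.0)     # predicate is FALSE
accepted_null_balance = try_insert("A-103", None)  # predicate is UNKNOWN, so not violated
accepted_null_number = try_insert(None, 100.0)     # NOT NULL is violated
```

The row with a NULL balance is accepted because the predicate evaluates to UNKNOWN; adding NOT NULL to balance would close that gap.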
A domain is defined as the set of all unique values permitted for an attribute.
For example, a domain of date is the set of all possible valid dates, a domain
of integer is all possible whole numbers, a domain of day-of-week is
Monday, Tuesday ... Sunday.
This in effect is defining rules for a particular attribute. If it is determined that
an attribute is a date then it should be implemented in the database to prevent
invalid dates being entered.
A domain of possible values should be associated with every attribute. These
domain constraints are the most basic form of integrity constraint. They are
easy to test for when data is entered.
1. Attributes may have the same domain, e.g. cname and employee-name.
2. It is not as clear whether bname and cname domains ought to be distinct.
3. At the implementation level, they are both character strings.
4. At the conceptual level, we do not expect customers to have the same
names as branches, in general.
5. Strong typing of domains allows us to test for values inserted, and
whether queries make sense. Newer systems, particularly object-oriented
database systems, offer a rich set of domain types that can be extended
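A rough sketch of these points in Python, not tied to any particular DBMS. All names (DayOfWeek, in_day_domain, CustomerName, BranchName, comparable) are illustrative. It shows an enumerated domain used to test inserted values, and two domains that are both character strings at the implementation level but are kept distinct at the conceptual level.

```python
from enum import Enum

class DayOfWeek(Enum):
    # The domain day-of-week: the complete set of permitted values.
    MONDAY = "Monday"
    TUESDAY = "Tuesday"
    WEDNESDAY = "Wednesday"
    THURSDAY = "Thursday"
    FRIDAY = "Friday"
    SATURDAY = "Saturday"
    SUNDAY = "Sunday"

def in_day_domain(value):
    # Domain constraint test: is the value one of the permitted values?
    try:
        DayOfWeek(value)
        return True
    except ValueError:
        return False

class CustomerName(str):
    """Domain of customer names (cname)."""

class BranchName(str):
    """Domain of branch names (bname)."""

def comparable(a, b):
    # Strong typing: a comparison makes sense only within one domain.
    return type(a) is type(b)

cname = CustomerName("Perryridge")
bname = BranchName("Perryridge")
same_string = (cname == bname)          # equal as plain character strings
same_domain = comparable(cname, bname)  # but drawn from different domains
```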
Database Users
Users are differentiated by the way they expect to interact with the system:
- Application programmers – interact with the system through DML calls
- Sophisticated users – form requests in a database query language
- Specialized users – write specialized database applications that do
not fit into the traditional data processing framework
- Naïve users – invoke one of the permanent application programs that
have been written previously
  Examples: people accessing a database over the web, bank tellers, clerical staff
Keys
- A super key of an entity set is a set of one or more attributes
whose values uniquely determine each entity.
- A candidate key of an entity set is a minimal super key.
  - customer_id is a candidate key of customer
  - account_number is a candidate key of account
- Although several candidate keys may exist, one of the candidate
keys is selected to be the primary key.
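The definitions of super key and candidate key can be sketched in a few lines of Python. The helpers is_superkey and candidate_keys and the sample customer rows are illustrative. Keys are properly declared on the schema; a data sample like this one can only show that an attribute set is not a key, never prove that it is one.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    # A super key: no two rows agree on all of the given attributes.
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def candidate_keys(rows, attributes):
    # Candidate keys are the minimal super keys: try small sets first and
    # skip any set that contains an already-found key.
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys

customers = [
    {"customer_id": "C1", "name": "Hayes", "street": "Main", "city": "Harrison"},
    {"customer_id": "C2", "name": "Hayes", "street": "North", "city": "Rye"},
    {"customer_id": "C3", "name": "Jones", "street": "Main", "city": "Harrison"},
    {"customer_id": "C4", "name": "Hayes", "street": "Main", "city": "Harrison"},
]

found = candidate_keys(customers, ["customer_id", "name", "street", "city"])
```

In this sample, {customer_id, name} is a super key but not a candidate key, because customer_id alone already identifies each row.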