Introduction to DBMS
By Dr. Kamal Gulati
For more notes, see www.mybigdataanalytics.in
Table of Contents
1. INTRODUCTION
1.1. DBMS Definitions
1.1.1. Database
1.1.2. DBMS
1.1.3. Database system
1.2. Components of database
1.2.1. Database administrator (DBA)
1.2.2. Database designer
1.2.3. End users
1.3. Advantages of DBMS
1.4. Disadvantages of the File Processing System
2. DATA MODELS
2.1. Categories of data models
2.2. Schemas and instances
2.3. DBMS architecture
2.4. Data independence
2.4.1. Logical data independence
2.4.2. Physical data independence
2.5. Classification of database management systems
2.5.1. Relational data model
2.5.2. Network data model
2.5.3. Hierarchical data model
2.5.4. Object-oriented data model
2.6. Database languages and interfaces
2.6.1. DBMS languages
2.6.2. DBMS interfaces
2.7. Database system environment
2.7.1. Data manager
2.7.2. DDL compiler
2.7.3. Run-time database processor
2.7.4. Query compiler
2.7.5. Pre-compiler
2.8. Entity Relationship Model
2.8.1. Entities and attributes
2.8.2. Entity types, entity sets, keys and value sets
2.8.3. Relationship types, sets and instances
2.8.4. Notations for ER diagrams
2.8.5. Generalization
2.8.6. Aggregation
3. RELATIONAL MODEL
3.1. Characteristics of relations
3.2. Operations of the relational model
3.3. Relational algebra operations
3.4. Set-theoretic operations
3.4.1. Union
3.4.2. Intersection
3.4.3. Set difference
3.4.4. Join operation
3.4.5. Division operation
3.4.6. Aggregate functions
3.4.7. COUNT
3.4.8. Grouping
3.4.9. Recursive closure operation
3.4.10. Outer join
3.5. Tuple relational calculus
3.5.1. Expressions and formulas in tuple calculus
3.5.2. Existential and universal quantifiers
3.5.3. Rules for the definition of a formula
3.6. Transforming the universal and existential quantifiers
3.6.1. Domain relational calculus
4. Database Design
4.1. Schema Refinement
4.1.1. Guidelines for relation schemas
4.2. Functional Dependencies
4.2.1. Inference rules for Functional Dependencies
4.2.2. Axioms to check if an FD holds
4.2.3. An Algorithm to Compute Attribute Closure X+ with respect to F
4.3. NORMALIZATION
4.3.1. Basics of normal forms
4.4. Inclusion dependency
5. TRANSACTION MANAGEMENT
5.1. Transaction Concept
5.2. Transaction states
5.3. Implementation of atomicity and durability
5.4. Concurrent Execution
5.5. Schedules
5.6. Serializability
5.6.1. Conflict Serializability
5.6.2. View Serializability
5.7. Recoverability
5.7.1. Recoverable schedules
5.7.2. Cascadeless schedules
5.8. Testing for Serializability
5.9. Precedence graphs
6. Concurrency control
6.1. Lock-based protocols
6.1.1. Locks
6.1.2. Granting of locks
6.1.3. Avoiding starvation of transactions by granting locks
6.2. Two-phase locking protocol
6.3. Graph-based protocol
6.4. Timestamp-based protocol
6.5. Validation-based protocol
6.6. Recovery system
6.6.1. Failure Classification
6.6.2. Log-based recovery
6.7. Deferred Database Modification
6.8. Immediate Database Modification
7. Centralized and Distributed Databases
7.1. Distributed Database Systems
7.2. Some advantages of the DDBMS
7.3. Some additional properties
7.4. Physical hardware level
7.5. Client-Server Architecture
7.6. Data fragmentation
7.6.1. Horizontal fragmentation
7.6.2. Vertical fragmentation
7.6.3. Mixed fragmentation
7.7. Data Replication
7.8. Deadlock handling
7.8.1. Deadlock prevention
7.8.2. Deadlock detection and recovery
8. SQL (Structured Query Language)
8.1. DDL Statements
8.1.1. Implicit commits
8.1.2. Data dictionary
8.2. DML
8.3. Language Structure
8.4. Basic SQL Queries
8.4.1. SQL data statements
8.4.2. SQL transaction statements
8.4.3. SQL schema statements
8.5. Union, Intersect and Except
8.5.1. ALL
8.6. Cursors
8.6.1. Explicit Cursors
8.6.2. Implicit Cursors
8.7. Triggers
8.7.1. Creating Triggers
8.8. Dynamic SQL
9. QBE
10. Query Processing and Optimization
10.1. Query Processing
10.2. Query Optimization
10.3. Indexes
10.4. Selectivities
10.5. Uniformity
10.6. Disjunctive Clauses
10.7. Join Selectivities
10.8. Views
11. OODBMS
11.1. Characteristics of Object-Oriented Databases
11.2. Advantages of OODBMS
11.3. Disadvantages of OODBMS
12. ORACLE
12.1. Storage
12.2. Database Schema
12.3. Memory architecture
12.3.1. Library cache
12.3.2. Data dictionary cache
12.3.3. Program Global Area
12.4. Configuration
13. Objective Questions
1. INTRODUCTION
A Database Management System (DBMS) is a set of computer programs that
controls the creation, maintenance, and the use of the database of an
organization and its end users. It allows organizations to place control of
organization-wide database development in the hands of database
administrators (DBAs) and other specialists. DBMSes may use any of a variety of
database models, such as the network model or relational model. In large
systems, a DBMS allows users and other software to store and retrieve data in a
structured way. It helps to specify the logical organization for a database and
access and use the information within a database. It provides facilities for
controlling data access, enforcing data integrity, managing concurrency
control, and restoring the database after failures.
The first DBMSes appeared during the 1960s, at a time when projects of
momentous scale were being contemplated, planned and engineered. Never
before had such large datasets been assembled in this new technology.
Problems were identified on the ground, and solutions were researched and
developed, often in real time.
The DBMS became necessary because the data was far more volatile than had
earlier been planned, and because the costs associated with data storage media
were still a major limiting factor. Data grew as a collection, and it also
needed to be managed at a detailed, transaction-by-transaction level. In the
1980s, all the major vendors of hardware systems large enough to support the
evolving record-keeping needs of larger organizations bundled some form of
DBMS with their system solution.
The first DBMSes were thus very much vendor-specific. IBM, as usual, led
the field, but there was a growing number of competitors and clones whose
database solutions offered varying entry points into the bandwagon of
computerized record-keeping systems.
1.1. DBMS Definitions
Some of the technical terms of DBMS are defined as below:
1.1.1. Database
A database is a logically coherent collection of data with some inherent meaning,
representing some aspect of the real world, which is designed, built and
populated with data for a specific purpose. Example: consider names, telephone
numbers and addresses. You could record this data in an indexed address book;
to maintain it as a database, you would generally use software such as dBASE IV,
MS Access or Excel.
1.1.2. DBMS
It is a collection of programs that enables users to create and maintain a
database. In other words, it is general-purpose software that provides users
with the facilities for defining, constructing and manipulating the database for
various applications.
1.1.3. Database system
The database and the DBMS software together are called a database system.
1.2. Components of database
1.2.1. Database administrator (DBA)
In many organizations where many persons use the same resources, there is a
need for a chief administrator to manage these resources.
In a database environment, the primary resource is the database itself and the
secondary resource is the DBMS and the related software. To manage these
resources, we need the database administrator.
The DBA is responsible for authorizing access to the database and for acquiring
software and hardware resources as needed.
1.2.2. Database designer
They are responsible for identifying the data to be stored in the database and for
choosing appropriate structures to represent and store this data. It is the
database designer's responsibility to communicate with the database users and
to understand their requirements.
1.2.3. End users
These are the persons whose jobs require access to the database for
querying, updating and generating reports. Databases generally exist for
their use.
There are several categories of end users:
A. Casual end users: occasionally access the database, but they may need
different information each time.
B. Parametric end users: make up a sizable portion of database end users.
Their main job function involves constantly querying and updating the
database, using standard types of queries and updates, called canned
transactions, that have been carefully programmed and tested. Bank
tellers, for example, check account balances and post withdrawals and
deposits.
C. Sophisticated end users: include engineers, scientists and business
analysts who thoroughly familiarize themselves with the facilities of the
DBMS so as to implement applications that meet their complex
requirements.
D. Standalone end users: maintain personal databases by using
ready-made software that provides easy-to-use menu-based or graphical
interfaces. Example: tax packages that store a variety of personal
financial data for tax purposes.
E. System analysts and application programmers: system analysts
determine the requirements of end users, especially parametric end
users, and develop specifications for canned transactions that meet those
requirements. Application programmers implement these specifications as
programs and then test, debug, document and maintain the canned
transactions. These programmers are also known as software engineers.
1.3. Advantages of DBMS
1. Controlling redundancy
2. Restricting unauthorized access
3. Providing persistent storage for program objects and data structures
4. Database interfacing
5. Providing multiple user interfaces
6. Representing complex relationships among data
7. Enforcing integrity constraints
8. Providing backup and recovery
1.4. Disadvantages of the File Processing System
1. Data redundancy and inconsistency.
2. Difficulty in accessing data.
3. Data isolation.
4. Data integrity problems.
5. Concurrent access anomalies.
6. Security problems.
2. DATA MODELS
A data model is a set of concepts that can be used to describe the structure of a
database. By the structure of the database we mean the data types,
relationships and constraints that should hold for the data. Most data models
also include a set of basic operations for specifying retrievals and modifications
on the data.
2.1. Categories of data models
A. High-level or conceptual data models: provide concepts that are close
to the way users perceive the data. High-level data models use concepts
such as entities, attributes and relationships.
• Entity: represents a real-world object, such as an employee or a project,
that is stored in the database.
• Attribute: represents some property of interest that further describes an
entity, such as an employee's name or salary.
• Relationship: represents an association between two or more entities.
B. Low-level or physical data models: describe how the data is stored in
the computer.
C. Representational or implementation data models: hide some of the
details of data storage but can be implemented on a computer system in a
direct way.
2.2. Schemas and instances
The description of the database is called the database schema. The database
schema is specified during database design and is not expected to change
frequently. A displayed schema is called a schema diagram.
The actual data in the database, however, may change frequently: changes
occur every time we add a new student or enter a new grade for a student. The
data in the database at a particular moment of time is called the database state
or instance or snapshot.
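The distinction can be illustrated with a small, hypothetical SQLite example (table and column names are made up for illustration): the CREATE TABLE statement defines the schema, while the rows present at any moment form the current state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# The schema: a description of the database, fixed at design time.
cur.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, grade TEXT)")

# The state (instance/snapshot): the actual data at a particular moment.
cur.execute("INSERT INTO student VALUES (1, 'Asha', 'A')")
state_1 = cur.execute("SELECT COUNT(*) FROM student").fetchone()[0]

# The state changes every time we add a student or enter a grade;
# the schema stays the same.
cur.execute("INSERT INTO student VALUES (2, 'Ravi', 'B')")
state_2 = cur.execute("SELECT COUNT(*) FROM student").fetchone()[0]

print(state_1, state_2)  # two successive snapshots of the same schema
```

Here the schema never changes between the two queries; only the state does.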
2.3. DBMS architecture
Three important characteristics of the database approach are:
1. Insulation of programs and data
2. Support of multiple user views
3. Use of a catalog to store the database schema
The architecture of a database system that reflects these characteristics is
called the three-schema architecture. It defines schemas at three levels:
1. Internal schema
2. Conceptual schema
3. External schema
1. Internal schema: it describes the physical storage structure of the
database. The internal schema uses a physical data model and describes
the complete details of data storage and access paths for the database.
2. Conceptual schema: it describes the structure of a whole database for a
community of users. The conceptual schema hides the details of physical
storage structure. High-level data model or an implementation data model
can be used at this level.
3. External schema: it describes the part of the database that a particular
user group is interested in and hides the rest of the database from that
user group.
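A rough way to see two of these levels in practice is sketched below, using SQLite for illustration (table and view names are hypothetical): the base table plays the role of the conceptual schema, the storage engine's file layout is the internal schema, and a view tailored to one user group acts as an external schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Conceptual level: the whole logical structure of the database.
cur.execute("CREATE TABLE employee (emp_id INTEGER, name TEXT, salary REAL)")
cur.execute("INSERT INTO employee VALUES (1, 'Meena', 50000.0)")

# External level: a view for a user group that must not see salaries;
# it exposes only part of the database and hides the rest.
cur.execute("CREATE VIEW emp_public AS SELECT emp_id, name FROM employee")

row = cur.execute("SELECT * FROM emp_public").fetchone()
print(row)  # the salary column is hidden from this user group
```

The internal level is not visible in SQL at all; it is the storage layout the engine chooses for the `employee` table.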
2.4. Data independence
The three-schema architecture can be used to explain the concept of data
independence, which is defined as the capacity to change the schema at one
level of a database system without having to change the schema at the next
higher level.
There are two types of data independence:
2.4.1. Logical data independence
This is the capacity to change the conceptual schema without having to change
external schema or application programs. We can change the conceptual
schema to expand the database or to reduce the database.
2.4.2. Physical data independence
This is the capacity to change the internal schema without having to change the
conceptual or external schemas. Changes to the internal schema may be
needed because some physical files have to be reorganized, for example by
creating additional access structures to improve the performance of retrieval or
updates.
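A minimal sketch of logical data independence, again using SQLite with hypothetical names: the conceptual schema is expanded with a new column, but the external view and the queries written against it keep working unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE course (code TEXT, title TEXT)")
cur.execute("INSERT INTO course VALUES ('CS101', 'Intro to DBMS')")

# External schema: a view used by some application program.
cur.execute("CREATE VIEW course_titles AS SELECT code, title FROM course")
before = cur.execute("SELECT * FROM course_titles").fetchone()

# Expand the conceptual schema: add a credits column to the table.
cur.execute("ALTER TABLE course ADD COLUMN credits INTEGER")

# The external schema (the view) and queries on it are unaffected.
after = cur.execute("SELECT * FROM course_titles").fetchone()
print(before == after)
```

Because the view names its columns explicitly, expanding the underlying table does not disturb it; that insulation is exactly what logical data independence promises.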
2.5. Classification of database management system
We can categorize the DBMS as follows:
1. Relational data model
2. Network data model
3. Hierarchal data model
4. Object oriented data model
2.5.1. Relational data model
The relational data model represents a database as a collection of tables, where
each table may be stored as a separate file. Most relational databases have a
high-level query language and support a limited form of user views.
2.5.2. Network data model
It represents data as record types, and also represents a limited type of 1:N
relationship, called a set type.
2.5.3. Hierarchical data model
It represents data as hierarchical tree structures. Each hierarchy represents a
number of related records. There is no standard language for the hierarchical
model.
2.5.4. Object oriented data model
It defines a database in terms of objects, their properties and their operations.
Objects with the same structure and behavior belong to a class, and classes are
organized into a hierarchy or acyclic graph.
2.6. Database languages and interfaces
2.6.1. DBMS languages
The first step is to specify conceptual and internal schemas for the database,
and any mappings between the two. In many DBMSes, where no strict
separation of levels is maintained, one language, called the data definition
language (DDL), is used by the DBA and the database designers to define both
schemas.
The DBMS contains a DDL compiler whose function is to process DDL
statements, identify the descriptions of the schema constructs, and store the
schema description in the DBMS catalog. Where a clear separation is
maintained between the conceptual schema and the internal schema:
A. The DDL is used to specify the conceptual schema only.
B. An SDL (storage definition language) is used to specify the internal
schema only.
The mapping between the two levels may be specified in either of these
languages. In some DBMSes, a VDL (view definition language) is used to
specify user views and their mappings to the conceptual schema, but in most
DBMSes the DDL is used to specify both the conceptual and external schemas.
Once the database schema has been created and the database filled with data,
users must be able to manipulate the database. Manipulations include:
• Retrieval
• Insertion
• Deletion
• Modification
For this purpose the DBMS provides a DML (data manipulation language).
2.6.1.1. DML (data manipulation language)
There are two main types of DMLs:
1. High-level or non-procedural DML (e.g. SQL)
2. Low-level or procedural DML
1. High-level or non-procedural DML: can be used to specify complex
database operations concisely. Many DBMSes allow high-level DML
statements either to be entered interactively from a terminal or to be
embedded in a general-purpose programming language. In the latter
case, the DML statements must be identified within the program so that
they can be extracted by a pre-compiler and processed by the DBMS. A
high-level DML such as SQL can specify and retrieve many records in a
single DML statement, and such DMLs are therefore called set-at-a-time
or set-oriented DMLs.
2. Low-level or procedural DML: must be embedded in a general-purpose
programming language. This type of DML typically retrieves individual
records or objects from the database and processes each separately.
Hence it needs programming-language constructs, such as loops, to
retrieve and process each record from a set of records; low-level DMLs
are also called record-at-a-time DMLs because of this property. Whenever
DML commands, whether high- or low-level, are embedded in a
general-purpose programming language, that language is called the host
language and the DML is called the data sublanguage. On the other hand,
a high-level DML used in a stand-alone, interactive manner is called a
query language.
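The contrast can be sketched with SQLite embedded in Python as the host language (the table and values are illustrative): the UPDATE is set-at-a-time, touching all matching rows in one statement, while the loop below it processes records one at a time through a cursor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE account (acc_no INTEGER, balance REAL)")
cur.executemany("INSERT INTO account VALUES (?, ?)",
                [(1, 100.0), (2, 200.0), (3, 300.0)])

# Set-at-a-time (high-level DML): one statement updates many records.
cur.execute("UPDATE account SET balance = balance * 1.05 WHERE balance >= 200")

# Record-at-a-time (procedural style): the host language loops over a
# cursor and processes each record separately.
total = 0.0
for acc_no, balance in cur.execute("SELECT acc_no, balance FROM account"):
    total += balance

print(round(total, 2))
```

Here Python is the host language and SQL the data sublanguage; the loop plays the role a procedural DML would play inside an application program.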
2.6.2. DBMS interfaces
User friendly interfaces provided by a DBMS may include the following:
Menu based interfaces: these interfaces present the user within list of options,
called menus, which lead the user through the formulation of a request. The
query is composed step-by-step by picking option from a menu that is displayed
by the system.
Forms based interface: a form-based interface display a form to each users.
Users can file out all of the form entries to insert new data or they file only certain
entries. Forms are actually designed and programmed for parametric end users.
Graphical user interface: GUI displays a schema to the user. User can then
specify a query by manipulating the diagram. Most GUI uses a pointing device as
mouse to pick up the certain part of the displayed schema.
Natural language interface: natural language interface refers to the world in its
schema as well as a set of standard word to interpret the request. If the
interpretation is successful, the interface generate a high level query
corresponding to the natural language request and submit it to the DBMS for
processing.
Interfaces for parametric users: parametric users, such as bank tellers, often
have a small set of operations that they must perform repeatedly. Systems
analysts and programmers design and implement a special interface for such
users, often assigning function keys so that each command runs automatically
with minimal keystrokes.
Interfaces for the DBA: these interfaces are used by the DBA staff. They
include commands for creating accounts, setting system parameters, granting
account authorization, changing a schema, and reorganizing the storage
structures of a database.
2.7. Database system environment
The database and the DBMS catalog are usually stored on disk. Access to the
disk is controlled primarily by the operating system, which schedules disk
input/output.
2.7.1. Data manager
This module of the DBMS controls:
A. Access to DBMS information stored on the disk.
B. Use of basic OS services for carrying out low-level data transfer
between the disk and computer main storage.
C. Handling of buffers in main memory.
2.7.2. DDL compiler
It processes schema definitions specified in the DDL and stores the
descriptions of the schemas in the DBMS catalog. The DBMS catalog includes the
following information:
• Name of the files
• Data items
• Storage details of each file
• Mapping information
2.7.3. Run-time database processor
It handles database accesses. It receives retrieval or update operations and
carries them out on the database. Access to the disk goes through the stored
data manager.
2.7.4. Query compiler
Handles high-level queries that are entered interactively, compiles them, and
generates calls to the run-time processor for executing the code.
2.7.5. Pre-compiler
Extracts DML commands from an application program written in a host language.
These commands are sent to the DML compiler for compilation into object code.
2.8. Entity Relationship Model
Designing a successful database application involves two notions that play a
major role:
• Database application
• Application program
Database application: refers to a particular database (for example, a bank
database) together with the associated programs that implement its queries
and updates.
Example: programs that implement database updates corresponding to customers
making deposits and withdrawals. These programs provide user-friendly
graphical user interfaces (GUIs) utilizing forms and menus.
2.8.1. Entities and attributes
Entities: the basic object that the ER model represents is an entity. An
entity may be an object with a physical existence, such as a particular
person, car, house, or employee, or it may be an object with a conceptual
existence, such as a company, a job, or a university course.
Attribute: a particular property that describes an entity.
Ex: an employee entity may be described by the employee's name, age, address,
salary, and job.
Composite attribute: a composite attribute can be divided into subparts, which
represent more basic attributes with independent meanings.
Simple or atomic attribute: Attributes that are not divisible are called simple or
atomic attribute.
Single valued attributes: most attributes have a single value for a particular
entity; such attributes are called single-valued attributes. Ex: age is a
single-valued attribute of a person.
Multi valued attributes: attributes that may have more than one value for a
particular entity. Ex: the Colors attribute of a car; a one-color car has a
single value, whereas a two-tone car has two values. Such attributes are
called multi-valued attributes.
Stored and derived attributes: in some cases two attribute values are related,
for example the age and birth date of a person. The value of age can be
determined from the current date and the value of the person's birth date. The
age attribute is therefore called a derived attribute, and birth date is
called a stored attribute.
Null values: in some cases a particular entity may not have an applicable
value for an attribute. Ex: apartment number.
Complex attributes: composite attributes are written between parentheses ( )
with their components separated by commas, and multi-valued attributes between
braces { }. Attributes that nest composite and multi-valued attributes in this
way are called complex attributes.
{address_phone({phone(area_code, phone_number)})}
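The nested notation above can be sketched as Python data; this is illustrative only, with invented phone numbers. Composite attributes become dicts and multi-valued attributes become lists.

```python
# Illustrative sketch: the complex attribute
# {address_phone({phone(area_code, phone_number)})} as nested Python data.
# Composite attributes -> dicts; multi-valued attributes -> lists.
person = {
    "address_phone": [            # multi-valued, hence a list
        {"phone": [               # phone is itself multi-valued
            {"area_code": "212", "phone_number": "555-1234"},
            {"area_code": "212", "phone_number": "555-9876"},
        ]}
    ]
}

# Reaching one atomic component of the complex attribute:
first_area_code = person["address_phone"][0]["phone"][0]["area_code"]
print(first_area_code)  # 212
```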
2.8.2. Entity types, entity sets, keys and values sets
Entity types: an entity type defines a collection (or set) of entities that
have the same attributes. Each entity type in the database is described by its
name and attributes.
Entity sets: the collection of all entities of a particular entity type in the
database at any point in time is called an entity set.
Key attributes: an entity type usually has an attribute whose values are
distinct for each individual entity in the collection. Such an attribute is
called a key attribute.
Value sets (domains of attributes): each simple attribute of an entity type is
associated with a value set (or domain of values), which specifies the set of
values that may be assigned to that attribute for each individual entity.
Ex: for an employee entity, age may be specified to lie in the range 16 to 70.
2.8.3. Relationship types, sets and instances
An association among entities is called a relationship. A relationship type R
among n entity types E1, E2, ..., En defines a set of associations among
entities from these types. In other words, the relationship set R is a set of
relationship instances.
Degree of a relationship type: the degree of a relationship type is the number
of participating entity types.
Ex: work for relationship is of degree two.
Degree two- binary relationship
Degree three - ternary relationship
Role name: each entity type that participates in a relationship type plays a
particular role in the relationship. The role name signifies the role that a
participating entity plays in each relationship instance and helps to explain
what the relationship means.
Recursive relationship: role names are not strictly necessary when all the
participating entity types are distinct, since each entity type name can serve
as the role name. In some cases, however, the same entity type participates
more than once in a relationship type, in different roles. In such cases the
role names become essential for distinguishing the meaning of each
participation. Such relationship types are called recursive relationships.
Ex: employee and supervisor entities are both members of the same EMPLOYEE
entity type.
Weak entity type: entity types that do not have a key attribute of their own
are called weak entity types. A weak entity type is sometimes called a child
entity type.
Regular/strong entity type: entity types that have a key attribute are called
regular or strong entity types. The identifying entity type is also sometimes
called the parent entity type or dominant entity type.
2.8.4. Notations for ER diagram
(The table of ER diagram notation symbols and their meanings is omitted in
this text version.)
2.8.5. Generalization
Generalization is the reverse of specialization: a process of abstraction in
which we suppress the differences among several entity types, identify their
common features, and generalize them into a single superclass.
2.8.6. Aggregation
Aggregation is an abstraction concept for building composite objects from
their component objects. There are cases where this concept can be used and
related to the EER model:
• When we aggregate attribute values of an object to form the whole object.
• When we represent an aggregation relationship as an ordinary relationship.
• When combining objects that are related by a particular relationship
instance.
3. RELATIONAL MODEL
The relational model represents the database as a collection of relations. A
relation can be thought of as a table of values, where each row in the table
represents a collection of related data values.
In the relational model, each row in a table corresponds to an entity or
relationship. In relational model terminology, a row is called a tuple, a
column is called an attribute, and the table itself is called a relation.
The data type describing the set of values that can appear in each column is
called a domain.
Domain:
A domain D is a set of atomic values. Atomic means that each value in the
domain is indivisible.
Ex: USA_phone_number, the set of 10-digit phone numbers.
Relation schema:
A relation schema R is denoted R(A1, A2, A3, ..., An), where
R is the relation name and
the Ai are its attributes, for i = 1, 2, 3, ..., n.
Ex: STUDENT(name, SSN, home_phone, address, office_phone, age)
3.1. Characteristics of relation
1. Ordering of tuples in a relation: a relation is defined as a set of
tuples. Tuples in a relation do not have any particular order.
2. Ordering of values within a tuple: an n-tuple is an ordered list of n
values, so the ordering of values within a tuple is significant.
3. Values in the tuples: each value in a tuple is an atomic value, i.e. it
is not divisible into components. In the basic relational model,
composite and multi-valued attributes are not allowed.
4. Interpretation of a relation: the relation schema can be interpreted as
a declaration, or as a type of assertion.
Relational constraints: here we study the restrictions that apply to a
database schema. These include:
Domain constraints: the value of each attribute must be an atomic value from
its domain.
Key constraints: a relation is defined as a set of tuples, and all elements of
a set are distinct. Hence all tuples in a relation must be distinct: no two
tuples can have the same combination of values for all their attributes.
Entity integrity constraints: no primary key value can be null, because the
primary key is used to identify the individual tuples in a relation.
Referential integrity constraints: specified between two relations, and used
to maintain consistency among tuples of the two relations. It is based on the
concept of foreign keys.
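As a runnable sketch of a referential integrity (foreign key) constraint, here is an illustrative SQLite session from Python; the table and column names follow the employee/department examples used elsewhere in these notes, and the SSN values are invented.

```python
import sqlite3

# Sketch: a foreign key enforcing referential integrity between employee
# and department. SQLite enforces FKs only when the pragma is enabled.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE department (dnumber INTEGER PRIMARY KEY, dname TEXT)")
conn.execute("""CREATE TABLE employee (
    ssn TEXT PRIMARY KEY,
    ename TEXT,
    dnumber INTEGER REFERENCES department(dnumber))""")
conn.execute("INSERT INTO department VALUES (5, 'Research')")
conn.execute("INSERT INTO employee VALUES ('123456789', 'Smith', 5)")  # ok

try:
    # dnumber 9 does not exist in department, so the constraint rejects it
    conn.execute("INSERT INTO employee VALUES ('987654321', 'Jones', 9)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```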
3.2. Operations of the relation model
Operations on the relational model can be categorized into retrievals and
updates. There are three basic update operations on relations.
Insert operation: provides a list of attribute values for a new tuple t that
is to be inserted into a relation R.
Delete operation: used to delete a tuple from a relation; a condition selects
the tuple(s) to delete. A deletion can violate referential integrity if the
tuple being deleted is referenced by foreign keys from other tuples in the
database.
Ex: DELETE FROM employee
WHERE ssn = '985676';
Update operation: used to change the values of one or more attributes in a
tuple of a relation R.
Ex: UPDATE employee
SET age = 25
WHERE ssn = '576787';
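The three update operations can be sketched as a runnable SQLite session from Python; the employee rows and SSN values are illustrative.

```python
import sqlite3

# Sketch of the three basic update operations: insert, delete, update.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ssn TEXT PRIMARY KEY, ename TEXT, age INTEGER)")

# Insert: supply a value for every attribute of the new tuple
conn.execute("INSERT INTO employee VALUES ('985676', 'Wong', 30)")
conn.execute("INSERT INTO employee VALUES ('576787', 'Zelaya', 24)")

# Delete: a condition selects the tuple(s) to remove
conn.execute("DELETE FROM employee WHERE ssn = '985676'")

# Update: change one or more attribute values in the selected tuple(s)
conn.execute("UPDATE employee SET age = 25 WHERE ssn = '576787'")

print(conn.execute("SELECT ssn, age FROM employee").fetchall())
# [('576787', 25)]
```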
3.3. Relational algebra operation
1. Select operation: selects the subset of tuples from a relation that
satisfy a selection condition, i.e. it selects some of the rows of a
relation.
2. Project operation: selects some of the columns (a set of attributes) of a
relation.
3. Rename operation: renames either the relation name or the attribute
names, or both.
RENAME (old relation name) TO (new relation name)
3.4. Set theoretic operation
Several set theoretic operations are used to merge the elements of two sets in
various ways. These operations are as follows.
3.4.1. Union
The result of this operation, denoted R ∪ S, is a relation that includes all
tuples that are in R, in S, or in both R and S. Duplicate tuples are
eliminated.
R ∪ S = S ∪ R (union is commutative)
SELECT salesman "ID", name
FROM sales_master
WHERE city = 'Mumbai'
UNION
SELECT client "ID", name
FROM client_master
WHERE city = 'Mumbai';
3.4.1.1. Restrictions on using a union operation
1. The number of columns in all the queries must be the same.
2. The data types of the corresponding columns in each query must be the
same.
3. UNION cannot be used in a subquery.
4. Aggregate functions cannot be used with a UNION clause.
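A runnable sketch of a union that obeys the restrictions above (same column count, matching column types), using SQLite from Python; the table contents are invented for the example.

```python
import sqlite3

# Sketch of R UNION S with duplicate elimination across two tables that
# have the same column count and types.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_master (id INTEGER, name TEXT, city TEXT)")
conn.execute("CREATE TABLE client_master (id INTEGER, name TEXT, city TEXT)")
conn.executemany("INSERT INTO sales_master VALUES (?,?,?)",
                 [(1, 'Asha', 'Mumbai'), (2, 'Ravi', 'Delhi')])
conn.executemany("INSERT INTO client_master VALUES (?,?,?)",
                 [(1, 'Asha', 'Mumbai'), (3, 'Meena', 'Mumbai')])

rows = conn.execute("""
    SELECT id, name FROM sales_master WHERE city = 'Mumbai'
    UNION
    SELECT id, name FROM client_master WHERE city = 'Mumbai'
""").fetchall()
# The duplicate (1, 'Asha') appears only once in the union.
print(sorted(rows))
```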
3.4.2. Intersection
The result of this operation, denoted R ∩ S, is a relation that includes all
tuples that are in both R and S.
SELECT salesman "ID", name
FROM sales_master
WHERE city = 'Mumbai'
INTERSECT
SELECT client "ID", name
FROM client_master
WHERE city = 'Mumbai';
3.4.3. Set difference
The result of this operation, denoted R − S, is a relation that includes all
tuples that are in R but not in S.
SELECT product_no FROM product_master
MINUS
SELECT product_no FROM sales_order;
3.4.4. Join operation
Denoted by ⋈, the join is used to combine related tuples from two relations
into single tuples. This operation is very important because it allows us to
process relationships among relations.
R ⋈ (join condition) S
There are several categories of join operations:
1. Cartesian product (cross product or cross join): the main difference
between the Cartesian product and the join is that in a join, only the
combinations of tuples satisfying the join condition appear in the
result.
2. Equi join: a join in which the only comparison operator used is =. In
the result of an equi join, each pair of attributes with identical
values is superfluous; removing one attribute of each such pair yields
the natural join, R * S.
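The natural join R * S can be sketched in plain Python on relations represented as lists of dicts; the implicit equi-join condition requires agreement on all shared attribute names. The relation contents here are illustrative.

```python
# Sketch of natural join R * S: tuples are combined when they agree on all
# shared attribute names (here 'dnumber'), and the shared attribute appears
# only once in the result.
def natural_join(r, s):
    common = set(r[0]) & set(s[0]) if r and s else set()
    return [
        {**tr, **ts}
        for tr in r for ts in s
        if all(tr[a] == ts[a] for a in common)  # the implicit equi-join condition
    ]

employee = [{"ename": "Smith", "dnumber": 5}, {"ename": "Wong", "dnumber": 4}]
department = [{"dnumber": 5, "dname": "Research"}]

print(natural_join(employee, department))
# [{'ename': 'Smith', 'dnumber': 5, 'dname': 'Research'}]
```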
3.4.5. Division operation
The division operation is used for a special kind of query that sometimes
occurs in database applications.
3.4.6. Aggregate function
These functions apply to collections of values from the database and include
the following:
SUM, AVERAGE, MAX, MIN
3.4.7. COUNT
This function is used to count tuples or attribute values.
3.4.8. Grouping
This is used to group the tuples of a relation by the values of some of their
attributes, typically before applying an aggregate function to each group.
SELECT company, SUM(amount) FROM sales
GROUP BY company
HAVING SUM(amount) > 10000;
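The grouping example above can be run as-is in SQLite; the sales figures below are invented for the sketch.

```python
import sqlite3

# Sketch of GROUP BY with an aggregate and a HAVING filter over the groups.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (company TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?,?)",
                 [("Acme", 8000), ("Acme", 7000), ("Beta", 4000)])

rows = conn.execute("""
    SELECT company, SUM(amount) FROM sales
    GROUP BY company
    HAVING SUM(amount) > 10000
""").fetchall()
print(rows)  # [('Acme', 15000)] -- Beta's total of 4000 is filtered out
```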
3.4.9. Recursive closure operation
This operation is used with a recursive relationship.
3.4.10. Outer join
The natural join is denoted by R * S,
where R and S are relations.
Only tuples from R that have a matching tuple in S appear in the result;
unmatched tuples are eliminated, as are tuples with null values in the join
attributes.
A set of operations, called outer joins, can be used when we want to keep all
the tuples in R, all the tuples in S, or all the tuples in both relations,
whether or not they have matching tuples.
The outer union is used to take the union of the tuples of two relations that
are not union compatible; such relations are called partially compatible,
meaning only some of their attributes are union compatible. These
union-compatible attributes must contain a key of both relations.
Left outer join:
R =>< S
keeps every tuple of R; if no match is found in S, the S attributes are filled
with null values.
Right outer join:
R ><= S
keeps every tuple of S.
Full outer join: R =><= S
{if no match is found, null values are set in the tuple}
Outer union example:
STUDENT (name, SSN, department, advisor)
FACULTY (name, SSN, department, rank)
RESULT (name, SSN, department, advisor, rank)
All the tuples of both relations appear in the result.
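The left outer join semantics above can be sketched in plain Python: every tuple of R is kept, and tuples without a match are padded with nulls (None). Relation and attribute names are illustrative.

```python
# Sketch of left outer join: every tuple of r appears in the result; when no
# matching s tuple exists, s's non-join attributes are padded with None.
def left_outer_join(r, s, common):
    s_attrs = [a for a in (s[0] if s else {}) if a not in common]
    result = []
    for tr in r:
        matches = [ts for ts in s if all(tr[a] == ts[a] for a in common)]
        if matches:
            result.extend({**tr, **ts} for ts in matches)
        else:
            result.append({**tr, **{a: None for a in s_attrs}})
    return result

student = [{"ssn": "1", "name": "Ann"}, {"ssn": "2", "name": "Bob"}]
advisor = [{"ssn": "1", "advisor": "Dr. Rao"}]
print(left_outer_join(student, advisor, ["ssn"]))
# [{'ssn': '1', 'name': 'Ann', 'advisor': 'Dr. Rao'},
#  {'ssn': '2', 'name': 'Bob', 'advisor': None}]
```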
3.5. Tuple relational calculus
Relational calculus is a formal query language. We write one declarative
expression to specify a retrieval request, and hence there is no description
of how to evaluate the query.
The tuple relational calculus is based on specifying a number of tuple
variables. Each tuple variable may take as its value any individual tuple from
the relation over which it ranges. A simple tuple relational calculus query is
of the form
{t | COND(t)}
The result is the set of all tuples t that satisfy COND(t).
Ex: find all employees whose salary is greater than 50,000.
{t | EMPLOYEE(t) AND t.salary > 50000}
To retrieve only some attributes, attribute names are qualified with tuple
variables:
{t.fname, t.lname | EMPLOYEE(t) AND t.salary > 50000}
This corresponds to the SQL query:
SELECT t.fname, t.lname
FROM employee AS t
WHERE t.salary > 50000;
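The correspondence between the calculus expression and the SQL query can be sketched as a Python comprehension over a toy employee relation (data invented):

```python
# Sketch: {t.fname, t.lname | EMPLOYEE(t) AND t.salary > 50000} as a
# comprehension. Each tuple t is a dict; "EMPLOYEE(t)" corresponds to
# iterating over the employee relation.
employee = [
    {"fname": "John", "lname": "Smith", "salary": 30000},
    {"fname": "Jennifer", "lname": "Wallace", "salary": 55000},
]

result = [(t["fname"], t["lname"]) for t in employee if t["salary"] > 50000]
print(result)  # [('Jennifer', 'Wallace')]
```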
3.5.1. Expression and formulas in tuples calculus
A general expression of the tuple relational calculus is of the form
{t1.A1, t2.A2, ..., tn.An | COND(t1, t2, ..., tn)}
where
t1, t2, ..., tn are tuple variables,
each Ai is an attribute of the relation on which ti ranges, and
COND is a condition or formula.
Formula:
A formula is made up of predicate calculus atoms, which can be one of the
following:
1. An atom of the form R(ti),
where R is a relation name and ti is a tuple variable.
R(ti) identifies the range of the tuple variable ti as the relation whose
name is R.
2. An atom of the form ti.A op tj.B,
where op is a comparison operator in the set {=, <, <=, >, >=, ≠},
ti and tj are tuple variables,
A is an attribute of the relation on which ti ranges, and
B is an attribute of the relation on which tj ranges.
3. An atom of the form ti.A op c or c op tj.B,
where op is a comparison operator,
ti and tj are tuple variables,
A and B are attributes of the relations on which ti and tj range, and
c is a constant value.
A formula is made up of one or more atoms connected via the logical operators
AND, OR, and NOT, and is defined recursively as follows:
1. Every atom is a formula.
2. If F1 and F2 are formulas, then so are (F1 AND F2), (F1 OR F2), NOT (F1),
and NOT (F2).
3. The truth values of these formulas are derived from their component
formulas F1 and F2 as follows:
a. (F1 AND F2) is true if both F1 and F2 are true.
b. (F1 OR F2) is false if both F1 and F2 are false; otherwise it is true.
c. NOT (F1) is true if F1 is false; it is false if F1 is true.
d. NOT (F2) is true if F2 is false; it is false if F2 is true.
3.5.2. Existence and universal quantifier
Two special symbols called quantifiers can appear in formulas; these are:
1. The universal quantifier (∀)
2. The existential quantifier (∃)
First we need to define the concepts of free and bound tuple variables in a
formula.
Bound: a tuple variable t is bound if it is quantified, meaning that it
appears in an (∃t) or (∀t) clause.
Free: otherwise, it is free.
Whether a tuple variable in a formula is free or bound is defined by the
following rules:
1. An occurrence of a tuple variable in a formula F that is an atom is free
in F.
2. An occurrence of a tuple variable t is free or bound in a formula made up
of logical connectives, (F1 AND F2), (F1 OR F2), NOT (F1), and NOT (F2),
depending on whether it is free or bound in F1 and F2. A tuple variable
may be free in F1 and bound in F2, or vice versa.
3. All free occurrences of a tuple variable t in F are bound in a formula F'
of the form F' = (∃t)(F) or F' = (∀t)(F); the tuple variable is bound to
the quantifier specified in F'.
Examples:
F1: d.dname = 'Research'
F2: (∃t)(d.dnumber = t.DNO)
F3: (∀d)(d.mgrssn = '12345677')
The tuple variable d is free in both F1 and F2, whereas it is bound to the
universal quantifier in F3; t is bound to the existential quantifier in F2.
3.5.3. Rules for the definition of a formula
1. If F is a formula, then so is (∃t)(F), where t is a tuple variable. The
formula (∃t)(F) is true if the formula F evaluates to true for some (at
least one) tuple assigned to the free occurrences of t in F; otherwise
(∃t)(F) is false.
2. If F is a formula, then so is (∀t)(F), where t is a tuple variable. The
formula (∀t)(F) is true if F evaluates to true for every tuple (in the
universe) assigned to the free occurrences of t in F; otherwise (∀t)(F)
is false.
Note: ∃ is called the existential quantifier because a formula (∃t)(F) is true
if there exists some tuple that makes F true; ∀ is called the universal
quantifier because (∀t)(F) is true only if every possible tuple makes F true.
3.6. Transforming the universal and existential quantifier
We can use transformations from mathematical logic that relate the universal
and existential quantifiers. It is possible to transform a universal
quantifier into an existential quantifier, and vice versa:
(∀t)(F) ≡ NOT (∃t)(NOT F)
3.6.1. Domain relational calculus
There is another type of relational calculus called the domain relational
calculus, or simply domain calculus. The QBE language is related to the domain
calculus; the formal specification of the domain calculus was proposed after
the development of the QBE language.
The domain calculus differs from the tuple calculus in the type of variables
used in formulas: rather than ranging over tuples, the variables range over
the domains of attributes.
An expression of the domain calculus is of the form
{x1, x2, ..., xn | COND(x1, x2, ..., xn, xn+1, ..., xn+m)}
where
x1, x2, ..., xn+m are domain variables that range over domains of attributes,
and
COND is a condition or formula of the domain relational calculus.
A formula is made up of atoms. An atom can be one of the following:
1. An atom of the form R(x1, x2, ..., xj),
where R is the name of a relation of degree j and each xi, 1 <= i <= j,
is a domain variable.
2. An atom of the form xi op xj,
where op is a comparison operator in the set {=, <, <=, >, >=, ≠} and xi
and xj are domain variables.
3. An atom of the form xi op c or c op xj,
where op is a comparison operator, xi and xj are domain variables, and c
is a constant value.
4. DATABASE DESIGN
Conceptual database design gives us a set of relational schemas and integrity
constraints (ICs) that can be regarded as a good starting point for the final
database design. This initial design must be refined by taking the ICs into
account more fully than is possible with just the ER model constructs, and
also by considering performance criteria and typical workloads.
We concentrate on an important class of constraints called functional
dependencies. Other kinds of ICs, for example multi-valued dependencies and
join dependencies, also provide useful information. They can sometimes reveal
redundancies that cannot be detected using functional dependencies alone.
4.1. Schema Refinement
Redundant storage of information is the root cause of many schema problems.
Although decomposition can eliminate redundancy, it can lead to problems of
its own and should be used with caution.
4.1.1. Guidelines for relation schema
1. Semantics of the attributes: every attribute in a relation must belong to
that relation. As we know, a relation is a collection of attributes with a
coherent meaning; the semantics specify how the attribute values in a tuple
relate to one another.
Example: EMPLOYEE(ename, ssn, bdate, address, dnumber)
Each attribute gives information about an employee.
2. Redundant information in the tuples:
To make the best use of storage space, we disallow redundant information in
tuples. Redundancy leads to update anomalies:
• Insertion anomalies
• Deletion anomalies
• Modification anomalies
3. Reducing null values in tuples:
Null values can waste space at the storage level and may create problems in
understanding the meaning of an attribute, because a null value can have
multiple interpretations:
• The attribute does not apply to this tuple.
• The attribute value is unknown for this tuple.
• The value is known but has not been recorded yet.
4. Spurious tuples:
Spurious tuples are tuples that represent wrong information, for example
tuples generated by joining badly decomposed relations. In worked examples,
spurious tuples are marked by asterisks (*).
Example:
EMP_LOC(ename, plocation)
EMP_PROJ(ssn, pno, hours, pname, plocation)
4.2. Functional Dependencies
A functional dependency between two sets of attributes X and Y, denoted
X → Y, specifies a constraint on the tuples of a relation state r: for any two
tuples t1 and t2 in r, if
t1[X] = t2[X]
we must also have t1[Y] = t2[Y].
This means that the values of the Y component of a tuple depend on, or are
determined by, the values of the X component.
X is called the left-hand side of the FD, and
Y is called the right-hand side of the FD.
X functionally determines Y in a relation schema R if and only if, whenever
two tuples of r(R) agree on their X values, they also agree on their Y values.
1. If X is a candidate key, then X → Y for any set of attributes Y of R,
because the key constraint implies that no two tuples can have the same
value of X.
2. If X → Y holds in R, this does not say whether or not Y → X holds in R.
4.2.1. Inference rules for Functional Dependencies
The set of functional dependencies specified on a relational schema R is
denoted by F. It is impractical to specify explicitly all the functional
dependencies that may hold. The set of all dependencies that can be inferred
from F is called the closure of F and is denoted by F+.
F = {ssn → {ename, bdate, address, dnumber},
dnumber → {dname, dmgrssn}}
Some dependencies inferred from F:
ssn → {dname, dmgrssn},
ssn → ssn,
dnumber → dname
The dependencies in F+ are said to be inferred from F. To infer them in a
systematic way, we use inference rules. The notation F ⊨ X → Y denotes that
the functional dependency X → Y is inferred from the set F of FDs.
4.2.2. Axioms to check if FD holds
4.2.3. An Algorithm to Compute Attribute Closure X+ with respect to F
Let X be a subset of the attributes of a relation R and F be the set of
functional dependencies that hold for R.
1. Create a hypergraph in which the nodes are the attributes of the relation
in question.
2. Create hyperedges for all functional dependencies in F.
3. Mark all attributes belonging to X.
4. Recursively continue marking unmarked attributes of the hypergraph that
can be reached by a hyperedge all of whose incoming nodes are marked.
Result: X+ is the set of attributes that have been marked by this process.
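The marking process above can be sketched directly in Python; the FDs used here are illustrative, not from the notes.

```python
# Sketch of the attribute-closure algorithm: iteratively mark attributes
# reachable from X via the FDs in F until no new attribute can be added.
def closure(x, fds):
    """x: set of attributes; fds: list of (lhs, rhs) frozenset pairs."""
    marked = set(x)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # an FD "fires" once all of its left-hand side is marked
            if lhs <= marked and not rhs <= marked:
                marked |= rhs
                changed = True
    return marked

F = [(frozenset("A"), frozenset("B")),    # A -> B
     (frozenset("B"), frozenset("C")),    # B -> C
     (frozenset("CD"), frozenset("E"))]   # CD -> E

print(sorted(closure({"A", "D"}, F)))  # ['A', 'B', 'C', 'D', 'E']
```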
4.2.3.1. Hyper graph for F
4.3. NORMALIZATION
4.3.1. Basics of normal forms
A set of functional dependencies is specified for each relation, and
normalization proceeds in a top-down fashion, decomposing relations as
necessary. Initially, Codd (1972) proposed 1NF, 2NF, and 3NF. A stronger
definition of 3NF, called Boyce-Codd normal form, was proposed later by Boyce
and Codd.
All these normal forms are based on the FDs of a relation. Subsequently, 4NF
and 5NF were proposed, based on the concepts of multi-valued dependencies and
join dependencies.
4.3.1.1. 1NF (first normal form)
1NF was defined to disallow multi-valued attributes, composite attributes, and
their combinations. It states that the domain of an attribute must include
only atomic values, and that the value of any attribute in a tuple must be a
single value from the domain of that attribute.
4.3.1.2. 2NF (second normal form)
Second normal form is based on the concept of full functional dependency. An
FD X → Y is a full functional dependency if removal of any attribute A from X
means that the dependency no longer holds; i.e. for any A ∈ X,
(X − {A}) does not functionally determine Y. X → Y is a partial dependency if
the dependency still holds after some attribute is removed from X. A relation
is in 2NF if every non-prime attribute is fully functionally dependent on
every key of the relation.
4.3.1.3. 3NF (third normal form)
It is based on the concept of transitive dependency. An FD X → Y in a relation
schema R is a transitive dependency if there is a set of attributes Z that is
neither a candidate key nor a subset of any key of R, and both of the
following dependencies hold:
X → Z
Z → Y
4.3.1.4. Boyce-Codd normal form (BCNF)
Boyce-Codd normal form (BCNF) is a normal form used in database normalization.
It is a slightly stronger version of the third normal form (3NF). A table is
in Boyce-Codd normal form if and only if, for every one of its non-trivial
functional dependencies X → Y, X is a superkey; that is, X is either a
candidate key or a superset of one.
Only in rare cases does a 3NF table not meet the requirements of BCNF. A 3NF
table which does not have multiple overlapping candidate keys is guaranteed to
be in BCNF. Depending on what its functional dependencies are, a 3NF table
with two or more overlapping candidate keys may or may not be in BCNF.
An example of a 3NF table that does not meet BCNF is:
Today's Court Bookings
Court Start Time End Time Rate Type
1 09:30 10:30 SAVER
1 11:00 12:00 SAVER
1 14:00 15:30 STANDARD
2 10:00 11:30 PREMIUM-B
2 11:30 13:30 PREMIUM-B
2 15:00 16:30 PREMIUM-A
• Each row in the table represents a court booking at a tennis club that has
one hard court (Court 1) and one grass court (Court 2)
• A booking is defined by its Court and the period for which the Court is
reserved
• Additionally, each booking has a Rate Type associated with it. There are
four distinct rate types:
• SAVER, for Court 1 bookings made by members
• STANDARD, for Court 1 bookings made by non-members
• PREMIUM-A, for Court 2 bookings made by members
• PREMIUM-B, for Court 2 bookings made by non-members
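The BCNF test can be sketched in Python using the attribute-closure computation: a schema is in BCNF iff, for every non-trivial FD X → Y, the closure of X covers all attributes (X is a superkey). The attribute names below are abbreviations and the FDs are a simplification of the booking example (Rate Type determines Court, but not the booking times).

```python
# Sketch: BCNF check via attribute closure. A relation is in BCNF iff every
# non-trivial FD's left-hand side is a superkey.
def closure(x, fds):
    marked = set(x)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= marked and not rhs <= marked:
                marked |= rhs
                changed = True
    return marked

def is_bcnf(attrs, fds):
    for lhs, rhs in fds:
        if rhs <= lhs:                      # trivial FD, ignore
            continue
        if closure(lhs, fds) != set(attrs):
            return False                    # lhs is not a superkey
    return True

attrs = {"court", "start", "end", "rate"}
fds = [
    (frozenset({"court", "start"}), frozenset({"end", "rate"})),  # a key
    (frozenset({"rate"}), frozenset({"court"})),  # rate type determines court
]
print(is_bcnf(attrs, fds))  # False: {rate} -> {court} but rate is no superkey
```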
4.3.1.5. Algorithm for relational database
For a database, we consider a universal relation schema R = {A1, A2, ..., An}
that includes all the attributes of the database. The universal relation
assumption states that every attribute name is unique. The database designers
specify a set F of functional dependencies that should hold on the attributes
of R. Using these functional dependencies, the algorithms decompose the
universal relation schema R into a set of relation schemas
D = {R1, R2, ..., Rm}; D is called a decomposition of R and becomes the
relational database schema.
We must make sure that each attribute in R appears in at least one relation
schema Ri in the decomposition, so that no attributes are lost:
R = R1 ∪ R2 ∪ ... ∪ Rm
This is called the attribute preservation condition of the decomposition.
4.3.1.6. Decomposition and dependency preservation
Each functional dependency X → Y specified in F should either appear directly
in one of the relation schemas Ri in the decomposition D or be inferable from
the dependencies that appear in some Ri. This is the dependency preservation
condition.
We want to preserve the dependencies because each dependency in F represents a
constraint on the database; if a dependency is not preserved in the
decomposition, checking it requires joining two or more relations.
Formally, suppose that a relation R is given with a set of functional
dependencies F, and let F+ be the closure of F. A decomposition
D = {R1, R2, ..., Rm} of R is dependency-preserving with respect to F if the
union of the dependencies that hold on the individual Ri is equivalent to F+.
4.3.1.7. Decomposition and lossless (non-additive) joins
Another property a decomposition D should possess is the lossless-join or
non-additive-join property, which ensures that no spurious tuples are
generated when a natural join operation is applied to the relations in the
decomposition. The no-spurious-tuples condition should hold on every legal
relation state, that is, every state that satisfies the functional
dependencies in F.
A decomposition D = {R1, R2, ..., Rm} of R has the lossless (non-additive)
join property with respect to the set of dependencies F on R if, for every
relation state r of R that satisfies F, the natural join * of the projections
of r onto the Ri reproduces r exactly.
The word loss in lossless refers to loss of information, not loss of tuples.
If a decomposition does not have the lossless-join property, we may get
additional, spurious tuples after the join.
4.3.1.8. Multi-valued dependencies and fourth normal forms
In this section we study multi-valued dependencies, which arise as a
consequence of first normal form: 1NF disallows an attribute in a tuple from
having a set of values. For each multi-valued attribute, we must repeat every
value of one attribute with every value of the other attribute to keep the
relation state consistent. This constraint is specified by a multi-valued
dependency.
For example, an employee may work on several projects and have several
dependents, but projects and dependents are independent of each other. To keep
the relation consistent, we must have a separate tuple to represent every
combination of an employee's dependents and the employee's projects. This
constraint is specified as a multi-valued dependency.
4.3.1.8.1. Inference rules for functional and multi-valued dependency
We develop a set of inference rules that includes both FDs and MVDs, so that
both types of constraints can be considered together. Inference rules IR1
through IR8 form a sound and complete set for inferring FDs and MVDs from a
given set of dependencies. In these rules, R = {A1, A2, ..., Am} and X, Y, Z,
W are subsets of R.
4.3.1.8.2. Fourth normal forms
A relation schema R is in 4NF with respect to a set of dependencies F (that
includes FDs and MVDs) if, for every non-trivial MVD X →→ Y in F+, X is a
superkey for R.
4.3.1.9. Loss-less join decomposition
4.3.1.10. Join decomposition and fifth normal form
A join dependency (JD), denoted JD(R1, R2, ..., Rn) and specified on relation
schema R, specifies a constraint on the states r of R. The constraint states
that every legal state r of R should have a lossless join decomposition into
R1, R2, ..., Rn.
A join dependency JD(R1, R2, ..., Rn) specified on relation schema R is a
trivial JD if one of the relation schemas Ri in JD(R1, R2, ..., Rn) is equal
to R. Such a dependency is called trivial because it has the lossless-join
property for any relation state r of R and hence does not specify any
constraint on R.
4.3.1.10.1. Fifth normal forms (Project join normal form)
A relation schema R is in 5NF, or project-join normal form (PJNF), with
respect to a set F of functional, multi-valued, and join dependencies if, for
every non-trivial join dependency JD(R1, R2, ..., Rn) in F+ (i.e. implied by
F), every Ri is a superkey of R.
Example:
4.4. Inclusion dependency
Inclusion dependencies were defined in order to formalize certain
inter-relational constraints.
Example:
Foreign key constraints cannot be specified as FDs or MVDs because they relate
attributes across relations; they can be specified as inclusion dependencies.
An inclusion dependency is thus used to represent a constraint between two
relations.
An inclusion dependency
R.X < S.Y between two sets of attributes,
X of relation R
and Y of relation S,
states that the set of values appearing in X of R must be a subset of the set
of values appearing in Y of S. X of R and Y of S must have the same number of
attributes.
Example:
If X = {A1, A2, ..., An}
and
Y = {B1, B2, ..., Bn},
then Ai corresponds to Bi for 1 <= i <= n.
Inference rules for inclusion dependencies:
1. IDIR1 (reflexive rule): R.X < R.X
2. IDIR2 (attribute correspondence): if
R.X < S.Y,
where
X = {A1, A2, ..., An}
and
Y = {B1, B2, ..., Bn}
and Ai corresponds to Bi, then
R.Ai < S.Bi for 1 <= i <= n
3. IDIR3 (transitive rule):
if R.X < S.Y
and
S.Y < T.Z,
then
R.X < T.Z
All referential integrity constraints can be represented as inclusion
dependencies.
5. TRANSACTION MANAGEMENT
5.1. Transaction Concept
A transaction is a unit of program execution that accesses and possibly
updates various data items. A transaction usually results from the execution
of a user program written in a high-level language, a data manipulation
language, or another programming language.
Example: SQL, COBOL, C, Pascal
A transaction is delimited by statements or system calls of the form begin
transaction and end transaction, and consists of all the operations executed
between the begin and the end. To ensure the integrity of the data, we require
that the database system maintain the following properties.
1. Atomicity: either all operations of the transaction are reflected
properly in the database, or none are.
2. Consistency: execution of a transaction in isolation (i.e. with no
other transaction executing concurrently) preserves the consistency of
the database.
3. Isolation: even though multiple transactions may execute concurrently,
for every pair of transactions Ti and Tj it appears to Ti that either
Tj finished execution before Ti started, or Tj started execution after
Ti finished.
4. Durability: after a transaction completes successfully, the changes it
has made to the database persist, even if there is a system failure.
These properties are known as the ACID properties.
Access to the database is accomplished by the following two operations:
1. read(X): transfers the data item X from the database to a local buffer
belonging to the transaction that executes the read operation.
2. write(X): transfers the data item X from the local buffer of the
transaction back to the database.
Example:
Let Ti be a transaction that transfers $50 from account A to account B.
Ti:
READ(A)
A:=A-50;
WRITE(A)
READ(B)
B:=B+50;
WRITE(B)
Suppose the initial values of A and B are $1000 and $2000, and a system
failure occurs after WRITE(A) but before WRITE(B). Then the account
information is
A=$950
B=$2000
and $50 has been lost.
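The failure scenario above can be simulated directly. This is an illustrative sketch (not from the text): the database is a plain dict, and the "crash" is a raised exception between the two writes.

```python
# A small illustration of why atomicity matters: if the system fails after
# write(A) but before write(B), $50 disappears from the accounts.

db = {"A": 1000, "B": 2000}

def transfer_with_crash(db, amount, crash_after_first_write=False):
    a = db["A"]          # read(A)
    a -= amount
    db["A"] = a          # write(A)
    if crash_after_first_write:
        raise RuntimeError("system failure")  # simulated crash
    b = db["B"]          # read(B)
    b += amount
    db["B"] = b          # write(B)

try:
    transfer_with_crash(db, 50, crash_after_first_write=True)
except RuntimeError:
    pass

print(db)  # {'A': 950, 'B': 2000} -- the sum is now 2950: $50 was lost
```

A recovery component must either redo the missing write(B) or undo write(A) so that atomicity holds.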
5.2. Transaction states
Compensating transaction: the effect of a committed transaction can be
undone only by executing a compensating transaction.
We establish a simple abstract transaction model. A transaction must be in
one of the following states:
Active: the initial state; the transaction stays in this state while
executing.
Partially committed: after the final statement has been executed.
Failed: after the discovery that normal execution can no longer proceed.
Aborted: after the transaction has been rolled back and the database has
been restored to its state prior to the transaction.
Committed: after successful completion.
A transaction enters the failed state after the system determines that the
transaction can no longer proceed with its normal execution.
Example: a hardware or logical error. Such a transaction must be rolled back
and then enters the aborted state, at which point the system has two
options:
1. Restart the transaction: appropriate after a hardware or software error.
2. Kill the transaction: appropriate for an internal logical error that can
be corrected only by rewriting the program, or for bad input.
5.3. Implementation of atomicity and durability
The recovery-management component of a database system implements the
support for atomicity and durability.
Shadow-database scheme: a transaction that wants to update the database
first creates a complete copy of the database. All updates are done on the
new copy, leaving the original copy, called the shadow copy, untouched.
If at any time the transaction has to be aborted, the new copy is simply
deleted; the old copy of the database is unaffected. If the transaction
completes, the operating system is asked to write all pages of the new copy
to disk. In the UNIX operating system the flush command is used for this
purpose. After the flush has completed, db_pointer is updated to point to
the new current copy of the database.
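The shadow-copy idea can be sketched with files. This is a rough illustration under assumptions not in the text: the "database" is a single text file, the update is applied to a temporary copy, and an atomic rename plays the role of switching db_pointer; real systems keep db_pointer as an on-disk pointer instead.

```python
# A sketch of the shadow-copy scheme using files: updates go to a new copy,
# the old copy (the shadow) stays intact, and only a final atomic rename
# makes the new copy current. File names are illustrative.
import os
import tempfile

def shadow_update(path, update):
    with open(path) as f:
        data = f.read()                  # the shadow (old) copy is untouched
    new_data = update(data)
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        f.write(new_data)
        f.flush()
        os.fsync(f.fileno())             # force the new copy to disk (the flush step)
    os.replace(tmp, path)                # atomically repoint to the new copy

with open("db.txt", "w") as f:
    f.write("balance=1000")
shadow_update("db.txt", lambda d: d.replace("1000", "950"))
print(open("db.txt").read())  # balance=950
```

If a crash occurs before the final rename, the original file is still intact, which is exactly the atomicity guarantee of the scheme.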
5.4. Concurrent Execution
A database system must control the interaction among the concurrent transaction
to ensure consistency of the database. In this section, we focus on the concept of
concurrent execution.
Example:
Consider the set of transaction that access and updates the bank account.
Let T1 and T2 be two transactions.
T1:
READ(A)
A:=A-50;
WRITE(A)
READ(B)
B:=B+50;
WRITE(B)
T2:
READ(A)
TEMP:=A*0.1;
A:=A-TEMP;
WRITE(A)
READ(B)
B:=B+TEMP;
WRITE(B)
Suppose the initial values of A and B are $1000 and $2000.
Case 1: if T1 is followed by T2,
A=$855
B=$2145
Case 2: if T2 is followed by T1,
A=$850
B=$2150
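The two serial orders can be checked by running the transactions as small functions. This sketch is illustrative; it uses integer division for the 10% computation so that the arithmetic stays exact, whereas the text multiplies by 0.1.

```python
# Verifying the two serial schedules: T1 transfers $50 from A to B,
# T2 moves 10% of A to B. Both serial orders preserve the sum A+B.

def t1(db):
    db["A"] -= 50
    db["B"] += 50

def t2(db):
    temp = db["A"] // 10          # 10%, integer division keeps values exact
    db["A"] -= temp
    db["B"] += temp

case1 = {"A": 1000, "B": 2000}
t1(case1); t2(case1)              # Case 1: T1 followed by T2
case2 = {"A": 1000, "B": 2000}
t2(case2); t1(case2)              # Case 2: T2 followed by T1

print(case1)  # {'A': 855, 'B': 2145}
print(case2)  # {'A': 850, 'B': 2150}
```

Both cases leave A+B = $3000, matching the text: serial execution always preserves consistency, even though the two orders give different individual balances.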
5.5. Schedules
Execution sequences that show the order in which transaction instructions
execute are called schedules. A schedule in which the instructions belonging
to each single transaction appear together in the execution is called a
serial schedule. If two transactions are running concurrently, the CPU
switches between the two transactions, so their instructions are interleaved
in the schedule.
One such concurrent schedule of T1 and T2 yields the final values A=$855 and
B=$2145, equivalent to the serial schedule T1 followed by T2.
Some schedules, however, leave the database in an inconsistent state.
Consider an interleaving whose final values are A=$900 and B=$2150: the sum
A+B has gained $50.
5.6. Serializability
The database system must control the execution of concurrent transactions
to ensure that the database remains consistent. To do so, we must first
understand which schedules ensure consistency and which do not.
A transaction generally performs two kinds of operations:
I. Read operations
II. Write operations
A transaction performs these operations on the copy of a data item Q
residing in its local buffer. Here we discuss two forms of schedule
equivalence:
I. Conflict serializability
II. View serializability
5.6.1. Conflict Serializability
Consider a schedule S in which two transactions Ti and Tj have consecutive
instructions Ii and Ij respectively (i≠j).
1. If Ii and Ij refer to different data items, then we can swap Ii and Ij
without affecting the result of any instruction in the schedule.
2. If Ii and Ij refer to the same data item Q, then the order of the two
steps may matter. Since we are dealing only with read and write
operations, there are four cases:
a. Ii=READ(Q), Ij=READ(Q): the order does not matter, because the same
value of Q is read by both Ti and Tj.
b. Ii=READ(Q), Ij=WRITE(Q): the order matters.
c. Ii=WRITE(Q), Ij=READ(Q): the order matters.
d. Ii=WRITE(Q), Ij=WRITE(Q): since both instructions are writes, their
order does not affect either Ti or Tj directly, but the value obtained by
the next read(Q) instruction of S is affected.
We say that Ii and Ij conflict if they are operations of different
transactions on the same data item and at least one of them is a write
operation.
A serial schedule is one in which all the instructions of each transaction
execute together. If a schedule S can be transformed into a schedule S' by a
series of swaps of non-conflicting instructions, we say that S and S' are
conflict equivalent.
The concept of conflict equivalence leads to the concept of conflict
serializability: a schedule S is conflict serializable if it is conflict
equivalent to a serial schedule.
Analysis based on operations other than read and write is hard to implement
and computationally expensive, so we consider one less stringent definition
of equivalence next.
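The conflict rule above reduces to a three-part test on a pair of instructions. This is a minimal sketch; the (transaction, operation, item) triple encoding is an assumption of the example, not notation from the text.

```python
# Two instructions conflict iff they belong to different transactions,
# refer to the same data item, and at least one of them is a write.

def conflicts(i, j):
    """Each instruction is a (transaction, operation, item) triple,
    with operation 'R' for read or 'W' for write."""
    ti, op_i, q_i = i
    tj, op_j, q_j = j
    return ti != tj and q_i == q_j and ("W" in (op_i, op_j))

print(conflicts(("T1", "R", "Q"), ("T2", "R", "Q")))  # False: two reads
print(conflicts(("T1", "R", "Q"), ("T2", "W", "Q")))  # True: read vs write
print(conflicts(("T1", "W", "Q"), ("T2", "W", "P")))  # False: different items
```

Non-conflicting adjacent instructions are exactly the ones that may be swapped when testing conflict equivalence.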
5.6.2. View Serializability
View serializability is similar to conflict serializability and is likewise
based only on the read and write operations of transactions. Consider two
schedules S and S' in which the same set of transactions participates. S and
S' are said to be view equivalent if they satisfy the following three
conditions:
1. For each data item Q, if transaction Ti reads the initial value of Q in
schedule S, then Ti must also read the initial value of Q in schedule S'.
2. For each data item Q, if transaction Ti executes read(Q) in schedule S
and the value it reads was produced by transaction Tj, then Ti must also
read the value of Q produced by Tj in schedule S'.
3. For each data item Q, the transaction that performs the final write(Q)
operation in schedule S must also perform the final write(Q) operation in
schedule S'.
A schedule S is view serializable if it is view equivalent to a serial
schedule.
5.7. Recoverability
So far we have discussed which schedules ensure the consistency of the
database, assuming that no transaction fails. We now address the effect of
transaction failures during concurrent execution.
If a transaction Ti fails, for whatever reason, we need to undo the effects
of Ti to ensure the atomicity property. In a system that allows concurrent
execution, any transaction Tj that is dependent on Ti (i.e. Tj has read a
data item written by Ti) must also be aborted. To make this possible, we
need to place some restrictions on the schedules.
5.7.1. Recoverable schedule
Most database systems require that all schedules be recoverable. A
recoverable schedule is one where, for each pair of transactions Ti and Tj
such that Tj reads a data item previously written by Ti, the commit
operation of Ti appears before the commit operation of Tj.
5.7.2. Cascadeless schedule
Consider an example in which T10 writes a value that is read by T11, and T11
in turn writes a value read by T12. Suppose T10 fails: T10 must be rolled
back, and since T11 is dependent on T10 and T12 on T11, both of these
transactions must be rolled back as well.
The phenomenon in which a single transaction failure leads to a series of
transaction rollbacks is called cascading rollback. It is desirable that
cascading rollbacks not occur in a schedule; such schedules are called
cascadeless schedules. Formally, for every pair of transactions Ti and Tj
such that Tj reads a data item written by Ti, the commit of Ti must appear
before the read operation of Tj. It is easy to verify that every cascadeless
schedule is also recoverable.
5.8. Testing for Serializability
Since every schedule must be serializable, we need a way to determine
whether a given schedule S is serializable. Let S be a schedule. We
construct a directed graph, called the precedence graph, from S:
G = (V, E)
where V is the set of vertices and E is the set of edges.
Vertices: all the transactions participating in the schedule.
Edges: an edge Ti→Tj is added if one of the following conditions holds:
1. Ti executes write(Q) before Tj executes read(Q)
2. Ti executes read(Q) before Tj executes write(Q)
3. Ti executes write(Q) before Tj executes write(Q)
If an edge Ti→Tj exists in the precedence graph, then in any serial schedule
equivalent to S, all the instructions of Ti must execute before the first
instruction of Tj. The schedule S is conflict serializable if and only if
its precedence graph contains no cycle.
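The test can be implemented directly: build the edge set from the schedule using the three conditions, then search for a cycle. This is an illustrative sketch using the same (transaction, op, item) encoding assumed earlier, which is not notation from the text.

```python
# Conflict-serializability test: build the precedence graph from a schedule
# of (transaction, op, item) steps, then look for a cycle with DFS.
# No cycle means the schedule is conflict serializable.

def precedence_edges(schedule):
    edges = set()
    for i, (ti, op_i, q) in enumerate(schedule):
        for tj, op_j, q2 in schedule[i + 1:]:
            # an edge Ti -> Tj for each conflicting pair in schedule order
            if ti != tj and q == q2 and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges

def has_cycle(nodes, edges):
    adj = {n: [b for a, b in edges if a == n] for n in nodes}
    visited, on_stack = set(), set()
    def dfs(n):
        visited.add(n); on_stack.add(n)
        for m in adj[n]:
            if m in on_stack or (m not in visited and dfs(m)):
                return True
        on_stack.discard(n)
        return False
    return any(dfs(n) for n in nodes if n not in visited)

# T1 and T2 interleaved on items A and B so that each precedes the other:
s = [("T1", "R", "A"), ("T2", "W", "A"), ("T2", "R", "B"), ("T1", "W", "B")]
print(has_cycle({"T1", "T2"}, precedence_edges(s)))  # True: not serializable
```

Here the schedule yields both T1→T2 (on A) and T2→T1 (on B), so the graph is cyclic and no equivalent serial order exists.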
5.9. Precedence graph and view serializability
A schedule may fail the precedence-graph test for conflict serializability
and yet be view serializable; this happens when the schedule contains
useless writes (for example, an edge T4→T3 caused by a write whose value is
never read by any transaction).
To test view serializability, we need a scheme for deciding whether an edge
must be inserted in the precedence graph. Consider a schedule S in which Tj
reads a value written by Ti (giving the edge Ti→Tj). In any serial schedule
S' that is view equivalent to S, a transaction Tk that executes write(Q)
must appear either before Ti (Tk→Ti) or after Tj (Tj→Tk); it cannot appear
between Ti and Tj.
To capture this choice, we extend the precedence graph to include labeled
edges; this type of graph is termed a labeled precedence graph.
Rules for inserting labeled edges in the precedence graph:
Consider a schedule S having transactions T1, T2, …, Tn. Let Tb and Tf be
two dummy transactions such that
Tb issues write(Q) for each Q accessed in S, and
Tf issues read(Q) for each Q accessed in S.
We construct a new schedule S' from S by inserting
Tb at the beginning of S and
Tf at the end of S.
We construct the labeled precedence graph for schedule S' as follows:
1. Add an edge Ti→Tj if Tj reads the value of a data item Q written by Ti.
2. Remove all edges incident on useless transactions. A transaction Ti is
useless if there exists no path in the precedence graph from Ti to Tf.
3. For each data item Q such that Tj reads a value of Q written by Ti, and
Tk executes write(Q) with Tk≠Tb, do the following:
a. If Ti=Tb and Tj≠Tf, then insert the edge Tj→Tk.
b. If Ti≠Tb and Tj=Tf, then insert the edge Tk→Ti.
c. If Ti≠Tb and Tj≠Tf, then insert both edges Tk→Ti and Tj→Tk in the
labeled precedence graph, each labeled with p, where p is a unique
integer.
6. CONCURRENCY CONTROL
When several transactions execute concurrently in the database, the
isolation property may no longer be preserved. It is therefore necessary for
the system to control the interaction among concurrent transactions; the
mechanisms for doing so are termed concurrency-control schemes.
6.1. Lock based protocols
One way to ensure serializability is to require that access to data items
be done in a mutually exclusive manner, i.e. while one transaction is
accessing a data item, no other transaction can modify that data item.
The most common way to implement this requirement is to allow a transaction
to access a data item only if it is currently holding a lock on that data
item.
6.1.1. Locks
There are various modes in which a data item may be locked.
Shared mode: if a transaction Ti holds a shared-mode lock (denoted by S) on
the data item Q, then Ti can read Q but cannot write Q.
Exclusive mode: if Ti holds an exclusive-mode lock (denoted by X) on Q, then
Ti can both read and write Q.
Example:
T1: LOCK-X(B)
READ(B)
B:=B+50;
WRITE(B)
UNLOCK(B)
LOCK-X(A)
READ(A)
A:=A-50;
WRITE(A)
UNLOCK(A);
T2: LOCK-S(A)
READ(A)
UNLOCK(A)
LOCK-S(B)
READ(B)
UNLOCK(B)
DISPLAY(A+B);
Suppose the initial amounts are
A=$100
B=$200
Case 1: T1 followed by T2 displays A+B as $300.
Case 2: T2 followed by T1 displays A+B as $300.
Case 3: an interleaved schedule in which T2 reads one account before T1
updates it and the other account afterwards will display A+B as $250, an
inconsistent result.
If, to avoid such interleavings, T1 and T2 hold their locks until the end
of the transaction, a schedule can arise in which each transaction waits
for a lock held by the other. This situation is called deadlock. When a
deadlock occurs, the system must roll back one of the two transactions; the
data items that were locked by that transaction are unlocked and become
available to the other transactions.
6.1.2. Granting of locks
When a transaction requests a lock on a data item in a particular mode, and
no other transaction holds a lock on the same data item in a conflicting
mode, the lock can be granted.
Suppose transaction T2 holds a shared-mode lock on Q and T1 requests an
exclusive-mode lock; T1 has to wait for T2 to release its lock. Meanwhile:
T2→ lock-S(Q) (holds)
T1→ lock-X(Q) (waits)
T3→ lock-S(Q) (requests, granted)
T4→ lock-S(Q) (requests, granted)
Since the shared locks of T3 and T4 are compatible with T2's lock, they may
be granted while T1 is still waiting. This situation, in which a particular
transaction waits indefinitely for a lock on a data item, is called
starvation.
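The grant rule can be sketched as a compatibility check over the two lock modes. This is a minimal sketch; the table and function names are illustrative, and it deliberately ignores the starvation problem described above (the fairness conditions of the next subsection would be layered on top).

```python
# Lock-grant rule: a requested mode can be granted only if it is compatible
# with every lock currently held on the item by other transactions.
# Shared (S) is compatible only with shared; exclusive (X) with nothing.

COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

def can_grant(requested_mode, held_modes):
    return all(COMPATIBLE[(held, requested_mode)] for held in held_modes)

print(can_grant("S", ["S", "S"]))  # True: many readers may coexist
print(can_grant("X", ["S"]))       # False: the writer must wait for readers
print(can_grant("S", ["X"]))       # False: readers must wait for the writer
```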
6.1.3. Avoiding starvation of transactions by granting locks
When a transaction Ti requests a lock on data item Q in a particular mode M,
the lock is granted provided that:
1. There is no other transaction holding a lock on Q in a mode that
conflicts with M.
2. There is no other transaction that is waiting for a lock on Q and that
made its lock request before Ti.
6.2. Two phase locking protocol
One protocol that ensures serializability is the two-phase locking protocol.
This protocol requires that each transaction issue its lock and unlock
requests in two phases:
1. Growing phase: a transaction may obtain locks but may not release any
lock.
2. Shrinking phase: a transaction may release locks but may not obtain any
new lock.
The point at which a transaction has obtained its final lock is called the
lock point of the transaction.
Cascading rollbacks can be avoided by a modification of two-phase locking
called the strict two-phase locking protocol. This protocol requires that
all exclusive locks taken by a transaction be held until that transaction
commits. This requirement ensures that any data item written by an
uncommitted transaction remains locked in exclusive mode until the
transaction commits, preventing any other transaction from reading it.
Another variant is the rigorous two-phase locking protocol, which requires
all locks, shared and exclusive, to be held until the transaction commits.
It can easily be verified that under these protocols transactions can be
serialized.
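The two-phase rule itself is a simple state machine per transaction, sketched below. This is an illustrative model (class and method names are hypothetical, and no actual mutual exclusion is implemented); it only enforces the rule that no lock may be acquired after the first unlock.

```python
# A non-threaded sketch of the two-phase rule: once a transaction releases
# any lock it enters its shrinking phase, and any further lock request is
# rejected as a protocol violation.

class TwoPhaseTxn:
    def __init__(self):
        self.held = set()
        self.shrinking = False

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock after unlock")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True          # the lock point has passed
        self.held.discard(item)

t = TwoPhaseTxn()
t.lock("A"); t.lock("B")               # growing phase
t.unlock("A")                          # shrinking phase begins
try:
    t.lock("C")                        # illegal under 2PL
except RuntimeError as e:
    print(e)                           # 2PL violation: lock after unlock
```

Strict 2PL would additionally refuse to run unlock on exclusive locks until commit; rigorous 2PL would refuse it for all locks.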
6.3. Graph based protocol
If we wish to develop protocols that are not two-phase, we need additional
information on how each transaction will access the database. In this model
we have prior knowledge of the order in which database items will be
accessed.
To use this knowledge we impose a partial ordering on the set
D = {d1, d2, …, dn} of all data items: if di→dj, then any transaction
accessing both di and dj must access di before accessing dj. This ordering
can be depicted as a directed acyclic graph, called a database graph. Here
we restrict attention to the tree protocol and employ only exclusive locks.
In the tree protocol, the only lock instruction allowed is lock-X. Each
transaction Ti can lock a data item at most once and must follow these
rules:
a. The first lock by Ti may be on any data item.
b. Subsequently, a data item Q can be locked by Ti only if the parent
of Q is currently locked by Ti.
c. Data items may be unlocked at any time.
d. A data item that has been locked and unlocked by Ti cannot
subsequently be relocked by Ti.
Advantages:
1. Unlocking may occur earlier, which leads to shorter waiting times and
increased concurrency.
2. The protocol is deadlock free, so no rollbacks are required.
Disadvantages:
1. A transaction may have to lock data items it does not access, resulting
in increased locking overhead.
2. Additional waiting time.
3. Potential decrease in concurrency.
6.4. Time-stamp based protocol
In this type of protocol, the ordering between every pair of conflicting
transactions is determined by timestamps assigned at execution time.
Time-stamp:
With each transaction Ti in the system, we associate a unique fixed
timestamp, denoted TS(Ti). This timestamp is assigned by the database system
before transaction Ti starts execution. If a transaction Ti has been
assigned timestamp TS(Ti), and a new transaction Tj enters the system
afterwards, then TS(Ti) < TS(Tj). There are two simple methods for
implementing this scheme:
1. Use the value of the system clock as the timestamp; a transaction's
timestamp is equal to the value of the clock when the transaction enters
the system.
2. Use a logical counter that is incremented each time a new timestamp is
assigned; a transaction's timestamp is equal to the value of the counter.
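The logical-counter method can be sketched in a few lines. This is an illustrative sketch; the function name is hypothetical, and a real system would persist and synchronize the counter.

```python
# The logical-counter scheme: each new transaction receives the next
# counter value, so timestamps reflect the order in which transactions
# enter the system.
import itertools

_counter = itertools.count(1)

def new_timestamp():
    return next(_counter)

ts_t1 = new_timestamp()   # T1 enters the system first
ts_t2 = new_timestamp()   # T2 enters later
print(ts_t1 < ts_t2)      # True: TS(T1) < TS(T2)
```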
6.5. Validation based protocol
In cases where the majority of the transactions are read-only, the rate of
conflicts among transactions may be low, but we do not know in advance which
transactions will be involved in a conflict. To gain that knowledge we need
a scheme for monitoring the system. We assume that each transaction Ti
executes in the following phases:
1. Read phase: the execution of transaction Ti takes place; the values of
the various data items are read and stored in variables local to Ti. All
write operations are performed on temporary local variables, without
updating the actual database.
2. Validation phase: transaction Ti performs a validation test to
determine whether the temporary local variables that hold the results of
its write operations can be copied to the database without causing a
violation of serializability.
3. Write phase: if transaction Ti succeeds in validation (step 2), the
actual updates are applied to the database; otherwise Ti is rolled back.
To perform the validation test, we associate three timestamps with each
transaction Ti:
a. Start(Ti): the time when Ti started its execution.
b. Validation(Ti): the time when Ti finished its read phase and started
its validation phase.
c. Finish(Ti): the time when Ti finished its write phase.
6.6. Recovery system
6.6.1. Failure Classification
There are various types of failure that may occur in a system, each of
which must be dealt with in a different manner. The simplest failures are
those that cause no loss of information in the system; the more difficult
ones do cause loss of information.
Here we consider only the following types of failure:
6.6.1.1. Transaction failure
There are two types of error that may cause a transaction to fail.
Logical error: the transaction can no longer proceed with its normal
execution, owing to conditions such as bad input, data not found, overflow,
or a resource limit being exceeded.
System error: the system has entered an undesirable state (for example,
deadlock), as a result of which the transaction cannot continue with its
normal execution; such a transaction can be re-executed later.
System crash: a bug in the database software or an operating-system failure
causes the loss of the contents of volatile storage.
Disk failure: a disk block loses its contents, for example because of a head
crash. To recover from this type of failure, tape backups are used.
6.6.2. Log based recovery
The most widely used structure for recording database modifications is the
log. The log is a sequence of log records and maintains a record of all the
update activities in the database.
An update log record has the following fields:
Transaction identifier: the unique identifier of the transaction that
performed the write operation.
Data item identifier: the unique identifier of the data item written;
basically, its location on disk.
Old value: the value of the data item prior to the write operation.
New value: the value the data item will have after the write operation.
Log records also exist to record other significant events during transaction
processing:
<Ti, start> transaction Ti has started.
<Ti, Xj, V1, V2> transaction Ti has performed a write on data item Xj,
which had value V1 before the write and will have value V2 after it.
<Ti, commit> transaction Ti has committed.
<Ti, abort> transaction Ti has aborted.
6.7. Deferred Database Modification
In this scheme, writes are deferred until the transaction partially
commits; at that point, the information in the log records associated with
the transaction is used in executing the deferred writes. If the system
crashes before the transaction completes its execution, or if the
transaction aborts, then the information in the log is simply ignored.
Example transactions:
T0:
READ(A)
A:=A-50;
WRITE(A)
READ(B)
B:=B+50;
WRITE(B)
T1:
READ(C)
C:=C-100;
WRITE(C)
6.8. Immediate Database Modification
The immediate-update technique allows database modifications to be output
to the database while the transaction is still in the active state.
Database modifications written by active transactions are called
uncommitted modifications. In the event of a crash or transaction failure,
the system must use the old-value fields of the log records to restore the
modified data items. For T0 and T1 above, the log would contain:
<T0, start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
<T0, commit>
<T1, start>
<T1, C, 700, 600>
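Recovery with immediate modification can be sketched as redo plus undo driven by these log records. This is an illustrative sketch (function name and tuple encoding are assumptions): transactions with both start and commit records are redone using new values; all others are undone using old values. Note the log above ends before T1 commits, so T1 is treated as uncommitted here.

```python
# Recovery sketch under immediate modification: redo committed
# transactions (new values), undo uncommitted ones (old values).
# Records mirror the <Ti, Xj, V1, V2> format of the text.

def recover(log, db):
    committed = {t for op, t, *rest in log if op == "commit"}
    for op, t, *rest in log:                 # redo committed, forward order
        if op == "write" and t in committed:
            item, old, new = rest
            db[item] = new
    for op, t, *rest in reversed(log):       # undo uncommitted, reverse order
        if op == "write" and t not in committed:
            item, old, new = rest
            db[item] = old
    return db

log = [("start", "T0"), ("write", "T0", "A", 1000, 950),
       ("write", "T0", "B", 2000, 2050), ("commit", "T0"),
       ("start", "T1"), ("write", "T1", "C", 700, 600)]  # T1 never commits

print(recover(log, {"A": 1000, "B": 2000, "C": 600}))
# {'A': 950, 'B': 2050, 'C': 700}
```

T0's transfer survives the crash (durability) while T1's partial debit of C is rolled back to 700 (atomicity).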
7. CENTRALIZED AND DISTRIBUTED DATABASE
In the traditional enterprise computing model, an Information Systems
department maintains control of a centralized corporate database system.
Mainframe computers, usually located at corporate headquarters, provide the
required performance levels. Remote sites access the corporate database
through wide-area networks (WANs) using applications provided by the
Information Systems department.
Changes in the corporate environment toward decentralized operations have
prompted organizations to move toward distributed database systems that
complement the new decentralized organization.
Today’s global enterprise may have many local-area networks (LANs) joined with
a WAN, as well as additional data servers and applications on the LANs. Client
applications at the sites need to access data locally through the LAN or remotely
through the WAN. For example, a client in Tokyo might locally access a table
stored on the Tokyo data server or remotely access a table stored on the New
York data server.
Both centralized and distributed database systems must deal with the problems
associated with remote access:
• Network response slows when WAN traffic is heavy. For example, a
mission-critical transaction-processing application may be adversely
affected when a decision-support application requests a large number of
rows.
• A centralized data server can become a bottleneck as a large user
community contends for data server access.
• Data is unavailable when a failure occurs on the network.
7.1. Distributed Database System
A distributed database system is a collection of data that belongs logically to the
same system but is physically spread over the sites of a computer network.
7.2. Some advantages of a DDBMS are as follows:
1. Distributed nature of some database applications: some database
applications are naturally distributed over different sites.
2. Increased reliability and availability: these are two of the most
commonly cited advantages. Reliability is broadly defined as the
probability that the system is up at a particular moment; availability
is the probability that the system is continuously available during a
time interval.
3. Allowing data sharing while maintaining some measure of local
control: it is possible to control the data and software locally at each
site, while still permitting certain data to be accessed by users at
other remote sites through the DDBMS software. This allows the
controlled sharing of data throughout the distributed system.
4. Improved performance: when a large database is distributed over
multiple sites, a smaller database exists at each site. As a result,
local queries and transactions accessing data at a single site have
better performance because of the smaller local database. If all
transactions were submitted to a single centralized database,
performance would be worse.
7.3. Some additional properties:
1. The ability to access remote sites and transmit queries and data among
the various sites via a communication network.
2. The ability to decide on which copy of a replicated data item to access.
3. The ability to maintain the consistency of copies of a replicated data item.
4. The ability to recover from individual site crashes and from new types of
failure such as the failure of the communication links.
7.4. Physical hardware level
The following main factors distinguish a DDBMS from a centralized system:
1. There are multiple computers, called sites or nodes.
2. These sites must be connected by some type of communication network
to transmit data and commands among them.
The sites may be within the same building or group of adjacent buildings,
connected via a local-area network, or they may be geographically
distributed over large distances and connected via a long-haul network.
Local-area networks typically use cables, whereas long-haul networks use
telephone lines or satellites; it is also possible to use a combination of
the two types of network. Networks may have different topologies that
define different direct communication paths among sites.
7.5. Client Server Architecture
The client-server architecture was developed to deal with a computing
environment in which a large number of personal computers, workstations,
file servers, peripherals, and other equipment are connected together via a
network. The idea is to define specialized servers with specific
functionalities.
The interaction between client and server during the processing of an SQL
query might proceed as follows:
1. The client parses a user query and decomposes it into a number of
independent site queries. Each site query is sent to the appropriate
server site.
2. Each server processes its local query and sends the resulting relation
to the client site.
3. The client site combines the results of the subqueries to produce the
result of the originally submitted query. In this approach the SQL
server has also been called a database processor (DP) or back-end
machine, whereas the client has been called an application processor
(AP) or front-end machine.
In a DDBMS, it is customary to divide the software modules into three
levels:
1. The server software is responsible for local data management at a site.
2. The client software is responsible for most of the distribution
functions. It accesses the data distribution information from the DDBMS
catalog and processes all requests that require access to more than one
site.
3. The communication software provides the communication primitives that
are used by the client to transmit commands and data among the various
sites as needed.
7.6. Data fragmentation
If a relation r is fragmented, r is divided into a number of fragments r1,
r2, …, rn. These fragments contain sufficient information to allow
reconstruction of the original relation r. This reconstruction can take
place through the application of either the union operation or a special
type of join operation on the various fragments.
There are three different schemes for fragmenting a relation:
I. Horizontal fragmentation
II. Vertical fragmentation
III. Mixed fragmentation
7.6.1. Horizontal fragmentation
In horizontal fragmentation, the tuples of r are distributed among one or
more fragments: the relation r is partitioned into a number of subsets r1,
r2, …, rn. Each tuple of r must belong to at least one of the fragments so
that the original relation can be reconstructed. Each fragment can be
defined as a selection operation on r.
For reconstruction we use the union operation:
r = r1 U r2 U … U rn
7.6.2. Vertical fragmentation
In vertical fragmentation, the attributes (columns) of r are distributed
among one or more fragments. Vertical fragmentation of r(R) involves
subsets of attributes R1, R2, …, Rn of the schema R such that
R = R1 U R2 U … U Rn
Each fragment of r is defined by a projection operation, and each fragment
must include a key of R so that the original relation can be rebuilt.
For reconstruction we use the natural join operation:
r = r1 ⋈ r2 ⋈ … ⋈ rn
7.6.3. Mixed fragmentation
Mixed fragmentation combines horizontal and vertical fragments. A relation
r is divided into a number of fragments r1, r2, …, rn, each obtained by
applying either the horizontal or the vertical fragmentation scheme to r or
to a fragment of r obtained previously.
7.7. Data Replication
If relation r is replicated, a copy of r is stored at two or more sites.
With full replication, a copy is stored at every site in the system.
Availability: if one site fails, the relation may still be found at another
site, so the system can continue processing.
Increased parallelism: when the majority of accesses to relation r result
only in reading it, several sites can process queries involving r in
parallel, and there is a greater chance that the needed data is found
locally where a transaction is executing.
Increased overhead on update: the system must ensure that all replicas of
relation r are consistent; otherwise erroneous computations may result.
Whenever r is updated, the update must be propagated to all sites
containing replicas.
7.8. Deadlock handling
A system is in a deadlock state if there exists a set of transactions such
that every transaction in the set is waiting for another transaction in the
set.
Suppose there is a set of waiting transactions {T0, T1, …, Tn} such that
T0 is waiting for a data item held by T1, T1 is waiting for a data item
held by T2, and so on, with Tn waiting for a data item held by T0. No
transaction can make progress in this situation.
There are two principal methods for dealing with the deadlock problem:
a. Deadlock prevention: a protocol ensures that the system will never
enter a deadlock state.
b. Deadlock detection and recovery: we allow the system to enter a
deadlock state and then try to detect it and recover.
7.8.1. Deadlock prevention
There are two approaches to deadlock prevention:
Approach 1:
i. Ensure that no cyclic waits can occur,
ii. for example by requiring all locks to be acquired together.
Approach 2:
i. This approach is closer to deadlock recovery:
ii. we roll back transactions instead of letting them wait in a way that
could deadlock, as under the first approach.
7.8.1.1. The first approach
Each transaction locks all the data items it needs before it begins
execution.
Disadvantages:
i. It is often hard to predict, before the transaction begins, which data
items need to be locked.
ii. Data-item utilization will be very low, since many data items may be
locked but unused for a long time.
7.8.1.2. The second approach
This approach prevents deadlock by using preemption and transaction
rollback.
In preemption: when a transaction T2 requests a lock held by T1, the lock
granted to T1 may be preempted by rolling back T1 and granting the lock to
T2.
To control preemption, we assign a unique timestamp to each transaction.
The system uses these timestamps only to decide whether a transaction
should wait or roll back.
Two different deadlock prevention schemes have been proposed:
1. Wait-die: this scheme is a non-preemptive technique. When Ti requests
a data item held by Tj,
Ti is allowed to wait only if
TS(Ti) < TS(Tj), i.e. Ti is older than Tj;
otherwise Ti is rolled back (dies).
2. Wound-wait: a preemptive technique, and a counterpart to the wait-die
scheme. When Ti requests a data item held by Tj,
Ti is allowed to wait only if
TS(Ti) > TS(Tj), i.e. Ti is younger than Tj;
otherwise Tj is rolled back (wounded by Ti).
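The two decision rules can be sketched side by side. This is a minimal sketch with illustrative function names and string return values; a real lock manager would trigger the actual wait or rollback.

```python
# Wait-die vs wound-wait: given the timestamps of the requester Ti and the
# holder Tj, decide whether Ti waits or a transaction rolls back.
# Smaller timestamp = older transaction.

def wait_die(ts_i, ts_j):
    # Non-preemptive: an older Ti may wait; a younger Ti dies (rolls back).
    return "wait" if ts_i < ts_j else "rollback Ti"

def wound_wait(ts_i, ts_j):
    # Preemptive: an older Ti wounds Tj (Tj rolls back); a younger Ti waits.
    return "wait" if ts_i > ts_j else "rollback Tj"

print(wait_die(5, 10))    # wait: Ti is older than the holder
print(wait_die(10, 5))    # rollback Ti: Ti is younger, so it dies
print(wound_wait(5, 10))  # rollback Tj: the older Ti preempts the holder
print(wound_wait(10, 5))  # wait: the younger Ti waits
```

In both schemes only the younger of the two transactions is ever rolled back, which is what prevents cyclic waits.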
7.8.1.3. Time-out based scheme
Another simple technique is based on lock timeouts. In this approach, a
transaction that has requested a lock waits for at most a specified amount
of time. If the lock has not been granted within that time, the transaction
is said to time out; it rolls itself back and restarts.
Disadvantages:
i. If the timeout is too long, transactions involved in a deadlock wait
unnecessarily before timing out, leading to wasted resources.
ii. If the timeout is too short, a transaction may roll back even when
there is no deadlock.
iii. Starvation is also a possibility with this scheme.
7.8.2. Deadlock detection and recovery
If a system does not employ a protocol that ensures deadlock freedom, then a
detection and recovery scheme must be used. In such a scheme, the system
periodically determines whether a deadlock has occurred; if one has, the
system must attempt to recover from it.
To do this, the system must:
i. Maintain information about the current allocation of data items to
transactions, as well as any outstanding data-item requests.
ii. Provide an algorithm that uses this information to determine whether the
system has entered a deadlock state.
iii. Recover from the deadlock when one is detected.
7.8.2.1. Deadlock detection
To describe deadlocks, we use a directed graph called a wait-for graph:
G = (V, E)
V → the set of vertices (all the transactions in the system)
E → the set of edges
7.8.2.1.1. Directed graph
An edge Ti → Tj means that Ti is waiting for transaction Tj to release a data
item that it needs. A deadlock exists in the system if and only if the
wait-for graph contains a cycle; each transaction in the cycle is said to be
deadlocked. To detect deadlocks, the system maintains the wait-for graph and
periodically searches it for a cycle.
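The cycle search can be sketched with a depth-first traversal; the graph representation below (a dict mapping each transaction to the transactions it waits for) is an illustrative choice, not something prescribed by the notes:

```python
# Detect deadlock by searching for a cycle in a wait-for graph.

def has_deadlock(wait_for):
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited / on current path / done
    color = {t: WHITE for t in wait_for}

    def dfs(t):
        color[t] = GRAY
        for u in wait_for.get(t, []):
            if color.get(u, WHITE) == GRAY:   # back edge -> cycle -> deadlock
                return True
            if color.get(u, WHITE) == WHITE and dfs(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and dfs(t) for t in wait_for)

print(has_deadlock({"T1": ["T2"], "T2": ["T3"], "T3": []}))      # False
print(has_deadlock({"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}))  # True
```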
7.8.2.2. Recovery from the deadlock
When the system determines that a deadlock exists, it must recover from it.
The most common solution is to roll back one or more transactions to break the
deadlock. The following actions need to be taken:
1. Select a victim: decide which transaction is to be rolled back. Factors
that determine the cost of a rollback include:
a. how long the transaction has been running, and how much longer it needs
to complete its task;
b. how many data items the transaction has used;
c. how many more data items the transaction needs in order to complete;
d. how many transactions will be involved in the rollback.
2. Rollback: once we have decided that a particular transaction must be
rolled back, we must determine how far it should be rolled back. To support
anything short of a total rollback, the system must maintain information
about the state of all running transactions.
3. Starvation: if victims are always chosen on cost alone, the same
transaction may be picked as the victim again and again, so that it never
completes its designated task. This situation is called starvation.
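Victim selection can be sketched as a cost function over the factors listed above. The field names and weights here are illustrative assumptions, not part of any real DBMS:

```python
# Pick as victim the transaction whose rollback is cheapest, weighing the
# factors from the list above (weights are illustrative).

def select_victim(transactions):
    def rollback_cost(t):
        return (t["work_done"]        # computation already performed (a)
                + t["items_used"]     # data items already used (b)
                + 2 * t["cascading"]) # transactions dragged into the rollback (d)
    return min(transactions, key=rollback_cost)["name"]

candidates = [
    {"name": "T1", "work_done": 10, "items_used": 4, "cascading": 1},
    {"name": "T2", "work_done": 2,  "items_used": 1, "cascading": 0},
]
print(select_victim(candidates))  # T2 (least work lost)
```

To avoid the starvation problem in item 3, a real scheme would also fold the number of prior rollbacks into the cost, so a repeatedly victimized transaction eventually becomes too "expensive" to pick again.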
8. SQL (STRUCTURED QUERY LANGUAGE)
SQL (Structured Query Language) is a database sublanguage for querying and
modifying relational databases. It was developed by IBM Research in the
mid-1970s and standardized by ANSI in 1986.
The Relational Model defines two root languages for accessing a relational
database -- Relational Algebra and Relational Calculus. Relational Algebra is a
low-level, operator-oriented language. Creating a query in Relational Algebra
involves combining relational operators using algebraic notation. Relational
Calculus is a high-level, declarative language. Creating a query in Relational
Calculus involves describing what results are desired.
SQL is a version of Relational Calculus. The basic structure in SQL is the
statement. Semicolons separate multiple SQL statements.
8.1. DDL Statements
DDL stands for Data Definition Language. DDL statements are SQL statements
that define or alter a data structure such as a table.
DDL statements are used to define the database structure or schema. Some
examples:
• CREATE - to create objects in the database
• ALTER - to alter the structure of the database
• DROP - to delete objects from the database
• TRUNCATE - to remove all records from a table
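A minimal CREATE example can be run through Python's built-in sqlite3 module; the table name and columns are illustrative:

```python
import sqlite3

# CREATE defines a new table (a DDL statement); the INSERT and SELECT that
# follow are DML statements used only to show that the table exists.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        marks   REAL
    )
""")
conn.execute("INSERT INTO student (roll_no, name, marks) VALUES (1, 'Asha', 91.5)")
rows = conn.execute("SELECT name, marks FROM student").fetchall()
print(rows)  # [('Asha', 91.5)]
conn.close()
```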
  • 3. 5.7.1. Recoverable schedule.................................................................................. 48 5.7.2. Cascade less schedule ................................................................................. 49 5.8. Testing for Serializability ...................................................................................... 49 5.9. Precedence graph ................................................................................................... 50 6. Concurrency control.................................................................................................. 51 6.1. Lock based protocols ............................................................................................. 51 6.1.1. Locks........................................................................................................... 51 6.1.2. Granting of locks......................................................................................... 54 6.1.3. Avoiding starvation of transaction by granting locks................................. 54 6.2. Two phase locking protocol................................................................................... 54 6.3. Graph based protocol............................................................................................. 55 6.4. Time-stamp based protocol.................................................................................... 55 6.5. Validation based protocol ...................................................................................... 56 6.6. Recovery system .................................................................................................... 56 6.6.1. Failure Classification .................................................................................. 56 6.6.2. Log based recovery:.................................................................................... 57 6.7. 
Deferred Database Modification............................................................................ 58 6.8. Immediate Database Modification......................................................................... 58 7. Centralized and Distributed Database....................................................................... 59 7.1. Distributed Database System................................................................................. 59 7.2. Some advantages of the DDBMS are as follows:.................................................. 59 7.3. Some additional properties: ................................................................................... 60 7.4. Physical hardware level ......................................................................................... 60 7.5. Client Server Architecture ..................................................................................... 61 7.6. Data fragmentation................................................................................................. 62 7.6.1. Horizontal fragmentation............................................................................ 62 7.6.2. Vertical fragmentation ................................................................................ 62 7.6.3. Mixed fragmentation................................................................................... 62 7.7. Data Replication..................................................................................................... 62 7.8. Deadlock handling ................................................................................................. 63 7.8.1. Deadlock prevention ................................................................................... 63 7.8.2. Deadlock detection and recovery................................................................ 64 8. SQL (Structured Query Language)........................................................................... 66 8.1. 
DDL Statements..................................................................................................... 66 8.1.1. Implicit commits......................................................................................... 67 8.1.2. Data dictionary............................................................................................ 67 8.2. DML....................................................................................................................... 68 8.3. Language Structure ................................................................................................ 68 8.4. Basic SQL Queries................................................................................................. 68 8.4.1. SQL data statements ................................................................................... 69 8.4.2. SQL-Transaction Statements ...................................................................... 72 8.4.3. SQL-Schema Statements ............................................................................ 72 8.5. Union, Intersect and Except................................................................................... 75 8.5.1. ALL............................................................................................................. 76 8.6. Cursors................................................................................................................... 79 8.6.1. Explicit Cursors .......................................................................................... 79
  • 4. 8.6.2. Implicit Cursors .......................................................................................... 80 8.7. Triggers.................................................................................................................. 81 8.7.1. Creating Triggers ........................................................................................ 81 8.8. Dynamic SQL ........................................................................................................ 82 9. QBE........................................................................................................................... 83 10. Query Processing and Optimization ...................................................................... 83 10.1. Query Processing ................................................................................................. 83 10.2. Query Optimizing ................................................................................................ 85 10.3. Indexes ................................................................................................................. 85 10.4. Selectivities.......................................................................................................... 86 10.5. Uniformity............................................................................................................ 86 10.6. Disjunctive Clauses.............................................................................................. 87 10.7. Join Selectivities .................................................................................................. 88 10.8. Views ................................................................................................................... 89 11. OODBMS .............................................................................................................. 90 11.1. Characteristics of Object-Oriented Database....................................................... 90 11.2. 
Advantage of OODBMS...................................................................................... 91 11.3. Disadvantage of OODBMS ................................................................................. 92 12. ORACLE................................................................................................................ 92 12.1. Storage ................................................................................................................. 92 12.2. Database Schema ................................................................................................. 92 12.3. Memory architecture............................................................................................ 93 12.3.1. Library cache .............................................................................................. 93 12.3.2. Data dictionary cache.................................................................................. 94 12.3.3. Program Global Area .................................................................................. 94 12.4. Configuration....................................................................................................... 95 13. Objective Questions............................................................................................... 95 1. INTRODUCTION A Database Management System (DBMS) is a set of computer programs that controls the creation, maintenance, and the use of the database of an organization and its end users. It allows organizations to place control of organization-wide database development in the hands of database administrators (DBAs) and other specialists. DBMSes may use any of a variety of database models, such as the network model or relational model. In large systems, a DBMS allows users and other software to store and retrieve data in a structured way. It helps to specify the logical organization for a database and access and use the information within a database. 
It provides facilities for controlling data access, enforcing data integrity, managing concurrency control, and restoring databases. The first DBMSs appeared during the 1960s, a time when projects of momentous scale were being contemplated, planned and engineered.
Never before had such large datasets been assembled with this new technology. Problems on the floor were identified, and solutions were researched and developed, often in real time. The DBMS became necessary because the data proved far more volatile than had been planned for, and because the cost of data storage media remained a major limiting factor. Data grew as a collection, and it also needed to be managed at a detailed, transaction-by-transaction level. In the 1980s, all the major vendors of hardware systems large enough to support the evolving computerized record-keeping needs of larger organizations bundled some form of DBMS with their system solutions. The first DBMSs were thus very much vendor-specific. IBM led the field, but a growing number of competitors and clones offered varying entry points onto the bandwagon of computerized record keeping.
1.1. DBMS Definitions
Some of the technical terms of DBMS are defined below:
1.1.1. Database
A database is a logically coherent collection of data with some inherent meaning, representing some aspect of the real world, which is designed, built and populated with data for a specific purpose. Example: consider names, telephone numbers and addresses; you could record this data in an indexed address book. To maintain a database we generally use software such as dBASE IV, MS-Access or Excel.
1.1.2. DBMS
A DBMS is a collection of programs that enables users to create and maintain a database. In other words, it is general-purpose software that provides users with the processes of defining, constructing and manipulating the database for various applications.
1.1.3. Database system
The database and the DBMS software together are called a database system.
1.2. Components of a database
1.2.1. Database administrator (DBA)
In an organization where many persons use the same resources, there is a need for a chief administrator to manage those resources. In a database environment, the primary resource is the database itself, and the secondary resource is the DBMS and its related software. To manage these resources, we need the database administrator. The DBA is responsible for authorizing access to the database and for acquiring software and hardware resources as needed.
1.2.2. Database designer
Database designers are responsible for identifying the data to be stored in the database and for choosing appropriate structures to represent and store this data. The database designer's responsibility is to communicate with the database users and to understand their requirements.
1.2.3. End users
These are the persons whose jobs require access to the database for querying, updating and generating reports. The database generally exists for their use. There are several categories of end users:
A. Casual end users: occasionally access the database, but need different information each time.
B. Parametric end users: make up a sizable portion of database end users. Their main job function involves constantly querying and updating the database, using standard types of queries and updates called canned transactions that have been carefully programmed and tested. Examples are bank tellers checking account balances and processing withdrawals and deposits.
C. Sophisticated end users: include engineers, scientists and business analysts who thoroughly familiarize themselves with the facilities of the DBMS so as to implement applications that meet their complex requirements.
D. Stand-alone end users: maintain personal databases by using ready-made software that provides easy-to-use menu-based or graphical interfaces. Example: tax packages that store a variety of personal financial data for tax purposes.
E. System analysts and application programmers: system analysts determine the requirements of the end users, especially parametric end users, and develop specifications for the canned transactions that meet those requirements. Application programmers implement these specifications as programs, then test, debug, document and maintain the canned transactions. These programmers are also known as software engineers.
1.3. Advantages of DBMS
1. Controlling redundancy
2. Restricting unauthorized access
3. Providing persistent storage for program objects and data structures
4. Database interfacing
5. Providing multiple user interfaces
6. Representing complex relationships among data
7. Enforcing integrity constraints
8. Providing backup and recovery
1.4. Disadvantages of a File Processing System
1. Data redundancy and inconsistency.
2. Difficulty in accessing data.
3. Data isolation.
4. Data integrity.
5. Concurrent access is not possible.
6. Security problems.
2. DATA MODELS
A data model is a set of concepts that can be used to describe the structure of the database. By the structure of the database we mean its data types, relationships and the constraints that should hold for the data. Most data models also include a set of basic operations for specifying modifications on the data.
2.1. Categories of data models
A. High-level or conceptual data models: describe data in terms that users understand. High-level data models use concepts such as entities, attributes and relationships.
• Entity: represents a real-world object, such as an employee or a project, that is stored in the database.
• Attribute: represents some property of interest that further describes an entity, such as the employee's name or salary.
• Relationship: represents an association between two or more entities.
B. Low-level or physical data models: describe how the data is stored in the computer.
C. Representational or implementation data models: hide some of the details of data storage but can be implemented on a computer system in a direct way.
2.2. Schemas and instances
The description of the database is called the database schema. The database schema is specified during database design. The displayed schema is called a schema diagram, and it is not changed frequently. The actual data in the database, however, may change frequently; changes occur every time we add a new student or enter a new grade for a student. The
data in the database at a particular moment in time is called the database state, instance or snapshot.
2.3. DBMS architecture
Three important characteristics of the database approach are:
1. Insulation of programs and data
2. Support of multiple user views
3. Use of a catalog to store the database schema
The architecture of the database system is called the three-schema architecture, comprising:
1. Internal schema
2. Conceptual schema
3. External schema
1. Internal schema: describes the physical storage structure of the database. The internal schema uses a physical data model and describes the complete details of data storage and the access paths for the database.
2. Conceptual schema: describes the structure of the whole database for a community of users. The conceptual schema hides the details of the physical storage structures. A high-level data model or an implementation data model can be used at this level.
3. External schema: describes the part of the database that a particular user group is interested in and hides the rest of the database from that user group.
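The external-schema idea above can be sketched concretely with a database view. The following is a minimal illustration using SQLite through Python's standard sqlite3 module; the STUDENT table and its columns are assumptions for the example, not part of the notes:

```python
import sqlite3

# Conceptual schema: the full STUDENT table, visible to the whole community of users.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (ssn TEXT PRIMARY KEY, name TEXT, gpa REAL, home_phone TEXT)"
)
conn.execute("INSERT INTO student VALUES ('111', 'Smith', 3.8, '555-1234')")

# External schema: a view exposing only the part one user group is interested in,
# hiding the rest of the database (here, gpa and phone) from that group.
conn.execute("CREATE VIEW student_public AS SELECT ssn, name FROM student")

rows = conn.execute("SELECT * FROM student_public").fetchall()
print(rows)  # [('111', 'Smith')] -- only the columns the view exposes
```

The user group that queries student_public never sees the hidden columns, which is exactly the insulation the external level provides.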
2.4. Data independence
The three-schema architecture can be used to explain the concept of data independence, which can be defined as the capacity to change the schema at one level of the database system without having to change the schema at the next higher level. There are two types of data independence:
2.4.1. Logical data independence
This is the capacity to change the conceptual schema without having to change the external schemas or application programs. We may change the conceptual schema to expand the database or to reduce it.
2.4.2. Physical data independence
This is the capacity to change the internal schema without having to change the conceptual or external schemas. Changes to the internal schema may be needed because some physical files have to be reorganized, for example by creating additional access structures to improve the performance of retrievals or updates.
2.5. Classification of database management systems
We can categorize DBMSs by data model as follows:
1. Relational data model
2. Network data model
3. Hierarchical data model
4. Object-oriented data model
2.5.1. Relational data model
The relational data model represents a database as a collection of tables, where each table may be stored as a separate file. Most relational databases have a high-level query language and support a limited form of user views.
2.5.2. Network data model
Represents data as record types, and also represents a limited type of 1:N relationship called a set type.
2.5.3. Hierarchical data model
Represents data as hierarchical tree structures. Each hierarchy represents a number of related records. There is no standard language for the hierarchical model.
2.5.4. Object-oriented data model
Defines a database in terms of objects, their properties and their operations. Objects with the same structure and behavior belong to a class, and classes are organized into hierarchies or acyclic graphs.
2.6. Database languages and interfaces
2.6.1. DBMS languages
The first step is to specify the conceptual and internal schemas for the database, and any mappings between the two. In many DBMSs, where no strict separation of levels is maintained, one language, called the data definition language (DDL), is used by the DBA and the database designers to define both schemas. The DBMS has a DDL compiler whose function is to process DDL statements in order to identify the descriptions of the schema constructs and to store the schema description in the DBMS catalog. Where a clear separation is maintained between the conceptual schema and the internal schema:
A. The DDL is used to specify the conceptual schema only.
B. The SDL (storage definition language) is used to specify the internal schema only.
The mapping between the two levels may be specified in either of the two languages. In some DBMSs, a VDL (view definition language) is used to specify the user views and their mappings to the conceptual schema, but in most DBMSs the DDL is used to specify both the conceptual and external schemas. Once the database schema is
created and the database is filled with data, users need to manipulate the database. Typical manipulations include:
• Retrieval
• Insertion
• Deletion
• Modification
For this purpose the DBMS provides a DML (data manipulation language).
2.6.1.1. DML (data manipulation language)
There are two main types of DMLs:
1. High-level or non-procedural DML (e.g. SQL)
2. Low-level or procedural DML
1. High-level or non-procedural DML: can be used to specify complex database operations. Many DBMSs allow high-level DML statements either to be entered interactively from a terminal or to be embedded in a general-purpose programming language. Embedded DML statements must be identified within the program so that they can be extracted by a precompiler and processed by the DBMS. A high-level DML such as SQL can specify and retrieve many records in a single DML statement; such DMLs are therefore called set-at-a-time or set-oriented DMLs.
2. Low-level or procedural DML: must be embedded in a general-purpose programming language. This type of DML typically retrieves individual records or objects from the database and processes each separately. Hence it needs programming-language constructs, such as loops, to retrieve and process each record from a set of records. Low-level DMLs are also called record-at-a-time DMLs because of this property.
Whenever DML commands, whether high- or low-level, are embedded in a general-purpose programming language, that language is called the host language and the DML is called the data sublanguage. On the other hand, a high-level DML used in a stand-alone interactive manner is called a query language.
2.6.2. DBMS interfaces
User-friendly interfaces provided by a DBMS may include the following:
Menu-based interfaces: these interfaces present the user with lists of options, called menus, which lead the user through the formulation of a request. The query is composed step by step by picking options from the menus displayed by the system.
Forms-based interfaces: a forms-based interface displays a form to each user. Users can fill out all of the form entries to insert new data, or fill in only certain entries. Forms are usually designed and programmed for parametric end users.
Graphical user interfaces: a GUI displays a schema to the user diagrammatically. The user can then specify a query by manipulating the diagram. Most GUIs use a pointing device, such as a mouse, to pick certain parts of the displayed schema.
Natural language interfaces: a natural language interface refers to the words in its schema, as well as to a set of standard words, to interpret the request. If the interpretation is successful, the interface generates a high-level query corresponding to the natural language request and submits it to the DBMS for processing.
Interfaces for parametric users: parametric users, such as bank tellers, often have a small set of operations that they must perform repeatedly. System analysts and programmers design and implement a special interface for parametric users, assigning keys by which each command is run automatically.
Interfaces for the DBA: the DBA staff uses these interfaces. Their commands are for creating accounts, setting system parameters, granting account authorizations, changing a schema and reorganizing the storage structures of a database.
2.7. Database system environment
The database and the DBMS catalog are usually stored on disk. Access to the disk is controlled primarily by the operating system, which schedules disk input/output.
2.7.1. Data manager
This module of the DBMS controls:
A. Access to the DBMS information stored on the disk.
B. The use of basic OS services for carrying out low-level data transfers between the disk and the computer's main storage.
C. The handling of buffers in main memory.
2.7.2. DDL compiler
It processes schema definitions specified in the DDL.
It stores the descriptions of the schemas in the DBMS catalog. The DBMS catalog includes the following information:
• Names of the files
• Data items
• Storage details of each file
• Mapping information
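The catalog described above is itself queryable in most systems. As an illustrative sketch (not from the notes), SQLite keeps its catalog in a table named sqlite_master, which records the name, kind and defining DDL text of every table, index and view:

```python
import sqlite3

# Build a tiny schema, then inspect the catalog the DBMS maintains for it.
# The employee table and index names are assumptions for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT, salary REAL)")
conn.execute("CREATE INDEX emp_name_idx ON employee(name)")

# sqlite_master is SQLite's catalog: one row per schema object,
# including the original DDL text the DDL compiler processed.
catalog = conn.execute("SELECT name, type FROM sqlite_master").fetchall()
for name, kind in catalog:
    print(name, kind)
```

The same idea appears in other systems under different names, for example the data dictionary views in Oracle discussed later in these notes.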
2.7.3. Run-time database processor
It handles database accesses. It receives retrieval or update operations and carries them out against the database. Access to the disk goes through the stored data manager.
2.7.4. Query compiler
Handles high-level queries that are entered interactively, and generates calls to the run-time processor for executing the code.
2.7.5. Pre-compiler
Extracts DML commands from an application program written in a host language. The commands are then sent to the DML compiler for compilation into object code.
2.8. Entity Relationship Model
For designing a successful database application, two terms play a major role:
• Database application
• Application program
Database application: refers to a particular database (e.g. a bank database) and the associated programs that implement its queries and updates. Example: programs that implement database updates corresponding to customers making deposits and withdrawals. These programs provide user-friendly graphical user interfaces (GUIs) utilizing forms, and their design includes the testing of the application programs.
2.8.1. Entities and attributes
Entity: the basic object that the ER model represents is an entity. An entity may be an object with a physical existence, such as a particular person, car, house or employee, or it may be an object with a conceptual existence, such as a company, a job or a university course.
Attribute: a particular property that describes an entity. Example: an employee entity may be described by the employee's name, age, address, salary and job.
Composite attributes: a composite attribute can be divided into subparts, which represent more basic attributes with independent meanings.
Simple or atomic attributes: attributes that are not divisible are called simple or atomic attributes.
Single-valued attributes: most attributes have a single value for a particular entity; such attributes are called single-valued attributes. Example: Age is a single-valued attribute of a person.
Multi-valued attributes: attributes that may have more than one value. Example: the Colors attribute of a car. One car may have a single color, while another car may have multiple colors. Such attributes are called multi-valued attributes.
Derived and stored attributes: in some cases two attribute values are related, for example the age and birth date of a person. The value of Age can be determined from the current date and the value of the person's birth date. The Age attribute is therefore called a derived attribute, and BirthDate is called a stored attribute.
Null values: in some cases a particular entity may not have an applicable value for an attribute, for example an apartment number.
Complex attributes: we represent composite attributes with parentheses ( ), separating the components by commas, and multi-valued attributes with braces { }. Attributes that nest these constructs are called complex attributes. Example: {AddressPhone({Phone(AreaCode, PhoneNumber)})}
2.8.2. Entity types, entity sets, keys and value sets
Entity type: an entity type defines a collection (or set) of entities that have the same attributes. Each entity type in the database is described by its name and attributes.
Entity set: the collection of all entities of a particular entity type in the database at any point in time is called an entity set.
Key attribute: an entity type usually has a key attribute, whose values are distinct for each individual entity in the collection.
Value sets (domains of attributes): each simple attribute of an entity type is associated with a value set (or domain of values), which specifies the set of values that may be assigned to that attribute for each individual entity. Example: the Age of an employee may be specified to lie in the range 16 to 70.
2.8.3. Relationship types, sets and instances
An association among entities is called a relationship. A relationship type R among n entity types E1, E2, ..., En defines a set of associations among entities of those types. In other words, the relationship set R is a set of relationship instances.
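The value-set restriction mentioned above (employee Age in the range 16 to 70) can be declared so that the DBMS itself rejects out-of-range values. A minimal sketch using SQLite, with an assumed employee table:

```python
import sqlite3

# CHECK declares the value set; PRIMARY KEY declares the key attribute,
# whose values must be distinct for each individual entity.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        ssn TEXT PRIMARY KEY,                       -- key attribute
        age INTEGER CHECK (age BETWEEN 16 AND 70)   -- value set (domain)
    )
""")
conn.execute("INSERT INTO employee VALUES ('123', 30)")      # inside the value set

try:
    conn.execute("INSERT INTO employee VALUES ('456', 12)")  # outside: rejected
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)
```

Declaring the domain once in the schema is safer than re-checking it in every application program.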
Degree of a relationship type: the degree of a relationship type is the number of participating entity types. Example: a WORKS_FOR relationship is of degree two. Degree two: binary relationship. Degree three: ternary relationship.
Role names: each entity type that participates in a relationship type plays a particular role in the relationship. The role name signifies the role that a participating entity plays in each relationship instance, and helps to explain what the relationship means.
Recursive relationships: role names are not important where all the participating entity types are distinct, since each entity type name can then serve as the role name. In some cases, however, the same entity type participates more than once in a relationship type, in different roles. In such cases role names become essential for
distinguishing the meaning of each participation. Such relationship types are called recursive relationships. For example, employee and supervisor entities are both members of the same EMPLOYEE entity type.
Weak entity types: entity types that do not have a key attribute of their own are called weak entity types. A weak entity type is sometimes called a child entity type.
Regular/strong entity types: entity types that have a key attribute are called regular or strong entity types. The identifying entity type is also sometimes called the parent or dominant entity type.
2.8.4. Notations for ER diagram
[A table of ER diagram symbols and their meanings appeared here in the original slides; the symbol images did not survive extraction.]
  • 19. 2.8.5. Generalization We can think of the reverse process of abstraction, in which we suppress the differences among several entity types, identify their common features, and generalize them into a single superclass.
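Generalization maps naturally onto class inheritance. A minimal Python sketch, using hypothetical EMPLOYEE and STUDENT entity types whose shared name and SSN attributes are pulled up into a PERSON superclass:

```python
# Generalization sketch: EMPLOYEE and STUDENT are hypothetical entity types;
# their common attributes (name, SSN) are generalized into a PERSON superclass.
from dataclasses import dataclass

@dataclass
class Person:            # the generalized superclass: common features only
    name: str
    ssn: str

@dataclass
class Employee(Person):  # each subclass keeps its specific attributes
    salary: float

@dataclass
class Student(Person):
    gpa: float

people = [Employee("Ann", "111", 50000.0), Student("Bob", "222", 3.4)]
# Every entity of either subclass is also a Person:
print(all(isinstance(p, Person) for p in people))  # True
```

The point of the sketch is only that subclass entities automatically belong to the superclass collection, which is exactly what generalization asserts.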
  • 21. 2.8.6. Aggregation Aggregation is an abstraction concept for building composite objects from their component objects. There are cases where this concept can be used and related to the EER model: • when we aggregate attribute values of an object to form the whole object; • when we represent an aggregation relationship as an ordinary relationship; • when combining objects that are related by a particular relationship instance. 3. RELATIONAL MODEL The relational model represents the database as a collection of relations. A relation can be thought of as a table of values; each row in the table represents a collection of related data values. In the relational model, each row in the table corresponds to an entity or relationship. In relational model terminology, a row is called a tuple, columns are called attributes, and the table is called a relation. The data type describing the type of values that can appear in each column is called a domain. Domain: a domain D is a set of atomic values. Atomic means that each value in the domain is indivisible. Ex: USA_phone_number - a string of 10 digits.
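These terms can be sketched in plain Python, treating a relation as a set of tuples whose attribute values come from domains. The EMPLOYEE schema and the age domain below are illustrative assumptions, not data from a real database:

```python
# Relational-model terms in plain Python: a relation is a set of tuples,
# each attribute drawing its values from a domain. Schema and data are made up.

schema = ("name", "ssn", "age")              # relation schema R(A1, A2, A3)
domains = {"age": range(16, 71)}             # value set (domain) of the Age attribute

employee = {                                 # the relation: a set of tuples
    ("Smith", "123456789", 30),
    ("Wong", "333445555", 45),
}

# Domain constraint check: every Age value must lie in its domain.
for tup in employee:
    row = dict(zip(schema, tup))
    assert row["age"] in domains["age"], f"domain violation: {row}"
print("all tuples satisfy the age domain")
```

Because a relation is modeled as a set, duplicate tuples are impossible and tuple order carries no meaning, which matches the characteristics listed in the next section.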
  • 22. Relation schema: R is denoted as R(A1, A2, A3, ..., An), where R is the relation name and the Ai are the attributes, for i = 1, 2, 3, ..., n. Ex: STUDENT(name, SSN, home phone, address, office phone, age) 3.1. Characteristics of relations 1. Ordering of tuples in a relation: a relation is defined as a set of tuples, so the order of tuples in a relation has no specific meaning. 2. Ordering of values within a tuple: an n-tuple is an ordered list of n values, so the ordering of values within a tuple does matter; attribute values appear in the order given by the schema. 3. Values in the tuples: each value in a tuple is an atomic value, i.e. it is not divisible into components. In the basic relational model, composite and multi-valued attributes are not allowed. 4. Interpretation of a relation: the relation schema can be interpreted as a declaration or a type of assertion. Relational constraints: here we study the restrictions that apply to a database schema. These include: Domain constraints: the value of each attribute must be an atomic value from the attribute's domain. Key constraints: a relation is defined as a set of tuples, and all elements of a set are distinct; hence all tuples in a relation must be distinct. No two tuples can have the same combination of values for all their attributes. Entity integrity constraint: no primary key value can be null, because the primary key is used to identify the individual tuples in a relation. Referential integrity constraint: specified between two relations and used to maintain consistency among tuples of the two relations; it is based on the foreign key concept. 3.2. Operations on the relational model Operations on the relational model can be categorized into retrievals and updates. There are three basic update operations on relations. Insert operation: provides a list of attribute values for a new tuple t that is to be inserted into a relation R.
Delete operation: used to delete a tuple from a relation; if the tuple being deleted is referenced by a foreign key from other tuples in the database, the deletion may be rejected or cascaded. We use a condition to select the tuple to delete. Ex: DELETE FROM employee WHERE SSN = '985676';
  • 23. Update operation: used to change the values of one or more attributes in a tuple of a relation R. Ex: UPDATE employee SET age = 25 WHERE SSN = '576787'; 3.3. Relational algebra operations 1. Select operation: selects the subset of tuples from a relation that satisfy a selection condition, i.e. it selects some of the rows of a relation. 2. Project operation: selects some of the columns (a set of attributes) from a relation. 3. Rename operation: renames either the relation name or the attribute names, or both. RENAME (old table name) TO (new table name) 3.4. Set theoretic operations Several set theoretic operations are used to merge the elements of two sets in various ways. These operations are as follows. 3.4.1. Union The result of this operation, denoted by R U S, is a relation that includes all tuples that are in R, in S, or in both R and S. Duplicate tuples are eliminated. R U S = S U R (commutative operation) SELECT salesman "ID", name FROM sales_master WHERE city = 'mumbai' UNION SELECT client "ID", name
  • 24. FROM client_master WHERE city = 'mumbai'; 3.4.1.1. Restrictions on using a union operation 1. The number of columns in all the queries should be the same. 2. The data types of the corresponding columns in each query must be the same. 3. Union cannot be used in a subquery. 4. Aggregate functions cannot be used with a union clause. 3.4.2. Intersection The result of this operation, denoted by R ∩ S, is a relation that includes all tuples that are in both R and S. SELECT salesman "ID", name FROM sales_master WHERE city = 'mumbai' INTERSECT SELECT client "ID", name FROM client_master WHERE city = 'mumbai'; 3.4.3. Set difference The result of this operation, denoted by R - S, is a relation that includes all tuples that are in R but not in S. SELECT product_no FROM product_master MINUS SELECT product_no FROM sales_order; 3.4.4. Join operation Denoted by ⋈, the join is used to combine related tuples from two relations into a single tuple. This operation is very important because it allows us to process relationships among relations. R ⋈ (join condition) S There are several categories of join operations. 1. Cartesian product (cross product, or cross join): the main difference between the Cartesian product and the join is that in a join, only the combinations of tuples satisfying the join condition appear in the result.
  • 25. 2. Equi join: a join in which the only comparison operator used is = is called an equi join. In the result, each pair of attributes with identical values is superfluous; removal of one attribute from each such pair gives the natural join R * S. 3.4.5. Division operation The division operation is used for a special kind of query that sometimes occurs in database applications. 3.4.6. Aggregate functions These operate on collections of values from the database: SUM, AVERAGE, MAX, MIN. 3.4.7. COUNT This function is used to count tuples or attribute values. 3.4.8. Grouping This is used to group the tuples of a relation by one or more attributes. SELECT company, SUM(amount) FROM sales GROUP BY company HAVING SUM(amount) > 10000; 3.4.9. Recursive closure operation This operation is applied to a recursive relationship.
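The GROUP BY / HAVING query above can be tried with Python's built-in sqlite3 module; the sales table and its rows are made-up sample data:

```python
# Grouping with an aggregate filter, run in an in-memory SQLite database.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (company TEXT, amount INTEGER)")
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [("Acme", 8000), ("Acme", 5000), ("Beta", 4000)])

# Keep only companies whose total sales exceed 10,000.
rows = con.execute("""
    SELECT company, SUM(amount) FROM sales
    GROUP BY company
    HAVING SUM(amount) > 10000
""").fetchall()
print(rows)  # [('Acme', 13000)]
```

Note that HAVING filters groups after aggregation, whereas WHERE would filter individual rows before grouping.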
  • 26. 3.4.10. Outer join The natural join is denoted by R * S, where R and S are relations. Only tuples from R that have a matching tuple in S appear in the result; tuples without a match are eliminated, as are tuples with null values in the join attributes. A set of operations, called outer joins, can be used when we want to keep all the tuples of R, of S, or of both relations, whether or not they have matching tuples. Left outer join: R ⟕ S keeps every tuple of R; if no match is found in S, the S attributes are filled with null values. Right outer join: R ⟖ S keeps every tuple of S. Full outer join: R ⟗ S keeps every tuple of both; if no match is found, null values are set in the tuple. Outer union: the outer union is used to take the union of the tuples of two relations that are not union compatible but are partially compatible, i.e. only some of their attributes are union compatible; these shared attributes must include a key of both relations. Ex: STUDENT(name, SSN, department, advisor) and FACULTY(name, SSN, department, rank) give the result (name, SSN, department, advisor, rank); all the tuples of both relations appear in the result. 3.5. Tuple relational calculus Relational calculus is a formal query language: we write one declarative expression to specify a retrieval request, and hence there is no description of how to evaluate the query. The tuple relational calculus is based on specifying a number of tuple variables. A tuple variable may take as its value any individual tuple from a relation. A simple tuple relational calculus query is of the form {t | COND(t)}
  • 27. The result is the set of all tuples t that satisfy COND(t). Ex: find all employees whose salary > 50,000. {t | EMPLOYEE(t) AND t.salary > 50000} This notation resembles how attribute names are qualified with relation names in SQL: {t.fname, t.lname | EMPLOYEE(t) AND t.salary > 50000} SELECT t.fname, t.lname FROM employee AS t WHERE t.salary > 50000; 3.5.1. Expressions and formulas in tuple calculus A general expression of the tuple relational calculus is of the form {t1.A1, t2.A2, ..., tn.An | COND(t1, t2, ..., tn)} where t1, t2, ..., tn are tuple variables, each Ai is an attribute of the relation on which ti ranges, and COND is a condition or formula. Formula: a formula is made up of predicate calculus atoms, which can be one of the following. 1. An atom of the form R(ti), where R is a relation name and ti is a tuple variable; R(ti) identifies the range of the tuple variable ti as the relation whose name is R. 2. An atom of the form ti.A op tj.B, where op is a comparison operator in the set {=, <, <=, >, >=, ≠}, ti and tj are tuple variables, A is an attribute of the relation on which ti ranges, and B is an attribute of the relation on which tj ranges. 3. An atom of the form ti.A op c or c op tj.B, where op is a comparison operator, ti and tj are tuple variables, A and B are attributes of the relations on which ti and tj range, and c is a constant value. A formula is made up of one or more atoms connected via the logical operators
  • 28. AND, OR and NOT, and is defined as follows: 1. Every atom is a formula. 2. If F1 and F2 are formulas, then so are (F1 AND F2), (F1 OR F2), NOT(F1) and NOT(F2). 3. The truth values of these formulas are derived from their component formulas F1 and F2 as follows: a. (F1 AND F2) is true if both F1 and F2 are true. b. (F1 OR F2) is false if both F1 and F2 are false; otherwise it is true. c. NOT(F1) is true if F1 is false, and false if F1 is true. d. NOT(F2) is true if F2 is false, and false if F2 is true. 3.5.2. Existential and universal quantifiers Two special symbols called quantifiers can appear in formulas: 1. the universal quantifier (∀); 2. the existential quantifier (∃). First we need to define the concepts of free and bound tuple variables in formulas. Bound: a tuple variable t is bound if it is quantified, meaning that it appears in an (∃t) or (∀t) clause. Free: otherwise it is free. We can define the occurrences of a tuple variable in a formula as free or bound according to the following rules: 1. An occurrence of a tuple variable in a formula F that is an atom is free in F. 2. An occurrence of a tuple variable t is free or bound in a formula made up of logical connectives, such as (F1 AND F2), (F1 OR F2), NOT(F1) and NOT(F2), depending on whether it is free or bound in F1 and F2; a tuple variable may be free in F1 but bound in F2, or vice versa. 3. All free occurrences of a tuple variable t in F are bound in a formula F' of the form F' = (∃t)(F) or F' = (∀t)(F); the tuple variable is bound to the quantifier specified in F'. Ex: F1: d.dname = 'research' F2: (∃t)(d.dno = t.dno) F3: (∀d)(d.mgrssn = '12345677') The tuple variable d is free in both F1 and F2, while it is bound to the universal quantifier in F3; t is bound to the existential quantifier in F2. 3.5.3. Rules for the definition of a formula
  • 29. 1. If F is a formula, then so is (∃t)(F), where t is a tuple variable. The formula (∃t)(F) is true if F evaluates to true for some (at least one) tuple assigned to the free occurrences of t in F; otherwise (∃t)(F) is false. 2. If F is a formula, then so is (∀t)(F), where t is a tuple variable. The formula (∀t)(F) is true if F evaluates to true for every tuple (in the universe) assigned to the free occurrences of t in F; otherwise (∀t)(F) is false. Note: ∃ is called the existential quantifier because (∃t)(F) is true if there exists some tuple that makes F true, while ∀ is called the universal quantifier because (∀t)(F) is true only if F is true for every possible tuple. 3.6. Transforming the universal and existential quantifiers We now use some transformations from mathematical logic that relate the universal and existential quantifiers: it is possible to transform a universal quantifier into an existential quantifier and vice versa. 3.6.1. Domain relational calculus There is another type of relational calculus called the domain relational calculus, or simply domain calculus. The QBE language is related to the domain calculus; the formal specification of the domain calculus was proposed after the development of the QBE language. The domain calculus differs from the tuple calculus in the type of variables used in formulas: the variables range over single values from domains of attributes rather than over tuples. An expression of the domain calculus is of the form {x1, x2, ..., xn | COND(x1, x2, ..., xn, xn+1, ..., xn+m)} where
  • 30. x1, x2, ..., xn, xn+1, ..., xn+m are domain variables that range over domains of attributes, and COND is the condition or formula of the domain relational calculus. A formula is made up of atoms; an atom can be one of the following. 1. An atom of the form R(x1, x2, ..., xj), where R is the name of a relation of degree j and each xi, 1 <= i <= j, is a domain variable. 2. An atom of the form xi op xj, where op is a comparison operator in the set {=, <, <=, >, >=, ≠}. 3. An atom of the form xi op c or c op xj, where op is a comparison operator, xi and xj are domain variables, and c is a constant value. 4. DATABASE DESIGN Conceptual database design gives us a set of relational schemas and integrity constraints (ICs) that can be regarded as a good starting point for the final database design. This initial design must be refined by taking the ICs into account more fully than is possible with just the ER model constructs, and also by considering performance criteria and typical workloads. We concentrate on an important class of constraints called functional dependencies. Other kinds of ICs, for example multi-valued dependencies and join dependencies, also provide useful information; they can sometimes reveal redundancies that cannot be detected using functional dependencies alone. 4.1. Schema Refinement Redundant storage of information is the root cause of many problems. Although decomposition can eliminate redundancy, it can lead to problems of its own and should be used with caution. 4.1.1. Guidelines for relation schemas 1. Semantics of the attributes: every attribute in a relation must belong to the relation; a relation is a collection of attributes that together have a meaning. Semantics means how the attribute values in a tuple relate to one another. Example: EMPLOYEE(ename, ssn, bdate, address, dnumber), where each attribute gives information about an employee. 2. Redundant information in tuples: to make the best use of storage space and avoid update problems, we disallow redundancy in relations. Redundancy leads to update anomalies:
  • 31. • Insertion anomalies • Deletion anomalies • Modification anomalies 3. Reducing null values in tuples: nulls can waste space at the storage level and may create problems in understanding the meaning of the attributes, because null values can have multiple interpretations: • the attribute does not apply to this tuple; • the attribute value is unknown for this tuple; • the value is known but has not been recorded yet. 4. Spurious tuples: spurious tuples are tuples that give wrong information when relations are joined; in examples they are marked by asterisks (*). Example: EMP_LOC(ename, plocation), EMP_PROJ(ssn, pno, hours, pname, plocation) 4.2. Functional Dependencies A functional dependency, denoted by X → Y, between two sets of attributes X and Y states that for any two tuples t1 and t2 in r, if t1[X] = t2[X], then we must also have t1[Y] = t2[Y]. This means that the values of the Y component of a tuple depend on, or are determined by, the values of the X component, but not necessarily vice versa. X is called the left-hand side of the FD and Y the right-hand side. X functionally determines Y in a relation R if and only if, whenever two tuples of r(R) agree on their X values, they also agree on their Y values. 1. If X is a candidate key, then X → Y holds for any attribute set Y of R, because the key constraint implies that no two tuples will have the same value of X. 2. If X → Y holds in R, this does not say whether or not Y → X holds in R.
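The definition above translates directly into a check over a relation instance: whenever two tuples agree on X, they must agree on Y. A small Python sketch, with an illustrative schema and data:

```python
# Check whether an FD X -> Y holds in a relation instance r:
# if two tuples share the same X values but differ on Y, the FD fails.

def fd_holds(r, schema, X, Y):
    """Return True if X -> Y holds in relation instance r."""
    idx = {a: i for i, a in enumerate(schema)}
    seen = {}
    for t in r:
        xv = tuple(t[idx[a]] for a in X)
        yv = tuple(t[idx[a]] for a in Y)
        if xv in seen and seen[xv] != yv:
            return False          # same X value, different Y values
        seen[xv] = yv
    return True

schema = ("ssn", "ename", "dnumber")
r = [("1", "Smith", 5), ("2", "Wong", 5), ("3", "Zelaya", 4)]
print(fd_holds(r, schema, ["ssn"], ["ename"]))      # True
print(fd_holds(r, schema, ["dnumber"], ["ename"]))  # False: dept 5 has two names
```

Keep in mind that an FD is a property of the schema (all legal states), so a check over one instance can only refute an FD, never prove it.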
  • 32. 4.2.1. Inference rules for Functional Dependencies The set of functional dependencies specified on a relational schema R is denoted by F. It is impractical to list every functional dependency that may hold. The set of all dependencies that can be inferred from F is called the closure of F and is denoted by F+. Ex: F = {ssn → {ename, bdate, address, dnumber}, dnumber → {dname, dmgrssn}} Inferred dependencies include ssn → {dname, dmgrssn}, ssn → ssn, and dnumber → dname. To infer dependencies in a systematic way, we use inference rules; F ⊨ X → Y denotes that the functional dependency X → Y is inferred from the set of functional dependencies F. 4.2.2. Axioms to check if an FD holds (Armstrong's axioms: reflexivity, if Y ⊆ X then X → Y; augmentation, if X → Y then XZ → YZ; transitivity, if X → Y and Y → Z then X → Z.)
  • 33. 4.2.3. An Algorithm to Compute the Attribute Closure X+ with respect to F Let X be a subset of the attributes of a relation R and F be the set of functional dependencies that hold for R. 1. Create a hypergraph in which the nodes are the attributes of the relation in question. 2. Create hyperedges for all functional dependencies in F. 3. Mark all attributes belonging to X. 4. Recursively continue marking unmarked attributes of the hypergraph that can be reached by a hyperedge all of whose incoming nodes are marked. Result: X+ is the set of attributes that have been marked by this process. 4.2.3.1. Hypergraph for F (figure) 4.3. NORMALIZATION 4.3.1. Basics of normal forms Normalization takes a relation schema together with the set of functional dependencies specified for it and proceeds in a top-down fashion, decomposing relations as necessary. Codd (1972) initially proposed 1NF, 2NF and 3NF. A stronger definition of 3NF is Boyce-Codd normal form, proposed by Boyce and Codd. All these normal forms are based on the FDs of a relation. Later, 4NF and 5NF were proposed, based on the concepts of multi-valued dependencies and join dependencies. 4.3.1.1. 1NF (first normal form) 1NF was defined to disallow multi-valued and composite attributes and their combinations. It states that the domain of an attribute must include only atomic
  • 34. values: the value of any attribute in a tuple must be a single value from the domain of that attribute. 4.3.1.2. 2NF (second normal form) Second normal form is based on the concept of full functional dependency. An FD X → Y is a full functional dependency if removal of any attribute A from X means that the dependency no longer holds, i.e. for any A ∈ X, (X - {A}) does not functionally determine Y. X → Y is a partial dependency if the dependency still holds after some attribute is removed from X.
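The attribute-closure computation of section 4.2.3 can be sketched without the hypergraph machinery: repeatedly absorb the right-hand side of any FD whose left-hand side is already contained in the closure. The FDs below are the sample set from section 4.2.1:

```python
# Attribute closure X+ under a set of FDs given as (lhs, rhs) pairs of sets.

def closure(X, fds):
    """Compute X+ with respect to the given functional dependencies."""
    result = set(X)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs        # lhs is covered, so rhs is determined too
                changed = True
    return result

fds = [({"ssn"}, {"ename", "bdate", "address", "dnumber"}),
       ({"dnumber"}, {"dname", "dmgrssn"})]
print(sorted(closure({"ssn"}, fds)))
# ssn determines every attribute, so ssn is a key of this universal relation
```

This loop is the standard closure algorithm; testing whether X is a superkey amounts to checking whether X+ contains all attributes of R.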
  • 35. 4.3.1.3. 3NF (third normal form) 3NF is based on the concept of transitive dependency. An FD X → Y in a relation schema R is a transitive dependency if there is a set of attributes Z that is neither a candidate key nor a subset of any key of R, and both X → Z and Z → Y hold. 4.3.1.4. Boyce-Codd normal form (BCNF) Boyce-Codd normal form (or BCNF) is a normal form used in database normalization. It is a slightly stronger version of the third normal form (3NF). A table is in Boyce-Codd normal form if and only if, for every one of its non-trivial functional dependencies X → Y, X is a superkey, that is, X is either a candidate key or a superset thereof.
  • 36. Only in rare cases does a 3NF table not meet the requirements of BCNF. A 3NF table which does not have multiple overlapping candidate keys is guaranteed to be in BCNF. Depending on what its functional dependencies are, a 3NF table with two or more overlapping candidate keys may or may not be in BCNF. An example of a 3NF table that does not meet BCNF is:

Today's Court Bookings
Court | Start Time | End Time | Rate Type
1     | 09:30      | 10:30    | SAVER
1     | 11:00      | 12:00    | SAVER
1     | 14:00      | 15:30    | STANDARD
2     | 10:00      | 11:30    | PREMIUM-B
2     | 11:30      | 13:30    | PREMIUM-B
2     | 15:00      | 16:30    | PREMIUM-A
  • 37. • Each row in the table represents a court booking at a tennis club that has one hard court (Court 1) and one grass court (Court 2). • A booking is defined by its Court and the period for which the Court is reserved. • Additionally, each booking has a Rate Type associated with it. There are four distinct rate types: • SAVER, for Court 1 bookings made by members • STANDARD, for Court 1 bookings made by non-members • PREMIUM-A, for Court 2 bookings made by members • PREMIUM-B, for Court 2 bookings made by non-members 4.3.1.5. Algorithm for relational database design For a database, a universal relation schema R = (A1, A2, ..., An) includes all the attributes of the database. The universal relation assumption states that every attribute name is unique. A set of functional dependencies that should hold on the attributes of R is specified by the database designers. Using these functional dependencies, the algorithms decompose the universal relation schema R into a set of relation schemas D = (R1, R2, ..., Rm); D is called the relational database schema (D is a decomposition of R). We must make sure that each attribute in R appears in at least one relation schema Ri in the decomposition, so that no attributes are lost: R = R1 U R2 U ... U Rm. This is called the attribute preservation condition of the decomposition. 4.3.1.6. Decomposition and dependency preservation If each functional dependency X → Y specified in F either appears directly in one of the relation schemas Ri in the decomposition D, or can be inferred from the dependencies that appear in some Ri, the decomposition satisfies the dependency preservation condition. We want to preserve the dependencies because each dependency in F represents a constraint on the database; if a dependency is not preserved, checking it requires joining two or more relations. Suppose that a relation R is given together with a set of functional dependencies F, and that F+ is the closure of F. A decomposition D = {R1, R2, ..., Rm} of R is dependency preserving with respect to F.
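Returning to the court-booking example above, the BCNF test can be sketched with the attribute-closure idea: the FD {Rate Type} → {Court} holds (each rate type is tied to one court), but Rate Type is not a superkey, so the table violates BCNF. A self-contained Python sketch:

```python
# BCNF check for the court-booking schema: an FD X -> Y with X not a superkey
# and Y not a subset of X is a BCNF violation.

def closure(X, fds):
    result = set(X)
    while True:
        added = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                added = True
        if not added:
            return result

attrs = {"court", "start", "end", "rate"}
fds = [({"court", "start"}, {"end", "rate"}),  # {Court, Start Time} is a key
       ({"rate"}, {"court"})]                  # the troublesome dependency

def is_superkey(X):
    return closure(X, fds) == attrs

for lhs, rhs in fds:
    if not rhs <= lhs and not is_superkey(lhs):
        print("BCNF violation:", sorted(lhs), "->", sorted(rhs))
```

Running this flags only {rate} → {court}, matching the discussion: the table is in 3NF (court is a prime attribute) but not in BCNF.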
  • 38. 4.3.1.7. Decomposition and lossless (non-additive) joins Another property a decomposition D should possess is the lossless join, or non-additive join, property, which ensures that no spurious tuples are generated when a natural join operation is applied to the relations in the decomposition. The condition of no spurious tuples should hold on every legal relation state, i.e. every relation state that satisfies the functional dependencies in F. A decomposition D = {R1, R2, ..., Rm} of R has the lossless (non-additive) join property with respect to the set of dependencies F on R if, for every relation state r of R that satisfies F, the natural join of the projections of r onto the Ri equals r, where * denotes the natural join of all the relations in D. The word loss in lossless refers to loss of information, not loss of tuples; if a decomposition does not have the lossless join property, we may get additional spurious tuples after the join. 4.3.1.8. Multi-valued dependencies and fourth normal form In this section we study multi-valued dependencies, which are a consequence of first normal form (1NF), which disallows an attribute of a tuple from having a set of values. For a multi-valued attribute, we repeat every value of one of the attributes with every value of the other attribute to keep the relation state consistent; this constraint is specified by a multi-valued dependency. For example, an employee may work on several projects and have several dependents, but projects and dependents are independent of each other. To keep the relation consistent, we must have a separate tuple to represent every combination of an employee's dependents and the employee's projects. This constraint is specified as a multi-valued dependency.
  • 39. 4.3.1.8.1. Inference rules for functional and multi-valued dependencies We can develop inference rules that include both FDs and MVDs, so that both types of constraints can be considered together. Inference rules IR1 through IR8 form a sound and complete set for inferring FDs and MVDs from a given set of dependencies, where R = {A1, A2, ..., Am} and X, Y, Z, W are subsets of R. 4.3.1.8.2. Fourth normal form A relation schema R is in 4NF with respect to a set of dependencies F (that includes FDs and MVDs) if, for every non-trivial MVD X →→ Y in F+, X is a superkey for R.
  • 40. 4.3.1.9. Lossless join decomposition (figure) 4.3.1.10. Join dependencies and fifth normal form A join dependency (JD), denoted JD(R1, R2, ..., Rn), specified on relation schema R, specifies a constraint on the states r of R. The constraint states that every legal state r of R should have a lossless join decomposition into R1, R2, ..., Rn. A join dependency JD(R1, R2, ..., Rn) specified on relation schema R is a trivial JD if one of the relation schemas Ri in JD(R1, R2, ..., Rn) is equal to R. Such a dependency is called trivial because it has the lossless join property for any relation state r of R and hence does not specify any real constraint on R. 4.3.1.10.1. Fifth normal form (project-join normal form) A relation schema R is in 5NF, or project-join normal form (PJNF), with respect to a set F of functional, multi-valued and join dependencies if, for every non-trivial join dependency JD(R1, R2, ..., Rn) in F+ (i.e. implied by F), every Ri is a superkey of R.
  • 41. Example: (figure) 4.4. Inclusion dependency Inclusion dependencies were defined in order to formalize certain inter-relational constraints. Example: a foreign key constraint cannot be specified as an FD or MVD because it relates attributes across relations; it can, however, be specified as an inclusion dependency. Inclusion dependencies are also used to represent constraints between two relations. An inclusion dependency R.X < S.Y between two sets of attributes, X of relation R and Y of relation S, requires that X and Y have the same number of attributes. Example: if X = {A1, A2, ..., An} and Y = {B1, B2, ..., Bn}, then for 1 <= i <= n, Ai corresponds to Bi. Inference rules for inclusion dependencies: 1. IDIR1 (reflexive rule): R.X < R.X 2. IDIR2 (attribute correspondence):
  • 42. If R.X < S.Y, where X = {A1, A2, ..., An} and Y = {B1, B2, ..., Bn} and Ai corresponds to Bi, then R.Ai < S.Bi for 1 <= i <= n. 3. IDIR3 (transitive rule): if R.X < S.Y and S.Y < T.Z, then R.X < T.Z. All referential integrity constraints can be represented as inclusion dependencies. 5. TRANSACTION MANAGEMENT 5.1. Transaction Concept A transaction is a unit of program execution that accesses and possibly updates various data items. A transaction usually results from the execution of a user program written in a high-level language or data manipulation language (for example SQL, COBOL, C, PASCAL) and is delimited by statements or system calls of the form begin transaction and end transaction. The transaction consists of all the operations between begin and end. To ensure the integrity of the data, we require that the database system maintain the following properties: 1. Atomicity: either all operations of the transaction are reflected properly in the database, or none are. 2. Consistency: execution of a transaction in isolation (i.e. with no other transaction executing concurrently) preserves the consistency of the database. 3. Isolation: even though multiple transactions may execute concurrently, each transaction is unaware of the others; for any two transactions Ti and Tj, it appears to Ti that Tj either finished execution before Ti started or started execution after Ti finished. 4. Durability: after a transaction completes successfully, the changes it has made to the database persist, even if there is a system failure.
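The atomicity property (item 1 above) can be observed with Python's sqlite3 module: if a transfer fails partway, rolling back restores both accounts. The account names and amounts follow the $50-transfer example used in this section:

```python
# Atomicity demo: a failed transfer is rolled back, leaving both balances intact.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
con.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1000), ("B", 2000)])
con.commit()

try:
    con.execute("UPDATE account SET balance = balance - 50 WHERE name = 'A'")
    raise RuntimeError("simulated failure before crediting B")
    # the credit of B would go here, but the crash prevents it
except RuntimeError:
    con.rollback()   # undo the partial transfer: all-or-nothing

print(con.execute("SELECT name, balance FROM account ORDER BY name").fetchall())
# [('A', 1000), ('B', 2000)] -- the debit of A did not survive the failure
```

Without the rollback, A would be left at $950 and B at $2000, which is exactly the inconsistent state described in the example below.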
  • 43. These properties are known as the ACID properties. Access to the database is accomplished by the following two operations: 1. read(X): transfers the data item X from the database to a local buffer belonging to the transaction that executes the read operation. 2. write(X): transfers the data item X from the local buffer back to the database. Example: Ti transfers $50 from account A to account B. Ti: read(A); A := A - 50; write(A); read(B); B := B + 50; write(B) The initial values of A and B are $1000 and $2000. Suppose a system failure occurs after write(A) and before write(B). Then the account information is A = $950, B = $2000, and $50 has been lost. 5.2. Transaction states Compensating transaction: the way to undo the effect of a committed transaction is to execute a compensating transaction. We establish a simple abstract transaction model; a transaction must be in one of the following states: Active: the initial state; the transaction stays in this state while it is executing. Partially committed: after the final statement has been executed. Failed: after the discovery that normal execution can no longer proceed. Aborted: after the transaction has been rolled back and the database has been restored to its prior state. Committed: after successful completion. A transaction enters the failed state after the system determines that the transaction can no longer proceed with its normal execution, for example because of hardware or logical errors. Such a transaction must be rolled back and enters the aborted state; at this point the system has two options: 1. Restart the transaction: appropriate after a hardware or software error.
  • 44. 2. Kill the transaction: appropriate after an internal logical error that can be corrected only by rewriting the application program, or when the failure was due to bad input. 5.3. Implementation of atomicity and durability The recovery management component of a database system implements the support for atomicity and durability. Shadow-database scheme: a transaction that wants to update the database first creates a complete copy of the database. All updates are made to the new copy, leaving the original copy, called the shadow copy, untouched. If at any time the transaction has to be aborted, the new copy is deleted and the old copy of the database is unaffected. If the transaction completes, the operating system is asked to write the new copy to disk; in the UNIX operating system the flush operation is used for this. After the flush has completed, db_pointer points to the current copy of the database. 5.4. Concurrent Execution A database system must control the interaction among concurrent transactions to ensure consistency of the database. In this section we focus on the concept of concurrent execution. Example: consider a set of transactions that access and update bank accounts, and let T1 and T2 be two such transactions. T1: read(A); A := A - 50; write(A); read(B); B := B + 50; write(B) T2: read(A); temp := A * 0.1; A := A - temp; write(A); read(B); B := B + temp; write(B) The initial values of A and B are $1000 and $2000. Case 1:
  • 45. If T1 is followed by T2: A = $855, B = $2145. Case 2: if T2 is followed by T1: A = $850, B = $2150. 5.5. Schedules Execution sequences that show the order in which transactions execute are called schedules. Schedules in which the instructions belonging to each single transaction appear together are called serial schedules. If two transactions are running concurrently, the CPU switches between the two transactions, or is shared among all the transactions. In the concurrent schedule of T1 and T2 shown above, the final values are A = $855 and B = $2145, which is consistent. Some schedules, however, leave the database in an inconsistent state. Consider the example:
  • 46. Here the final values are A = $900 and B = $2150: the accounts have gained $50, and the database is inconsistent. 5.6. Serializability The database system must control the execution of concurrent transactions to ensure that the database remains consistent. We must therefore understand which schedules ensure consistency and which do not. Generally a transaction performs two kinds of operations: I. read operations II. write operations A transaction performs its sequence of operations on the copy of the data item Q residing in the local buffer of the transaction. Here we will discuss two different forms of schedule equivalence: I. conflict serializability II. view serializability 5.6.1. Conflict Serializability Consider a schedule S containing two transactions Ti and Tj with consecutive instructions Ii and Ij respectively (i ≠ j). 1. If Ii and Ij refer to different data items, then we can swap Ii and Ij without affecting the result of any instruction in the schedule.
  • 47. 2. If Ii and Ij refer to the same data item Q, then the order of the two steps may matter. Since we are dealing with only read and write operations, there are four cases: a. Ii = read(Q), Ij = read(Q): the order does not matter, because the same value of Q is read by both Ti and Tj. b. Ii = read(Q), Ij = write(Q): the order matters. c. Ii = write(Q), Ij = read(Q): the order matters. d. Ii = write(Q), Ij = write(Q): since both instructions are write operations, the order does not affect either Ti or Tj directly, but the value obtained by the next read(Q) instruction of S is affected. We say that Ii and Ij conflict if they are operations by different transactions on the same data item, and at least one of them is a write operation. A serial schedule is one in which all the instructions of each transaction execute together.
  • 48. If a schedule S can be transformed into a schedule S' by a series of swaps of non-conflicting instructions, we say that S and S' are conflict equivalent. The concept of conflict equivalence leads to the concept of conflict serializability: a schedule S is conflict serializable if it is conflict equivalent to a serial schedule. 5.6.2. View Serializability View serializability is similar to conflict serializability and is based only on the read and write operations of transactions, but its analysis is harder to implement and computationally more expensive. Consider two schedules S and S' in which the same set of transactions participates. The schedules S and S' are said to be view equivalent if they satisfy the following three conditions: 1. For each data item Q, if transaction Ti reads the initial value of Q in schedule S, then Ti must also read the initial value of Q in schedule S'. 2. For each data item Q, if transaction Ti executes read(Q) in schedule S and the value read was produced by transaction Tj, then in schedule S' Ti must also read the value of Q that was produced by Tj. 3. For each data item Q, the transaction that performs the final write(Q) operation in schedule S must also perform the final write(Q) operation in schedule S'. A schedule is view serializable if it is view equivalent to a serial schedule. 5.7. Recoverability So far we have discussed which schedules ensure the consistency of the database, assuming that no transaction fails. We now address the effect of transaction failures during concurrent execution. If a transaction Ti fails, for whatever reason, we need to undo the effect of Ti to ensure the atomicity property. In a system that allows concurrent execution, any transaction Tj that is dependent on Ti (i.e. Tj has read a data item written by Ti) must also be aborted. That is why we need to place some restrictions on the schedules. 5.7.1. Recoverable schedules
Most database systems require that all schedules be recoverable. A recoverable schedule is one where, for each pair of transactions Ti and Tj such that Tj reads a data item previously written by Ti, the commit operation of Ti appears before the commit operation of Tj.
5.7.2. Cascadeless schedule
Consider an example in which T10 writes a value that is read by T11, and T11 writes a value that is read by T12. Suppose T10 fails; T10 must be rolled back. Since T11 is dependent on T10, and T12 on T11, all the dependent transactions must be rolled back as well. This phenomenon, in which a single transaction failure leads to a series of transaction rollbacks, is called cascading rollback. Cascading rollback is undesirable; schedules in which it cannot occur are called cascadeless schedules. Formally, for every pair of transactions Ti and Tj such that Tj reads a data item written by Ti, the commit of Ti must appear before the read by Tj. It is easy to see that every cascadeless schedule is also recoverable.
5.8. Testing for Serializability
Since every schedule must be serializable, we first need a way to determine whether a given schedule S is serializable. Let S be a schedule. We construct a directed graph, called a precedence graph, from S: G = (V, E), where V is a set of vertices and E is a set of edges.
Vertices: all the transactions participating in the schedule.
Edges: an edge Ti → Tj is added if one of the following conditions holds:
1. Ti executes write(Q) before Tj executes read(Q)
2. Ti executes read(Q) before Tj executes write(Q)
3. Ti executes write(Q) before Tj executes write(Q)
If the precedence graph contains an edge T1 → T2, then in any serial schedule equivalent to S, all the instructions of T1 must execute before the first instruction of T2. If the graph contains no cycle, the schedule is conflict serializable.
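The precedence-graph test can be sketched in a few lines: build the edge set from conflicting operation pairs, then search the graph for a cycle. A minimal sketch, with illustrative schedule encoding (not from the text):

```python
# A schedule is a list of (transaction, action, data_item) triples in
# execution order.
def precedence_graph(schedule):
    edges = set()
    for i, (ti, ai, qi) in enumerate(schedule):
        for tj, aj, qj in schedule[i + 1:]:
            # Edge Ti -> Tj when an earlier op conflicts with a later one.
            if ti != tj and qi == qj and ("write" in (ai, aj)):
                edges.add((ti, tj))
    return edges

def has_cycle(edges):
    # Depth-first search for a back edge.
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
    visiting, done = set(), set()
    def dfs(u):
        visiting.add(u)
        for v in graph.get(u, []):
            if v in visiting or (v not in done and dfs(v)):
                return True
        visiting.discard(u)
        done.add(u)
        return False
    return any(dfs(u) for u in graph if u not in done)

# Serial: all of T1 precedes T2 — no cycle.
s1 = [("T1","read","A"), ("T1","write","A"), ("T2","read","A"), ("T2","write","A")]
# Interleaved: each transaction writes after the other has read — cycle.
s2 = [("T1","read","A"), ("T2","read","A"), ("T1","write","A"), ("T2","write","A")]
```

Here `has_cycle(precedence_graph(s1))` is False (conflict serializable), while for s2 it is True.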
5.9. Precedence graph
A schedule may fail the precedence-graph test for conflict serializability and yet still be view serializable. This happens when a transaction performs a write whose value is never read and is later overwritten — a useless write — so that an edge such as T4 → T3 arises only from such writes. To test view serializability, we develop a scheme for deciding whether an edge needs to be inserted in the precedence graph.
Consider a schedule S in which Tj reads a value written by Ti, giving the edge Ti → Tj. If S is view serializable, then in any schedule S' equivalent to S, a transaction Tk that executes write(Q) must appear either before Ti (Tk → Ti) or after Tj (Tj → Tk); it cannot appear between Ti and Tj.
To test view serializability, we extend the precedence graph to include labeled edges; this type of graph is termed a labeled precedence graph.
Rules for inserting labeled edges in the precedence graph: consider a schedule S containing transactions T1, T2, …, Tn. Let Tb and Tf be two dummy transactions: Tb issues write(Q) for each data item Q accessed in S, and Tf issues read(Q) for each data item Q accessed in S. We construct a new schedule S' from S by inserting Tb at the beginning of S and Tf at the end of S. We then construct the labeled precedence graph for S' as follows:
1. Add an edge Ti → Tj if Tj reads the value of a data item Q written by Ti.
2. Remove all edges incident on useless transactions. A transaction Ti is useless if there exists no path in the precedence graph from Ti to Tf.
3. For each data item Q such that Tj reads a value of Q written by Ti, and Tk executes write(Q) with Tk ≠ Tb, do the following:
a. If Ti = Tb and Tj ≠ Tf, insert the edge Tj → Tk.
b. If Ti ≠ Tb and Tj = Tf, insert the edge Tk → Ti.
c. If Ti ≠ Tb and Tj ≠ Tf, insert both edges Tk → Ti and Tj → Tk in the labeled precedence graph, labeled with a unique number p.
6. CONCURRENCY CONTROL
When several transactions execute concurrently in the database, the isolation property may no longer be preserved. It is necessary for the system to control the interaction among concurrent transactions; these controls are termed concurrency-control schemes.
6.1. Lock based protocols
One way to ensure serializability is to require that access to data items be done in a mutually exclusive manner; that is, while one transaction is accessing a data item, no other transaction can modify it. A common way to implement this requirement is to allow a transaction to access a data item only if it currently holds a lock on that item.
6.1.1. Locks
There are various modes in which a data item may be locked.
Shared mode: if a transaction Ti holds a shared-mode lock (denoted S) on data item Q, then Ti can read Q but cannot write Q.
Exclusive mode: if Ti holds an exclusive-mode lock (denoted X) on Q, then Ti can both read and write Q.
Example:
[Example schedule omitted from the slides.] The example schedule displays the sum (A + B) as $250.
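The shared/exclusive rules can be summarized by a lock-compatibility matrix: S is compatible with S, while X conflicts with every other lock. A minimal sketch, with illustrative names not taken from the text:

```python
# Compatibility of a requested mode against one already-held mode.
COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

# A requested mode is granted only if it is compatible with every
# mode already held on the data item by other transactions.
def can_grant(requested_mode, held_modes):
    return all(COMPATIBLE[(held, requested_mode)] for held in held_modes)
```

With this table, two readers coexist (`can_grant("S", ["S"])` is True), but a writer must wait for any existing holder, and any request must wait behind an exclusive holder.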
This situation is called deadlock. When a deadlock occurs, the system must roll back one of the two transactions; the data items locked by that transaction are then unlocked and become available to other transactions.
6.1.2. Granting of locks
When a transaction requests a lock on a data item in a particular mode, and no other transaction holds a lock on the same data item in a conflicting mode, the lock can be granted. Suppose transaction T2 holds a shared lock on Q and T1 requests an exclusive lock; T1 has to wait for T2 to release its lock:
T2 → lock-S(Q) (holds)
T1 → lock-X(Q) (waits)
T3 → lock-S(Q) (requests, granted)
T4 → lock-S(Q) (requests, granted)
T1 is still waiting. This situation, in which a particular transaction waits indefinitely for a lock on a data item, is called starvation.
6.1.3. Avoiding starvation of transactions by granting locks
When a transaction Ti requests a lock on data item Q in a particular mode M, the lock is granted provided that:
1. There is no other transaction holding a lock on Q in a mode that conflicts with M.
2. There is no other transaction waiting for a lock on Q that made its lock request before Ti.
6.2. Two phase locking protocol
One protocol that ensures serializability is the two-phase locking protocol. This protocol requires that each transaction issue lock and unlock requests in two phases:
1. Growing phase: a transaction may obtain locks but may not release any lock.
2. Shrinking phase: a transaction may release locks but may not obtain any new lock.
The point at which a transaction has obtained its final lock is called the lock point of the transaction. Cascading rollbacks can be avoided by a modification of two-phase locking called the strict two-phase locking protocol. This protocol requires that all
exclusive locks taken by a transaction be held until that transaction commits. This requirement ensures that any data item written by an uncommitted transaction is locked in exclusive mode until the transaction commits, preventing any other transaction from reading it. Another variant is the rigorous two-phase locking protocol, which requires all locks (shared and exclusive) to be held until the transaction commits. It can easily be verified that, under rigorous two-phase locking, transactions can be serialized in the order in which they commit.
6.3. Graph based protocol
If we wish to develop a protocol that is not two-phase, we need additional information on how each transaction will access the database. In this model we have prior knowledge of the order in which database items will be accessed. To exploit such prior knowledge, we impose a partial order on the set D = {d1, d2, …, dn} of all data items: if di → dj, then any transaction accessing both di and dj must access di before accessing dj. This ordering can be depicted as a directed acyclic graph, called a database graph. Here we restrict attention to graphs that are rooted trees and employ only exclusive locks. In the tree protocol, the only lock instruction allowed is lock-X. Each transaction Ti can lock a data item at most once and must follow these rules:
a. The first lock by Ti may be on any data item.
b. Subsequently, a data item Q can be locked by Ti only if the parent of Q is currently locked by Ti.
c. Data items may be unlocked at any time.
d. A data item that has been locked and unlocked by Ti cannot subsequently be relocked by Ti.
Advantages:
1. Unlocking may occur earlier, which leads to shorter waiting times and increased concurrency.
2. The protocol is deadlock-free, so no rollbacks are required.
Disadvantages:
1. Locking overhead is increased.
2. There is additional waiting time.
3. There is a potential decrease in concurrency.
6.4. Time-stamp based protocol
In this type of protocol, the ordering between every pair of conflicting transactions is determined at execution time using timestamps.
Time-stamp:
With each transaction Ti in the system, we associate a unique fixed timestamp, denoted TS(Ti). This timestamp is assigned by the database system before transaction Ti starts execution. If a transaction Ti has been assigned timestamp TS(Ti), and a new transaction Tj enters the system afterwards, then TS(Ti) < TS(Tj). There are two simple methods for implementing this scheme:
1. Use the value of the system clock as the timestamp; a transaction's timestamp equals the clock value when the transaction enters the system.
2. Use a logical counter that is incremented after each new timestamp is assigned; a transaction's timestamp equals the value of the counter.
6.5. Validation based protocol
In some cases, where the majority of transactions are read-only, the rate of conflicts among transactions may be low. However, we do not know in advance which transactions will be involved in a conflict, so we need a scheme for monitoring the system. We assume that each transaction Ti executes in the following phases:
1. Read phase: the execution of transaction Ti takes place; the values of the various data items are read and stored in variables local to Ti. All write operations are performed on temporary local variables, without updating the actual database.
2. Validation phase: transaction Ti performs a validation test to determine whether it can copy the temporary local variables that hold the results of its write operations to the database without causing a violation of serializability.
3. Write phase: if transaction Ti succeeds in validation (step 2), the actual updates are applied to the database; otherwise, Ti is rolled back.
To perform the validation test, we associate three timestamps with each transaction:
a. Start(Ti): the time when Ti started its execution.
b. Validation(Ti): the time when Ti finished its read phase and started its validation phase.
c. Finish(Ti): the time when Ti finished its write phase.
6.6. Recovery system
6.6.1. Failure Classification
There are various types of failure that may occur in a system, each of which must be dealt with in a different manner.
Simple failure: one that does not cause loss of information in the system.
Difficult failure: one that causes loss of information in the system.
Here we consider only the following types of failure:
6.6.1.1. Transaction failure
There are two types of error that may cause a transaction to fail.
Logical error: the transaction can no longer proceed with its normal execution, due to an internal condition such as bad input, data not found, overflow, or a resource limit being exceeded.
System error: the system has entered an undesirable state (for example, deadlock), as a result of which a transaction cannot continue with its normal execution. Such a transaction can be re-executed later.
System crash: a bug in the database software or an operating-system failure causes the loss of the contents of volatile storage.
Disk failure: a disk block loses its contents, for example because of a head crash. To recover from this type of failure, backup copies on tape are used.
6.6.2. Log based recovery
The most widely used structure for recording database modifications is the log. The log is a sequence of log records and maintains a record of all update activity in the database. Log records have the following fields:
Transaction identifier: the unique identifier of the transaction that performed the write operation.
Data item identifier: the unique identifier of the data item written — typically the location of the data item on disk.
Old value: the value of the data item prior to the write operation.
New value: the value the data item will have after the write operation.
Log records exist to record significant events during transaction processing:
<Ti, start>: transaction Ti has started.
<Ti, Xj, V1, V2>: transaction Ti has performed a write on data item Xj; Xj had value V1 before the write and will have value V2 after it.
<Ti, commit>: transaction Ti has committed.
<Ti, abort>: transaction Ti has aborted.
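These log records are enough to drive recovery: updates of committed transactions are redone with the new value, and updates of uncommitted transactions are undone with the old value. A simplified sketch under those assumptions (the record layout mirrors the tuples above; the database is just a dict, and each transaction writes each item at most once):

```python
# Replay a log against an in-memory "database" (a dict).
def recover(log, db):
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in log:
        if rec[0] == "write":
            _, t, item, old, new = rec
            # Redo committed writes; undo uncommitted ones.
            db[item] = new if t in committed else old
    return db

log = [("start", "T0"),
       ("write", "T0", "A", 1000, 950),
       ("write", "T0", "B", 2000, 2050),
       ("commit", "T0"),
       ("start", "T1"),
       ("write", "T1", "C", 700, 600)]   # T1 never commits

recovered = recover(log, {})
```

Here T0 committed, so A and B keep their new values (950 and 2050), while T1 did not, so C is restored to its old value 700. A real recovery manager processes undo and redo in careful passes; this sketch only illustrates the role of the old-value and new-value fields.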
6.7. Deferred Database Modification
In this scheme, when a transaction partially commits, the information on the log associated with the transaction is used in executing the deferred writes. If the system crashes before the transaction completes its execution, or if the transaction aborts, the information on the log is simply ignored. Consider the transactions:
T0: read(A); A := A − 50; write(A); read(B); B := B + 50; write(B)
T1: read(C); C := C − 100; write(C)
6.8. Immediate Database Modification
The immediate-update technique allows database modifications to be output to the database while the transaction is still in the active state. Database modifications written by active transactions are called uncommitted modifications. In the event of a crash or a transaction failure, the system must use the old-value field of the log records to restore the modified data items. For T0 and T1 above, the log would contain:
<T0, start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
<T0, commit>
<T1, start>
<T1, C, 700, 600>
<T1, commit>
7. CENTRALIZED AND DISTRIBUTED DATABASE
In the traditional enterprise computing model, an Information Systems department maintains control of a centralized corporate database system. Mainframe computers, usually located at corporate headquarters, provide the required performance levels. Remote sites access the corporate database through wide-area networks (WANs) using applications provided by the Information Systems department. Changes in the corporate environment toward decentralized operations have prompted organizations to move toward distributed database systems that complement the new decentralized organization.
Today's global enterprise may have many local-area networks (LANs) joined with a WAN, as well as additional data servers and applications on the LANs. Client applications at the sites need to access data locally through the LAN or remotely through the WAN. For example, a client in Tokyo might locally access a table stored on the Tokyo data server or remotely access a table stored on the New York data server.
Both centralized and distributed database systems must deal with the problems associated with remote access:
• Network response slows when WAN traffic is heavy. For example, a mission-critical transaction-processing application may be adversely affected when a decision-support application requests a large number of rows.
• A centralized data server can become a bottleneck as a large user community contends for data server access.
• Data is unavailable when a failure occurs on the network.
7.1. Distributed Database System
A distributed database system is a collection of data that belongs logically to the same system but is physically spread over the sites of a computer network.
7.2. Some advantages of the DDBMS are as follows:
1. Distributed nature of some database applications: some database applications are naturally distributed over different sites.
2. Increased reliability and availability: these are two of the most commonly cited advantages. Reliability is broadly defined as the probability that a system is up at a particular moment. Availability is the probability that the system is continuously available during a time interval.
3. Allowing data sharing while maintaining some measure of local control: it is possible to control the data and software locally at each site, while certain data can still be accessed by users at other remote sites through the DDBMS software. This allows the controlled sharing of data throughout the distributed system.
4. Improved performance: when a large database is distributed over multiple sites, a smaller database exists at each site. As a result, local queries and transactions accessing data at a single site perform better because of the smaller local database. If all transactions were submitted to a single centralized database, performance would decrease.
7.3. Some additional properties:
1. The ability to access remote sites and transmit queries and data among the various sites via a communication network.
2. The ability to decide which copy of a replicated data item to access.
3. The ability to maintain the consistency of the copies of a replicated data item.
4. The ability to recover from individual site crashes and from new types of failure, such as the failure of a communication link.
7.4. Physical hardware level
The following main factors distinguish a DDBMS from a centralized system:
1. There are multiple computers, called sites or nodes.
2. These sites must be connected by some type of communication network to transmit data and commands among the sites. The sites may be within the same building or a group of adjacent buildings, connected via a local-area network, or they may be geographically distributed over large distances and connected via a long-haul network. Local-area networks typically use cables.
Long-haul networks, in contrast, use telephone lines or satellite links; it is also possible to use a combination of the two types of network. Networks may have different topologies that define the communication paths among sites.
7.5. Client Server Architecture
The client-server architecture was developed to deal with a computing environment in which a large number of personal computers, workstations, file servers, peripherals, and other equipment are connected together via a network. The idea is to define specialized servers with specific functionalities. The interaction between client and server during the processing of an SQL query might proceed as follows:
1. The client parses a user query and decomposes it into a number of independent site queries. Each site query is sent to the appropriate server site.
2. Each server processes its local query and sends the resulting relation to the client site.
3. The client combines the results of the subqueries to produce the result of the originally submitted query.
In this approach the SQL server is called a database processor (DP) or back-end machine, whereas the client is called an application processor (AP) or front-end machine. In a DDBMS, the software modules are divided into three levels:
1. The server software is responsible for local data management at a site.
2. The client software is responsible for most of the distribution functions. It accesses data-distribution information from the DDBMS catalog and processes all requests that require access to more than one site.
3. The communication software provides the communication primitives used by the client to transmit commands and data among the various sites as needed.
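The three query-processing steps above can be sketched directly: the client sends a site query to each server, each "server" evaluates its part locally, and the client merges the partial results. Everything here — the site tables, row layout, and function names — is an illustrative stand-in, not a real DDBMS API:

```python
# Each "server" holds a local fragment of an EMPLOYEE relation as a
# list of (name, salary) rows.
sites = {
    "tokyo":    [("Ken", 52000), ("Aiko", 48000)],
    "new_york": [("Ann", 61000), ("Bob", 45000)],
}

def site_query(rows, min_salary):
    # Step 2: local processing done by each server.
    return [r for r in rows if r[1] >= min_salary]

def client_query(min_salary):
    # Step 1: decompose into one site query per server.
    # Step 3: combine the partial results at the client.
    result = []
    for rows in sites.values():
        result.extend(site_query(rows, min_salary))
    return sorted(result)
```

For example, `client_query(50000)` gathers the qualifying rows from both Tokyo and New York and returns the combined relation.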
7.6. Data fragmentation
If relation r is fragmented, r is divided into a number of fragments r1, r2, …, rn. These fragments contain sufficient information to allow reconstruction of the original relation r. The reconstruction can take place through the application of either the union operation or a special type of join operation on the various fragments. There are three different schemes for fragmenting a relation:
I. Horizontal fragmentation
II. Vertical fragmentation
III. Mixed fragmentation
7.6.1. Horizontal fragmentation
In horizontal fragmentation, the tuples of r are distributed among one or more fragments. A relation r is partitioned into a number of subsets r1, r2, …, rn. Each tuple of r must belong to at least one of the fragments, so that the original relation can be reconstructed. Each fragment can be defined by a selection operation, and for reconstruction we use the union operation:
r = r1 ∪ r2 ∪ … ∪ rn
7.6.2. Vertical fragmentation
In vertical fragmentation, the columns of r are split among one or more fragments. Vertical fragmentation of r(R) involves subsets of attributes R1, R2, …, Rn of the schema R such that
R = R1 ∪ R2 ∪ … ∪ Rn
Each fragment of r is defined by a projection operation, and for reconstruction we use the natural-join operation (each Ri must therefore include a common key):
r = r1 ⋈ r2 ⋈ … ⋈ rn
7.6.3. Mixed fragmentation
Mixed fragmentation combines the horizontal and vertical schemes. A relation r is divided into a number of fragments; each fragment is obtained by applying either the horizontal or the vertical fragmentation scheme to r, or to a fragment of r that was obtained previously.
7.7. Data Replication
If relation r is replicated, a copy of r is stored at two or more sites. In full replication, a copy is stored at every site in the system.
Availability: if one site fails, the relation may be found at another site, so the system can continue processing.
Increased parallelism: where the majority of accesses to relation r result only in reading the relation, several sites can process queries involving r in parallel, and there is a greater chance that the needed data is found at the site where a transaction is executing.
Increased overhead on update: the system must ensure that all replicas of relation r are consistent; otherwise erroneous computations may result. Whenever r is updated, the update must be propagated to all sites containing replicas.
7.8. Deadlock handling
A system is in a deadlock state if there exists a set of transactions such that every transaction in the set is waiting for another transaction in the set. Suppose there is a set of waiting transactions {T0, T1, …, Tn} in which T0 is waiting for a data item held by T1, T1 for one held by T2, and so on, with Tn waiting for a data item held by T0. No transaction can make progress in this situation. There are two principal methods for dealing with the deadlock problem:
a. Deadlock prevention: a protocol ensures that the system will never enter a deadlock state.
b. Deadlock detection and recovery: we allow the system to enter a deadlock state and then try to recover.
7.8.1. Deadlock prevention
There are two approaches to deadlock prevention.
Approach 1: ensure that no cyclic waits can occur, for example by requiring all locks to be acquired together.
Approach 2: an approach closer to deadlock recovery — roll back transactions instead of letting them wait whenever a wait could lead to deadlock.
7.8.1.1. The first approach
Each transaction locks all the data items it needs before it begins execution.
Disadvantages:
i. It is often hard to predict, before the transaction begins, what data items need to be locked.
ii. Data-item utilization will be very low, since many data items may be locked but unused for a long time.
7.8.1.2. The second approach
The second approach to preventing deadlock is to use preemption and transaction rollback. In preemption, when a transaction T2 requests a lock held by T1, the lock granted to T1 may be preempted by rolling back T1 and granting the lock to T2. To control preemption, we assign a unique timestamp to each transaction. The system uses these timestamps only to decide whether a transaction should wait or roll back. Two different deadlock-prevention schemes have been proposed:
1. Wait–die: a non-preemptive technique. When Ti requests a data item held by Tj, Ti is allowed to wait only if TS(Ti) < TS(Tj), that is, if Ti is older than Tj; otherwise Ti is rolled back (dies).
2. Wound–wait: a preemptive technique and a counterpart to the wait–die scheme. When Ti requests a data item held by Tj, Ti is allowed to wait only if TS(Ti) > TS(Tj), that is, if Ti is younger than Tj; otherwise Tj is rolled back (wounded by Ti).
7.8.1.3. Time-out based scheme
Another simple technique is based on lock timeouts. In this approach, a transaction that has requested a lock waits for at most a specified amount of time. If the lock has not been granted within that time, the transaction is said to time out; it rolls itself back and restarts.
Disadvantages:
i. It is hard to choose the waiting time: too long a wait delays the transactions involved in a deadlock, while too short a wait causes rollback even when there is no deadlock.
ii. Unnecessary rollbacks lead to wasted resources.
iii. Starvation is also a possibility with this scheme.
7.8.2. Deadlock detection and recovery
If a system does not employ a protocol that ensures deadlock freedom, then a detection and recovery scheme must be used. In such a scheme, the system determines whether a deadlock has occurred; if one has, the system must attempt to recover from the deadlock.
To do this, the system must:
i. Maintain information about the current allocation of data items to transactions, as well as any outstanding data-item requests.
ii. Provide an algorithm that uses this information to determine whether the system has entered a deadlock state.
iii. Recover from the deadlock when the detection algorithm determines that one exists.
7.8.2.1. Deadlock detection
To describe deadlocks we use a directed graph called a wait-for graph, G = (V, E), where V is the set of vertices (all the transactions in the system) and E is the set of edges.
7.8.2.1.1. Directed graph
An edge Ti → Tj means that Ti is waiting for transaction Tj to release a data item that Ti needs. A deadlock exists in the system if and only if the wait-for graph contains a cycle; each transaction in the cycle is said to be deadlocked. To detect deadlocks, the system maintains the wait-for graph and periodically searches for a cycle in it.
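The cycle search over a wait-for graph can be sketched compactly if we assume, for simplicity, that each transaction is blocked on at most one other transaction (so the graph is a map from waiter to holder; names are illustrative):

```python
# waits_for[Ti] = the transaction Ti is currently blocked on.
# Returns the list of deadlocked transactions, or None if no cycle exists.
def find_deadlock(waits_for):
    for start in waits_for:
        seen = []
        t = start
        while t in waits_for:
            if t in seen:
                return seen[seen.index(t):]   # the cycle of deadlocked transactions
            seen.append(t)
            t = waits_for[t]
    return None

# T0 waits for T1, T1 for T2, T2 for T0: a three-way deadlock.
cycle = find_deadlock({"T0": "T1", "T1": "T2", "T2": "T0"})
```

Here `cycle` names exactly the transactions that are deadlocked; a chain with no cycle (e.g. T0 waiting for T1, T1 running) yields None. A real lock manager allows a transaction to wait on several holders, which makes the graph general and calls for a full depth-first search.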
7.8.2.2. Recovery from the deadlock
When the system determines that a deadlock exists, it must recover from it. The most common solution is to roll back one or more transactions to break the deadlock. The following actions need to be taken:
1. Select a victim: decide which transaction to roll back, considering factors such as:
a. how long the transaction has been running and how close it is to completing its task;
b. how many data items the transaction has used;
c. how many more data items the transaction needs in order to complete;
d. how many transactions will be involved in the rollback.
2. Rollback: once we have decided that a particular transaction must be rolled back, we must determine how far it should be rolled back. Partial rollback requires the system to maintain information about the state of all running transactions.
3. Starvation: if the same transaction is repeatedly picked as the victim, it may never complete its designated task. This situation is called starvation, and victim selection must ensure that it does not occur.
8. SQL (STRUCTURED QUERY LANGUAGE)
SQL (Structured Query Language) is a database sublanguage for querying and modifying relational databases. It was developed by IBM Research in the mid-70s and standardized by ANSI in 1986. The relational model defines two root languages for accessing a relational database: relational algebra and relational calculus. Relational algebra is a low-level, operator-oriented language; creating a query in relational algebra involves combining relational operators using algebraic notation. Relational calculus is a high-level, declarative language; creating a query in relational calculus involves describing what results are desired. SQL is a version of relational calculus. The basic structure in SQL is the statement; semicolons separate multiple SQL statements.
8.1. DDL Statements
DDL stands for data definition language. DDL statements are SQL statements that define or alter a data structure such as a table; they are used to define the database structure or schema.
Some examples: • CREATE - to create objects in the database
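A quick way to experiment with DDL statements is SQLite's in-memory database, available from Python's standard library. The table and column names below are made up for illustration:

```python
import sqlite3

# CREATE defines a new table; ALTER changes its structure.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("ALTER TABLE employee ADD COLUMN salary REAL")

# The resulting schema is queryable like ordinary data.
cols = [row[1] for row in con.execute("PRAGMA table_info(employee)")]

# DROP removes the object again.
con.execute("DROP TABLE employee")
```

After the ALTER statement, `cols` lists all three columns (`id`, `name`, `salary`); after DROP, the table no longer exists in the schema.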