What Is a Database?
• A database is a collection of interrelated data, organized so that the
data can be retrieved, inserted, and deleted efficiently.
• The data is organized in the form of tables, schemas, views, reports, etc.
• For example: a university database (admin, staff, students, faculty, etc.)
What Is a DBMS?
• A database is a collection of data elements (facts) stored in a
computer in a systematic way, such that a computer program
can consult it to answer questions.
• The answers to those questions become information that
can be used to make decisions that could not be made with
the data elements alone.
• The computer program used to manage and query a
database is known as a database management system
(DBMS).
• In short, a database management system is software that is used to
manage the database.
• For example: MySQL, Oracle, etc.
Using a DBMS, users can perform the following tasks:
1. Data Definition: In the context of SQL, a data definition or data description language (DDL) is a syntax for
creating database objects such as tables, indices, and users.
• DDL statements are similar to a computer programming language for defining data structures, especially
database schemas.
2. Data Update: A data manipulation language (DML) is a language used for adding (inserting), deleting, and
modifying (updating) data in a database.
• A DML is often a sublanguage of a broader database language such as SQL, with the DML comprising some of the
operators in the language.
3. Data Retrieval: The DBMS is used to retrieve data from the database, which applications can then use for various
purposes.
4. User Administration: The DBMS is used for registering and monitoring users, maintaining data integrity, enforcing
data security, dealing with concurrency control, monitoring performance, and recovering information corrupted by
unexpected failures.
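The first three tasks can be sketched with Python's built-in sqlite3 module. This is only an illustration: the student table, its columns, and the sample rows are assumptions made up for the example, and SQLite has no user accounts, so task 4 is not shown.

```python
import sqlite3

# In-memory SQLite database; table, column, and row values are illustrative only.
conn = sqlite3.connect(":memory:")

# 1. Data definition (DDL): create a table and an index.
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL, dept TEXT)"
)
conn.execute("CREATE INDEX idx_student_dept ON student (dept)")

# 2. Data update (DML): insert, modify, and delete rows.
conn.executemany(
    "INSERT INTO student (id, name, dept) VALUES (?, ?, ?)",
    [(1, "Asha", "CS"), (2, "Ravi", "EE"), (3, "Meena", "CS")],
)
conn.execute("UPDATE student SET dept = 'ME' WHERE id = 2")
conn.execute("DELETE FROM student WHERE id = 3")
conn.commit()

# 3. Data retrieval: query the rows that remain.
rows = conn.execute("SELECT id, name, dept FROM student ORDER BY id").fetchall()
print(rows)
```

After the update and delete, only two students remain, and the second one has moved to the new department.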
What Is a File? Drawbacks of using file systems to store data
• A file is a collection of related data stored in secondary memory, such as a hard disk drive (HDD).
• Data redundancy and inconsistency
  • Multiple file formats, duplication of information in different files
• Difficulty in accessing data
  • Need to write a new program to carry out each new task
• Data isolation
  • Multiple files and formats
• Integrity problems
  • Integrity constraints (e.g., account balance > 0) become “buried” in program code
  rather than being stated explicitly
  • Hard to add new constraints or change existing ones
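By contrast, a DBMS lets an integrity constraint be stated explicitly in the schema. A minimal sketch using Python's sqlite3 module follows; the account table and the helper try_set_balance are hypothetical names introduced only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The constraint "balance > 0" is declared once, in the schema,
# instead of being repeated in every program that touches the data.
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance > 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100)")


def try_set_balance(conn, account_id, new_balance):
    """Return True if the update is accepted, False if the constraint rejects it."""
    try:
        conn.execute(
            "UPDATE account SET balance = ? WHERE id = ?", (new_balance, account_id)
        )
        return True
    except sqlite3.IntegrityError:
        return False


ok_valid = try_set_balance(conn, 1, 50)     # satisfies the constraint
ok_invalid = try_set_balance(conn, 1, -10)  # violates the constraint
balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(ok_valid, ok_invalid, balance)
```

The rejected update leaves the stored balance at the last valid value.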
Drawbacks of using file systems to store data (Cont.)
• Atomicity of updates
  • Failures may leave the database in an inconsistent state with partial
  updates carried out
  • Example: Transfer of funds from one account to another should
  either complete or not happen at all
• Concurrent access by multiple users
  • Concurrent access is needed for performance
  • Uncontrolled concurrent accesses can lead to inconsistencies
  • Example: Two people reading a balance (say 100) and updating it by withdrawing
  money (say 50 each) at the same time
• Security problems
  • Hard to provide user access to some, but not all, data
Database systems offer solutions to all the above problems.
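The funds-transfer example can be sketched as a transaction, again with Python's sqlite3 module. The account table and the transfer and balances helpers are assumptions made for the illustration; the point is only that a failure in the middle of the transfer undoes the part already carried out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance > 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 20)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


first = transfer(conn, 1, 2, 50)   # succeeds
after_first = balances(conn)
second = transfer(conn, 1, 2, 80)  # debit would break the constraint
after_second = balances(conn)
print(first, after_first, second, after_second)
```

In the second call the credit to the destination account is executed first and then rolled back when the debit fails, so no partial update survives.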
Why Use a DBMS?
• Data independence and efficient access.
• Reduced application development time.
• Data integrity and security.
• Uniform data administration.
• Concurrent access, recovery from
crashes.
Data Models
• A data model is a collection of high-level concepts for describing data.
• A schema is a description of a particular collection of data, using a given
data model.
• The relational model of data is the most widely used model today.
• Main concept: the relation, basically a table with rows and columns, that is, a set of
records.
• Every relation has a schema, which describes the columns, or fields.
• A database model defines the logical design and structure of a database, and
how data will be stored, accessed, and updated in a database
management system.
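As a rough illustration of "a relation is a set of records with a schema", the idea can be mimicked in plain Python. The Student type and its fields are hypothetical, and a real DBMS stores relations very differently; this only shows the concept.

```python
from typing import NamedTuple


class Student(NamedTuple):
    """Schema of the relation: the column names and their types."""
    id: int
    name: str
    dept: str


# The relation itself: a set of records that all follow the schema.
# Because it is a set, the repeated record is stored only once.
relation = {
    Student(1, "Asha", "CS"),
    Student(2, "Ravi", "EE"),
    Student(1, "Asha", "CS"),
}

schema = Student._fields
print(schema)
print(len(relation))
```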
Network Model
• This is an extension of the hierarchical model.
• Data is organized like a graph.
• A child node can have more than one parent node.
Entity-relationship Model
• In this database model, each object of interest is modeled as an
entity and its characteristics as attributes; relationships are
then defined between entities.
Difference between Hierarchical, Network and Relational Data Models
Hierarchical Data Model | Network Data Model | Relational Data Model
Stores data in a hierarchy. It is the oldest model and is rarely used today. | Relates records to one another through links or pointers. | Organizes records in tables; relationships between tables are set using common fields.
Organizes records in a tree structure. | Organizes records in the form of directed graphs. | Organizes records in the form of tables.
Implements 1:1 and 1:n relationships. | In addition to 1:1 and 1:n, it also implements many-to-many relationships. | In addition to 1:1 and 1:n, it also implements many-to-many relationships.
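In the relational model, a many-to-many relationship is usually expressed with a third table whose common fields refer to the two related tables. The sketch below uses Python's sqlite3 module; the student, course, and enrollment tables are assumptions chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (id INTEGER PRIMARY KEY, title TEXT);

    -- Linking table: each row pairs one student with one course.
    CREATE TABLE enrollment (
        student_id INTEGER REFERENCES student(id),
        course_id  INTEGER REFERENCES course(id),
        PRIMARY KEY (student_id, course_id)
    );

    INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO course  VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
    """
)

# The tables are related through their common fields (the id columns).
pairs = conn.execute(
    """
    SELECT s.name, c.title
    FROM enrollment e
    JOIN student s ON s.id = e.student_id
    JOIN course  c ON c.id = e.course_id
    ORDER BY s.name, c.title
    """
).fetchall()
print(pairs)
```

One student takes several courses and one course has several students, which is the many-to-many case from the table above.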
The three-schema architecture is as follows:
• The internal level has an internal schema, which
describes the physical storage structure of the
database and its storage space allocation.
For example: B-trees, hashing, etc.
• The conceptual level is also known as the logical level.
• The conceptual schema describes the structure
of the whole database.
• At the external level, a database contains several
schemas, sometimes called subschemas. Each
subschema describes a different view of the
database.
• An external schema is also known as a view schema.
View of Data
• A database system is a collection of interrelated data and a set of programs that allow
users to access and modify these data. A major purpose of a database system is to
provide users with an abstract view of the data. That is, the system hides certain details
of how the data are stored and maintained.
• Data Abstraction: the system hides complexity through several levels of abstraction, to simplify users'
interactions with the system:
• Physical level. The lowest level of abstraction describes how the data are actually stored. The physical
level describes complex low-level data structures in detail.
• Logical level. The next-higher level of abstraction describes what data are stored in the database, and
what relationships exist among those data. The logical level thus describes the entire database in terms of a
small number of relatively simple structures. Users of the logical level do not need to be aware of the complex
physical-level structures that implement them; this is referred to as physical data independence. Database
administrators, who must decide what information to keep in the database, use the logical level of
abstraction.
• View level. The highest level of abstraction describes only part of the entire database. Even though the
logical level uses simpler structures, complexity remains because of the variety of information stored in a
large database.
The view level of abstraction exists to simplify users' interaction with the system. The system may provide
many views for the same database.
The three levels of data abstraction:
• Physical level: the complex data structure details are found at this level.
• Logical level: what data is stored in the database.
• View level: user interaction with the database system, e.g. hiding employee salary data.
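The salary example can be sketched with a view in Python's sqlite3 module. The employee table, the view name employee_public, and the rows are assumptions for the illustration. SQLite itself does not restrict access to the base table; in a multi-user DBMS, permissions would typically be granted on the view only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER
    );
    INSERT INTO employee VALUES (1, 'Asha', 'Sales', 50000), (2, 'Ravi', 'HR', 45000);

    -- View level: expose only part of the database, leaving out the salary column.
    CREATE VIEW employee_public AS
        SELECT id, name, dept FROM employee;
    """
)

cur = conn.execute("SELECT * FROM employee_public ORDER BY id")
view_columns = [d[0] for d in cur.description]
view_rows = cur.fetchall()
print(view_columns)
print(view_rows)
```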
Data Independence
• Data independence can be explained using the three-schema architecture.
• Data independence refers to the characteristic of being able to modify the schema at one level of the database
system without altering the schema at the next higher level.
There are two types of data independence:
1. Logical Data Independence
• Logical data independence refers to the characteristic of being able to change the conceptual schema
without having to change the external schema.
• Logical data independence is used to separate the external level from the conceptual view.
• If we make any changes in the conceptual view of the data, the user view of the data is not
affected.
• Logical data independence occurs at the user interface level.
2. Physical Data Independence
• Physical data independence can be defined as the capacity to change the internal schema without
having to change the conceptual schema.
• If we make any changes in the storage size of the database system server, the conceptual structure
of the database is not affected.
• Physical data independence is used to separate the conceptual level from the internal level.
• Physical data independence occurs at the logical interface level.
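Both kinds of independence can be illustrated on a small scale with Python's sqlite3 module. The table, view, index, and the added phone column are all assumptions for the sketch: adding an index stands in for an internal-schema change, and adding a column stands in for a conceptual-schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
    INSERT INTO employee VALUES (1, 'Asha', 'Sales'), (2, 'Ravi', 'HR'), (3, 'Kiran', 'Sales');
    CREATE VIEW employee_names AS SELECT id, name FROM employee;
    """
)

# Physical data independence: change the storage structures (add an index);
# the query text and its result stay the same.
query = "SELECT name FROM employee WHERE dept = 'Sales' ORDER BY name"
before_index = conn.execute(query).fetchall()
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")
after_index = conn.execute(query).fetchall()

# Logical data independence: change the conceptual schema (add a column);
# the external view and its users are not affected.
view_query = "SELECT * FROM employee_names ORDER BY id"
before_alter = conn.execute(view_query).fetchall()
conn.execute("ALTER TABLE employee ADD COLUMN phone TEXT")
after_alter = conn.execute(view_query).fetchall()

print(before_index == after_index, before_alter == after_alter)
```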
Database Users and Administrators
People who work with a database can be categorized as database users
or database administrators.
Database users
• Application Programmers - They are the developers who interact with the database by means of DML
queries. These DML queries are written in application programs in languages like C, C++, Java, Pascal, etc. These
queries are converted into object code to communicate with the database.
  • For example, writing a C program to generate a report of the employees who work in a particular
  department involves a query to fetch the data from the database. The C program will include an embedded
  SQL query.
• Sophisticated Users - They are database developers who write SQL queries to
select/insert/delete/update data. They do not use any application programs to query the database;
they interact with it directly by means of a query language like SQL. These users are
scientists, engineers, and analysts who study SQL and DBMS thoroughly to apply the concepts to their
requirements. In short, this category includes designers and developers of DBMS and SQL.
• Specialized Users - These are also sophisticated users, but they write specialized database application
programs. They are developers who build complex programs to meet particular requirements.
• Stand-alone Users - These users have a stand-alone database for their personal use. Such
databases come as ready-made database packages with menus and graphical interfaces.
• Naive Users - These are users who interact with the database through an existing application, for
example online library systems, ticket booking systems, ATMs, etc. The application already exists, and
users work through it to fulfil their requests.
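The application-programmer example above (a report of the employees in a particular department) might look like this when sketched in Python with the sqlite3 module rather than C with embedded SQL. The employee table, the sample rows, and the department_report function are hypothetical names used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
    INSERT INTO employee VALUES (1, 'Asha', 'Sales'), (2, 'Ravi', 'HR'), (3, 'Kiran', 'Sales');
    """
)


def department_report(conn, dept):
    """Return the names of the employees in the given department, sorted by name."""
    rows = conn.execute(
        # The SQL query lives inside the host-language program;
        # the department is passed as a bound parameter.
        "SELECT name FROM employee WHERE dept = ? ORDER BY name",
        (dept,),
    ).fetchall()
    return [name for (name,) in rows]


sales = department_report(conn, "Sales")
print(sales)
```

A naive user would never see this query; they would only use the application built on top of it.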
Database System Structure
• The query processor translates statements in a query language into
low-level instructions that the database manager understands. (It may also
attempt to find an equivalent but more efficient form.) The query
processor simplifies and facilitates access to data.
• The DDL interpreter interprets DDL statements and records the
definitions in the data dictionary.
• The DML compiler translates DML statements in a query language
into an evaluation plan consisting of low-level instructions that the
query evaluation engine understands.
• The DML compiler also performs query optimization; that is, it picks
the lowest-cost evaluation plan from among the alternatives.
• The query evaluation engine executes the low-level instructions generated by
the DML compiler.
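Many systems let you inspect the evaluation plan the query processor has chosen. As one concrete case, SQLite offers EXPLAIN QUERY PLAN, which can be called from Python's sqlite3 module. The table and index below are assumptions for the sketch, and the exact wording of the plan differs between SQLite versions, so no output is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
    CREATE INDEX idx_employee_dept ON employee (dept);
    """
)

# Ask SQLite to describe the plan it would use, without running the query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM employee WHERE dept = ?", ("Sales",)
).fetchall()

# The last column of each row is a human-readable description of one plan step.
plan_text = [row[-1] for row in plan]
for step in plan_text:
    print(step)
```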
Database System Structure
• Several data structures are required for physical system
implementation:
  • Data Files: store the database itself.
  • Data Dictionary: stores information about the structure of the
  database. It is used heavily. Great emphasis should be placed on
  developing a good design and efficient implementation of the
  dictionary.
  • Indices: provide fast access to data items holding particular values.
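In SQLite, the role of the data dictionary is played by the built-in sqlite_master table, which can be queried like any other table. The sketch below uses Python's sqlite3 module; the student table and its index are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
    CREATE INDEX idx_student_dept ON student (dept);
    """
)

# sqlite_master records every table and index, together with the table it belongs to.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY type, name"
).fetchall()
print(catalog)
```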
Storage Manager
• The storage manager is important because databases typically require
a large amount of storage space. It is therefore very important to use
storage efficiently and to minimize the movement of data to and from disk.
DBMS vs RDBMS
DBMS | RDBMS
Stores data as files. | Stores data in tabular form.
Data elements must be accessed individually. | Multiple data elements can be accessed at the same time.
No relationship between data. | Data is stored in the form of tables which are related to each other.
Normalization is not present. | Normalization is present.
Does not support distributed databases. | Supports distributed databases.
Stores data in either a navigational or hierarchical form. | Uses a tabular structure where the headers are the column names and the rows contain the corresponding values.
Deals with small quantities of data. | Deals with large amounts of data.
Data redundancy is common in this model. | Keys and indexes help prevent data redundancy.
Used by small organizations that deal with small amounts of data. | Used to handle large amounts of data.
Supports a single user. | Supports multiple users.
Data fetching is slower for large amounts of data. | Data fetching is fast because of the relational approach.
Data is subject to low security levels with regard to data manipulation. | Multiple levels of data security exist.
Low software and hardware requirements. | Higher software and hardware requirements.
Examples: XML, Windows Registry, etc. | Examples: MySQL, PostgreSQL, SQL Server, Oracle, Microsoft Access, etc.
Advantages of DBMS
• Control redundancy
• Consistency
• Integrity
• Security
• Concurrency control
• Backup & recovery
• Data standard
• More information
• Data sharing & conflict control
• Productivity & accessibility
• Economy of scale
• Maintenance