Introduction to Database System Concepts and Architecture

Databases and Database Users, Characteristics of the Database Approach,
Advantages of Using the DBMS Approach, When Not to Use a DBMS, Data
Models, Schemas, and Instances, Three-Schema Architecture and Data
Independence, Database Languages and Interfaces, The Database System
Environment.
Module I: Introduction to Database System Concepts and
Architecture

Databases and Database Users
A database is a structured collection of data stored and accessed electronically.
Database users include:
◦ Database administrators (DBAs) – manage the database system.
◦ Database designers – define the database structure.
◦ Application developers – write application programs to access data.
◦ End users – interact with the database through applications (casual, naive,
sophisticated).
Users interact via queries, transactions, or programmatic interfaces.

Characteristics of the Database Approach
•Data Abstraction: Separation between logical and physical views of data.
•Data Independence: Applications remain unaffected by changes in how the data is stored or structured.
•Multi-user Access Control: Supports concurrent access with data consistency.
•Minimized Data Redundancy: Centralized storage reduces duplicate data.
•Data Integrity and Security: Constraints and access permissions ensure quality
and protection.
•Backup and Recovery: Automatic tools for data recovery in case of failure.
•Centralized Control: Managed by a DBMS to ensure efficient use and
maintenance.

Advantages of Using the DBMS Approach
1. Efficient Data Management: Easy access, updates, and storage.
2. Reduced Redundancy and Inconsistency: Central control ensures consistency.
3. Improved Data Sharing: Allows multiple users to access the same data.
4. Better Security: Controlled user access through authentication.
5. Data Integrity Enforcement: Built-in constraints and validation rules.
6. Data Independence: Shields applications from physical changes in storage.
7. Backup and Recovery: Automatic mechanisms to prevent data loss.

When Not to Use a DBMS
•Simple Applications: Small, one-user apps might not need the complexity of a DBMS.
•Real-time Systems: Where extreme performance is needed and DBMS latency is not
acceptable.
•Embedded or Device Software: Lightweight file systems may be more suitable.
•Budget Constraints: When the cost of licensing and maintaining a DBMS is unjustifiable.
•Short-term Projects: Where setting up a DBMS adds unnecessary overhead.

Data Models, Schemas, and Instances
Data Models define how data is logically structured (e.g., relational, hierarchical,
object-oriented).
Schema: The database’s overall logical design (metadata).
◦ Defines tables, fields, data types, relationships.
Instance: The actual data in the database at a given time (snapshot).
The schema changes rarely, only when the database is redesigned; the instance changes with every
insert, update, or delete.
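The difference between schema and instance can be sketched with Python's built-in sqlite3 module. This is only an illustration: the `student` table, its columns, and the sample rows are hypothetical, and SQLite stands in for any relational DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Schema: the logical design, defined once with DDL.
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL, dept TEXT)"
)

def schema_columns(conn):
    """Return (column name, declared type) pairs for the student table."""
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

def instance(conn):
    """Return the current instance: every row stored right now."""
    return conn.execute("SELECT id, name, dept FROM student ORDER BY id").fetchall()

schema_before = schema_columns(conn)
instance_before = instance(conn)   # snapshot of an empty table

conn.execute("INSERT INTO student (name, dept) VALUES ('Asha', 'CSE')")
conn.execute("INSERT INTO student (name, dept) VALUES ('Ravi', 'ECE')")

schema_after = schema_columns(conn)
instance_after = instance(conn)    # snapshot with two rows

print(schema_before == schema_after)    # inserts do not touch the schema
print(instance_before, instance_after)  # the instance differs between snapshots
```

The two snapshots differ, while the column list reported by the DBMS is identical before and after the inserts.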

Three-Schema Architecture and Data Independence
•Internal Level: Describes physical storage of data.
•Conceptual Level: Describes the structure of the entire database for the community of
users.
•External Level: Describes user views and specific applications.
•Data Independence:
•Logical Data Independence: Can change conceptual schema without altering
external views.
•Physical Data Independence: Can change physical storage without altering the
conceptual schema.
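Logical data independence can be sketched with a SQL view acting as an external schema. The example below is a minimal illustration using Python's sqlite3 module; the `employee` table, the `employee_directory` view, and the data are hypothetical names chosen for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the base table shared by the whole community of users.
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.execute("INSERT INTO employee (name, salary) VALUES ('Asha', 50000), ('Ravi', 42000)")

# External level: a view exposing only what one user group needs.
conn.execute("CREATE VIEW employee_directory AS SELECT id, name FROM employee")

def directory(conn):
    """Query issued by an application that only knows the external view."""
    return conn.execute("SELECT id, name FROM employee_directory ORDER BY id").fetchall()

before = directory(conn)

# Change the conceptual schema: add a column to the base table.
conn.execute("ALTER TABLE employee ADD COLUMN dept TEXT")

after = directory(conn)
print(before == after)  # the application's query and its result are unchanged
```

The application keeps issuing the same query against the view; only the conceptual schema changed underneath it.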

Database Languages and Interfaces
•Data Definition Language (DDL): Used to define schemas (e.g., CREATE TABLE).
•Data Manipulation Language (DML): Used for querying and modifying data (e.g., SELECT,
INSERT).
•Data Control Language (DCL): Used for permissions (e.g., GRANT, REVOKE).
•Transaction Control Language (TCL): Manages transactions (COMMIT, ROLLBACK).
•Interfaces:
•Graphical User Interfaces (GUI).
•Web-based interfaces.
•Application Programming Interfaces (APIs) like JDBC, ODBC.
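The language categories above can be tried out through a programmatic interface. The sketch below uses Python's sqlite3 module as that interface; the `account` table and its values are hypothetical. DCL is not shown because SQLite does not implement GRANT and REVOKE; a server-based DBMS would be needed for that part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema.
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT, balance INTEGER)")

# DML: insert a row.
conn.execute("INSERT INTO account (owner, balance) VALUES ('Asha', 100)")
conn.commit()      # TCL: make the insert permanent

# DML: modify the row, then change our mind.
conn.execute("UPDATE account SET balance = balance - 500 WHERE owner = 'Asha'")
conn.rollback()    # TCL: undo the uncommitted update

# DML: query the data.
balance = conn.execute(
    "SELECT balance FROM account WHERE owner = 'Asha'"
).fetchone()[0]
print(balance)
```

Because the update was rolled back before it was committed, the query returns the balance stored by the committed insert.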

The Database System Environment
1. DBMS Software: Core component that manages data.
2. Database: Actual stored data.
3. Database Schema: Metadata and definitions.
4. Query Processor: Handles SQL parsing, optimization, and execution.
5. Storage Manager: Manages data storage and retrieval.
6. Concurrency Control Manager: Ensures transaction integrity in multi-user
environments.
7. Recovery Manager: Maintains data consistency after crashes or failures.
8. Database Users and Applications: Interact with the system through defined interfaces.

Database Systems: Significance and Advantages
Database systems play a crucial role in modern information management. Here are some of the key
significance and advantages of database systems:
i. Data Organization and Storage:
• Database systems provide a structured approach to organizing and storing large volumes of
data efficiently.
• They offer a logical and physical structure that allows data to be stored in a consistent and
organized manner, making it easier to retrieve and manage.
ii. Data Integration and Centralization:
• Databases enable the integration of data from multiple sources into a central repository.
• This centralization reduces data redundancy and inconsistency and facilitates data
sharing and collaboration across different departments or systems within an organization.

iii. Data Consistency and Integrity:
• Database systems enforce data consistency and integrity through various mechanisms such
as data constraints, referential integrity, and data validation rules.
• These mechanisms ensure that data remains accurate and reliable, preventing
inconsistencies and errors that can arise in manual or decentralized data management
approaches.
iv. Data Security and Access Control:
• Database systems offer robust security features to protect sensitive data. Access control
mechanisms can be implemented to prevent unauthorized access to the database, supporting data
privacy and compliance with regulations.
• Encryption techniques can be applied to safeguard data during transmission and storage.

v. Data Retrieval and Manipulation:
• Database systems provide powerful query languages, such as SQL, which enable users to
retrieve, update, and manipulate data effectively.
• These languages offer a standardized way to interact with the database, allowing users to
perform complex operations and retrieve specific subsets of data based on various criteria.
vi. Data Scalability and Performance:
• Database systems are designed to handle large-scale data and support concurrent access by
multiple users.
• They offer mechanisms for indexing, caching, and optimizing query execution, resulting in
efficient data retrieval and processing. This scalability and performance make database
systems suitable for handling ever-growing data volumes and demanding workloads.
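The consistency and integrity mechanisms described in point iii can be sketched as declared constraints that the DBMS checks on every insert. This is a minimal illustration with Python's sqlite3 module; the tables, columns, and rows are hypothetical. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)")
conn.execute("""
    CREATE TABLE employee (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        salary  INTEGER CHECK (salary >= 0),
        dept_id INTEGER NOT NULL REFERENCES department(id)
    )
""")
conn.execute("INSERT INTO department (id, name) VALUES (1, 'CSE')")

def try_insert(sql):
    """Run one INSERT; return True if accepted, False if a constraint rejected it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

valid = try_insert(
    "INSERT INTO employee (name, salary, dept_id) VALUES ('Asha', 50000, 1)")
negative_pay = try_insert(
    "INSERT INTO employee (name, salary, dept_id) VALUES ('Ravi', -10, 1)")
missing_dept = try_insert(
    "INSERT INTO employee (name, salary, dept_id) VALUES ('Meena', 100, 99)")

stored = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(valid, negative_pay, missing_dept, stored)
```

Only the first row is stored: the second violates the CHECK constraint and the third violates referential integrity, so the DBMS rejects both without any validation code in the application.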

Types of Databases
There are several types of databases, each designed to cater to specific data storage and management requirements.
Here are some common types:
i. Relational Databases (RDBMS):
• Relational databases store and manage data in a structured manner using tables, with relationships
established between them.
• They are based on the relational model and use SQL for data retrieval and manipulation. Examples include
MySQL, Oracle Database, and Microsoft SQL Server.
ii. NoSQL Databases:
• NoSQL (Not Only SQL) databases are designed to handle unstructured and semi-structured data.
• They offer flexible schema designs, horizontal scalability, and high performance. NoSQL databases are
suitable for handling large-scale data and are commonly used in web applications, IoT, and big data
environments. Examples include MongoDB, Cassandra, and Redis.

iii. Object-Oriented Databases (OODBMS):
• Object-oriented databases store and manage complex objects, including their attributes and behaviors.
• They allow for the direct representation of real-world objects and support inheritance and encapsulation.
iv. Hierarchical Databases:
• Hierarchical databases organize data in a tree-like structure, with parent-child relationships between records.
• They are suited for applications where data has a natural hierarchical organization, such as file systems or
organizational charts.
v. Network Databases:
• Network databases store data using a network model, which allows for more complex relationships between
records compared to hierarchical databases.
• Network databases were popular in the early days of computing and are still used in certain specialized
applications.

vi. Cloud Databases:
• Cloud databases are hosted on cloud platforms and provide scalability, high availability, and
accessibility from anywhere.
• They are designed to handle distributed data and support various database models, such as
relational, NoSQL, and NewSQL databases. Examples include Amazon RDS, Google Cloud
Spanner, and Microsoft Azure Cosmos DB.
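To make the contrast between the relational and hierarchical models concrete, the sketch below represents the same small data set both ways in plain Python. It is only an analogy under stated assumptions: the department and employee names are hypothetical, and lists and dictionaries stand in for real storage structures.

```python
# Relational model: flat tables (lists of rows) linked by key values.
departments = [(1, "CSE"), (2, "ECE")]                             # (id, name)
employees = [(10, "Asha", 1), (11, "Ravi", 1), (12, "Meena", 2)]   # (id, name, dept_id)

def employees_in(dept_name):
    """Match rows of the two tables on dept_id, as a relational join would."""
    dept_ids = {d_id for d_id, name in departments if name == dept_name}
    return [name for _, name, d_id in employees if d_id in dept_ids]

# Hierarchical model: each child record hangs under exactly one parent record.
org_tree = {
    "CSE": {"employees": ["Asha", "Ravi"]},
    "ECE": {"employees": ["Meena"]},
}

def employees_under(dept_name):
    """Follow the parent-to-child path; no key matching is needed."""
    return org_tree[dept_name]["employees"]

print(employees_in("CSE"), employees_under("CSE"))
```

In the relational form, relationships are expressed through matching key values and can be combined in any direction; in the hierarchical form, access follows the fixed parent-child path.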

Limitations of File Processing Systems
File processing systems, which were prevalent before the advent of database systems, have several
limitations that prompted the development and adoption of modern database management systems. Here
are some of the main limitations of file processing systems:
i. Data Redundancy and Inconsistency:
• In file processing systems, data is often duplicated across multiple files. This redundancy leads to
wasted storage space and increases the chances of data inconsistencies.
• If data is updated in one file but not in others, it can result in conflicting and inaccurate information.
ii. Data Isolation:
• In file processing systems, data is typically stored in separate files with little or no integration.
• It becomes challenging to access and retrieve data that is spread across multiple files, leading to data
isolation and difficulty in obtaining a comprehensive view of the data.

iii. Data Dependence and Program-Data Coupling:
• File processing systems have a high degree of dependence between programs and data files.
• Each program has to be designed to understand the structure and format of the files it interacts
with.
iv. Lack of Data Integrity and Security:
• File processing systems lack built-in mechanisms to enforce data integrity constraints and security
measures.
• It is up to individual programs to ensure data integrity and handle security concerns. This
decentralized approach makes it difficult to maintain consistent data across the system and secure
sensitive information effectively.

v. Lack of Data Independence:
• In file processing systems, changes to the structure or format of data files impact the programs
that access those files.
• This lack of data independence makes it challenging to modify or extend the system without
disrupting existing programs and requiring extensive code modifications.
vi. Poor Data Sharing and Concurrency Control:
• File processing systems often struggle with concurrent access and sharing of data.
• When multiple users or programs attempt to access and update the same file simultaneously, it
can result in data inconsistencies and conflicts. Ensuring proper concurrency control and data
sharing becomes complex and error-prone.

vii. Limited Data Querying and Reporting Capabilities:
• File processing systems lack standardized query languages and sophisticated querying capabilities.
• Retrieving specific subsets of data or generating complex reports requires writing custom programs, which
can be time-consuming and require deep knowledge of file structures and access methods.
viii. Difficulty in Data Backup and Recovery:
• File processing systems lack built-in mechanisms for easy data backup and recovery.
• Ensuring data durability and implementing efficient backup and recovery procedures is often a manual and
error-prone process.
ix. Lack of Scalability:
• File processing systems are not designed to handle large volumes of data efficiently.
• As the amount of data grows, performance issues arise, and it becomes challenging to scale the system to
accommodate increasing data and user demands.
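The redundancy and inconsistency problem from point i can be sketched in a few lines of Python. The two CSV "files", the customer, and the addresses are hypothetical; in-memory strings stand in for files on disk so the example stays self-contained.

```python
import csv
import io

# Two separate "files", each keeping its own copy of the customer's address.
billing_file = "customer,address\nAsha,12 Lake Road\n"
shipping_file = "customer,address\nAsha,12 Lake Road\n"

def read_addresses(text):
    """Parse one CSV file into a {customer: address} dictionary."""
    return {row["customer"]: row["address"] for row in csv.DictReader(io.StringIO(text))}

# The billing program updates its own file; the shipping file is not touched.
billing_file = billing_file.replace("12 Lake Road", "7 Hill Street")

billing = read_addresses(billing_file)
shipping = read_addresses(shipping_file)
inconsistent = billing["Asha"] != shipping["Asha"]
print(billing["Asha"], "|", shipping["Asha"], "|", inconsistent)
```

Each program also has to know the column layout of the file it reads, which is the program-data dependence described in point iii.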

Database Management System Environment
The DBMS (Database Management System) environment refers to the set of components, tools, and technologies that are
involved in managing and operating a database system. It encompasses both the software and hardware infrastructure required to
support the database and its operations. Here are some key components of the DBMS environment:
i. Database Server:
• The database server is the core component of the DBMS environment. It is responsible for storing, managing, and
retrieving data from the database.
• The server manages the physical storage of data, enforces data integrity rules, handles concurrency control, and provides
mechanisms for data access and manipulation.
ii. Database Engine:
• The database engine is the software component within the database server that handles the low-level operations of data
storage and retrieval.
• It interprets and executes queries, manages transactions, implements security mechanisms, and performs optimization tasks
to enhance query performance.

iii. Data Storage:
• The DBMS environment includes the storage infrastructure for the database. This can be disk-based storage or
more advanced options like solid-state drives (SSDs) or in-memory storage. The storage system provides the
physical space to store the database files and ensures data durability, reliability, and availability.
iv. Database Schema:
• The database schema defines the structure and organization of the database. It includes tables, columns,
relationships, constraints, and indexes.
• The schema acts as a blueprint for how data is stored and accessed within the database. It provides a logical
view of the data and ensures data consistency and integrity.
v. Query Language:
• A DBMS typically supports a query language that allows users and applications to interact with the database.
Structured Query Language (SQL) is the most commonly used query language in relational database systems.
• It provides a standardized syntax and commands for querying, inserting, updating, and deleting data.

vi. Database Administration Tools:
• The DBMS environment includes various tools for database administration tasks. These tools assist
database administrators in managing the database system, monitoring performance, configuring
security settings, and performing backups and recovery operations.
• Examples of such tools include graphical user interfaces (GUIs), command-line interfaces (CLIs),
and monitoring utilities.
vii. Application Interfaces:
• DBMS environments offer interfaces for applications to interact with the database.
• These interfaces provide programming APIs (Application Programming Interfaces) and drivers
that enable developers to connect their applications to the database and perform operations like
data retrieval, modification, and transaction management.

viii. Security and Access Controls:
• The DBMS environment includes mechanisms for securing the database and controlling access to
data.
• It allows administrators to define user roles and privileges, enforce authentication and authorization
policies, and implement encryption and auditing features to protect sensitive data.
ix. Backup and Recovery:
• A robust DBMS environment provides mechanisms for data backup and recovery.
• It includes tools and processes to create regular backups of the database, as well as methods to
restore data in case of system failures, data corruption, or human errors.

x. Performance Optimization:
• The DBMS environment offers features for optimizing the performance of the database system.
• This includes query optimization techniques, indexing strategies, caching mechanisms, and
performance monitoring tools to identify and resolve performance bottlenecks.

Data Abstraction
• A database system contains intricate data structures and relationships.
• Data abstraction hides this complexity so that users can comfortably access the database and see only
the data they want.
• The main purpose of data abstraction is to hide irrelevant detail and provide an abstract view of the data.
Developers use it to keep irrelevant data away from users and present only the relevant data.
• By doing this, users can access the data without any hassle, and the system can also work efficiently.
• In a DBMS, data abstraction is performed in layers, called the levels of data abstraction, which are
described next.
• The database management system is designed around these levels.

Levels of Data Abstractions in DBMS
In DBMS, there are three levels of data abstraction, which are as follows:
1. Physical or Internal Level:
• The physical or internal level is the lowest level of data abstraction in the database management system.
• It defines how data is actually stored in the database and the methods used to access that data.
• It describes complex data structures in detail, so it is difficult to understand, which is why it is kept
hidden from the end user.
• Database administrators (DBAs) decide how to arrange the data and where to store it.
• The DBA is the person who manages the data at the physical or internal level. At this level, the raw data
is stored in detail on storage devices such as hard drives, for example in a secure data center.

2. Logical or Conceptual Level:
• The logical or conceptual level is the intermediate level of data abstraction. It describes what
data is stored in the database and the relationships among that data.
• It describes the structure of the entire database, for example in the form of tables. The logical or
conceptual level is less complex than the physical level. At this level, database administrators
(DBAs) work with an abstraction of the raw data held at the physical level.
3. View or External Level:
• The view or external level is the highest level of data abstraction.
• Different views at this level each describe a part of the overall database. This level is for
end-user interaction; end users access the data through these views with their queries.

Advantages of data abstraction in DBMS
• Users can easily access the data based on their queries.
• It provides security to the data stored in the database.
• Database systems work efficiently because of data abstraction.

Data Independence
Data independence can be explained using the three-schema architecture.
Data independence is the ability to modify the schema at one level of the database system
without altering the schema at the next higher level.
There are two types of data independence:
1. Logical Data Independence
• Logical data independence is the ability to change the conceptual schema without having
to change the external schemas.
• Logical data independence separates the external level from the conceptual level.
• If we make changes to the conceptual view of the data, the user view of the data is not
affected.
• Logical data independence occurs at the user interface level, that is, in the mapping between
the external and conceptual levels.

2. Physical Data Independence
• Physical data independence is the capacity to change the internal schema without having to
change the conceptual schema.
• If we change the storage of the database system, for example its storage devices, file organization,
or indexes, the conceptual structure of the database is not affected.
• Physical data independence separates the conceptual level from the internal level.
• Physical data independence occurs at the logical interface level, that is, in the mapping between
the conceptual and internal levels.
Fig: Data Independence
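Physical data independence can be sketched by changing only the internal level, here by adding an index, and checking that the same query still returns the same answer. This is a minimal illustration with Python's sqlite3 module; the `employee` table, the index name `idx_employee_dept`, and the rows are hypothetical, and the exact wording of SQLite's query-plan text varies between versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employee (name, dept) VALUES (?, ?)",
    [("Asha", "CSE"), ("Ravi", "ECE"), ("Meena", "CSE")],
)

QUERY = "SELECT name FROM employee WHERE dept = 'CSE' ORDER BY name"

def run_query():
    """The application's query; it never mentions how the data is stored."""
    return [row[0] for row in conn.execute(QUERY)]

def access_plan():
    """Return SQLite's own description of how it would execute QUERY."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]

result_before = run_query()
plan_before = access_plan()

# Change the internal level only: add an index on the dept column.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")

result_after = run_query()
plan_after = access_plan()

print(result_before == result_after)  # same answer from the same query
print(plan_before, plan_after)        # the access path is free to differ
```

The table definition and the query text are untouched; only the way the DBMS reaches the rows is allowed to change.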

DBMS Architecture
• The design of a DBMS depends on its architecture. The basic client/server architecture is used to deal with a
large number of PCs, web servers, database servers, and other components that are connected by
networks.
• A client/server architecture consists of many PCs and workstations that are connected via the
network.
• The DBMS architecture depends on how users are connected to the database to get their requests done.
Types of DBMS Architecture
Database architecture can be seen as single-tier or multi-tier. Logically, multi-tier database architecture is of
two types: 2-tier architecture and 3-tier architecture.

1-Tier Architecture
• In this architecture, the database is directly available to the user, who works on the DBMS itself.
• Any changes made here are applied directly to the database. It does not provide a handy tool for end
users.
• The 1-tier architecture is used for developing local applications, where programmers can communicate
directly with the database for a quick response.
2-Tier Architecture
• The 2-tier architecture is the same as the basic client-server model. Applications on the client end
communicate directly with the database on the server side. APIs such as ODBC and JDBC are used for
this interaction.
• The user interfaces and application programs run on the client side.
• The server side is responsible for providing functionality such as query processing and transaction
management.
• To communicate with the DBMS, the client-side application establishes a connection with the server side.

3-Tier Architecture
• The 3-tier architecture contains another layer between the client and the server. In this architecture,
the client cannot communicate directly with the server.
• The application on the client end interacts with an application server, which in turn
communicates with the database system.
• The end user has no idea about the existence of the database beyond the application server. The
database also has no idea about any user beyond the application.
• The 3-tier architecture is used for large web applications.
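The separation of the three tiers can be sketched in a single Python file, with each tier reduced to a function or object. This is a toy model under stated assumptions: in a real deployment the tiers are separate processes or machines talking over a network, and the `product` table, `get_price`, and `client_show_price` are hypothetical names for this sketch.

```python
import sqlite3

# Tier 3: database server (an in-memory SQLite database stands in for it).
_db = sqlite3.connect(":memory:")
_db.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price INTEGER)")
_db.execute("INSERT INTO product (name, price) VALUES ('Pen', 10), ('Book', 120)")

# Tier 2: application server. Only this layer issues SQL.
def get_price(product_name):
    """Business logic: look up a price, or return None for unknown products."""
    row = _db.execute(
        "SELECT price FROM product WHERE name = ?", (product_name,)
    ).fetchone()
    return row[0] if row else None

# Tier 1: client. It calls the application layer and never sees SQL or tables.
def client_show_price(product_name):
    price = get_price(product_name)
    if price is None:
        return f"{product_name}: not available"
    return f"{product_name}: {price}"

print(client_show_price("Pen"))
print(client_show_price("Ink"))
```

Because the client depends only on `get_price`, the database behind the application layer could be replaced without changing the client code.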

Functions of DBMS
A database management system (DBMS) serves as a software application that enables users to manage,
manipulate, and organize data stored in a database. It provides several functions and services to ensure
efficient data management and reliable data access. Here are the key functions of a DBMS:
i. Data Definition: DBMS allows users to define the database schema, which includes creating tables,
specifying attributes, defining relationships, and establishing constraints. It provides a data definition
language (DDL) to facilitate the creation, modification, and deletion of the database structure.
ii. Data Manipulation: DBMS enables users to manipulate the data stored in the database. It provides a
data manipulation language (DML) that includes commands for inserting, updating, deleting, and
querying data. Users can perform operations like selecting specific data, filtering, sorting, and
aggregating results.

iii. Data Storage and Retrieval: DBMS manages the storage of data in the database, handling aspects like data
organization, indexing, and storage optimization. It provides efficient mechanisms to retrieve data based on
user queries, utilizing query optimization techniques to improve performance.
iv. Data Integrity and Constraint Enforcement: DBMS ensures data integrity by enforcing constraints on the
database. It allows users to define constraints such as primary keys, foreign keys, unique constraints, and
check constraints to maintain the consistency and correctness of the data.
v. Data Security and Access Control: DBMS implements security mechanisms to protect data from
unauthorized access and ensure data confidentiality and privacy. It allows users to define access control
policies, manage user permissions and roles, and encrypt sensitive data to safeguard against unauthorized
usage.

vi. Transaction Management: DBMS manages database transactions, which are sets of operations
performed on the database as a single logical unit. It ensures the ACID properties (Atomicity,
Consistency, Isolation, Durability) of transactions, guaranteeing that they execute reliably even
with concurrent users and system failures.
vii. Concurrency Control: DBMS handles concurrent access to the database by multiple users or
applications. It employs concurrency control mechanisms like locking, timestamping, or multiversion
concurrency control to prevent conflicts and maintain data consistency.
viii. Backup and Recovery: DBMS provides features for backup and recovery, allowing users to create
backup copies of the database to protect against data loss. It supports mechanisms for data
restoration in case of system failures, crashes, or human errors.

ix. Data Independence: DBMS promotes data independence, allowing changes to the database schema
without affecting the applications or programs that use the data. It enables modifications to the logical
and physical structure of the data while providing a consistent view to the users.
x. Data Dictionary Management: DBMS maintains a data dictionary or metadata repository that stores
information about the database structure, schema, constraints, and other details. It provides tools and
interfaces to manage and query the data dictionary, facilitating database administration and schema
evolution.
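The transaction management function (point vi) can be sketched with a money transfer that must either complete fully or not at all. This is a minimal illustration of atomicity using Python's sqlite3 module; the `account` table, the `transfer` helper, and the balances are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (owner TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('Asha', 100), ('Ravi', 50)")
conn.commit()

def transfer(src, dst, amount):
    """Move amount from src to dst as one unit; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE owner = ?", (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE owner = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return dict(conn.execute("SELECT owner, balance FROM account"))

ok = transfer("Asha", "Ravi", 30)        # valid transfer
failed = transfer("Asha", "Ravi", 500)   # would overdraw Asha, so both updates are undone
print(ok, failed, balances())
```

In the failed transfer, the credit to the destination account has already been applied when the debit violates the CHECK constraint; the rollback removes that credit as well, so no partial transfer is left behind.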

Summary
"Introduction to Database Systems" is a course that provides an overview of the
fundamental concepts and principles related to databases. It covers topics such as data
models, relational database management systems, entity-relationship modeling, SQL
querying, database design, indexing, query optimization, transaction management, and
database security. The course aims to familiarize students with the basics of how data is
organized, stored, and accessed in structured databases.


Expected Questions
BAQ
Q.1 What is the difference between the relational model and the hierarchical model in
database systems?
SAQ
Q.2 Describe the ACID properties in the context of transaction management and explain why
they are important in ensuring reliable and consistent database operations.
LAQ
Q.3 What are the different types of Databases?
