The document provides an introduction to advanced database systems, including basic definitions such as database, data, information, metadata, and database management system. It describes the typical components and functionality of a database management system, including data models, data manipulation languages, data definition languages, storage management, transaction management, and more. The document also discusses database users, advantages of databases, SQL, the query processor, and database architecture types.
2. Basic Definitions
Database: Organized collection of logically
related data.
Data: Known facts that can be recorded and
have an implicit meaning.
Structured: numbers, text, dates
Unstructured: images, video, documents
Information: data processed to increase
knowledge in the person using the data
Metadata: data that describes the properties
and context of user data.
Monday, July 17, 2023
Advanced Database Systems 2
3. Basic Definitions
Database Management System (DBMS): A
software package or system that facilitates the
creation and maintenance of a computerized
database.
Database System: The DBMS software
together with the data itself. Sometimes, the
applications are also included.
4. Why study database management?
Critical to business, government, science, culture, society
Determines success of many corporations (even their
existence)
Many tech companies built on data management (Google,
Amazon, Yahoo!, Facebook, …)
Other companies offer database products (Microsoft, IBM,
Oracle)
Database systems span major areas of computer science
Operating systems (file, memory, process management)
Theory (languages, algorithms, complexity)
Artificial Intelligence (knowledge-based systems, logic, search)
Software Engineering (application development)
Data structures (trees, hash tables)
5. Regularly Structured Data
Sets the structure once (e.g., table attributes) and then
has many instances (records) that use that structure
Examples of regularly structured data
Employee, payroll, and bank account records; data captured on web
forms
DBMS or RDBMS mainly designed to store, manage,
and retrieve structured data
Examples of unstructured (loosely or “semistructured”)
data
Documents, video, audio, images, maps, …
Managed mainly by content management and
information retrieval systems
6. File Management Systems
A file is an uninterpreted, unstructured collection
of information
File operations: delete, catalog, create, rename,
open, close, read, write, find, …
Access methods: Algorithms to implement
operations along with internal file organization
Examples: File of Customers, File of Students;
Access method: implementation of a set of
operations on a file of students or customers.
7. File Management System Problems
Data redundancy and inconsistency
Multiple file formats, duplication of information in
different files
Difficulty in accessing data
Need to write a new program to carry out each new task
Data isolation — multiple files and formats
Integrity problems
Integrity constraints (e.g., account balance > 0) become
“buried” in program code rather than being stated
explicitly
Hard to add new constraints or change existing ones
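As a minimal sketch of the alternative, the constraint can be stated once in the schema so that the DBMS enforces it for every program. The example uses Python's built-in sqlite3 module; the account table and the try_set_balance helper are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical schema: the rule "balance > 0" is declared in the table
# definition instead of being repeated inside each application program.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "  id INTEGER PRIMARY KEY,"
    "  balance NUMERIC NOT NULL CHECK (balance > 0)"
    ")"
)
conn.execute("INSERT INTO account (id, balance) VALUES (1, 100)")
conn.commit()


def try_set_balance(account_id, new_balance):
    """Return True if the update is accepted, False if the DBMS rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE account SET balance = ? WHERE id = ?",
                (new_balance, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised by SQLite when the CHECK constraint is violated.
        return False


accepted = try_set_balance(1, 40)    # satisfies the constraint
rejected = try_set_balance(1, -10)   # violates the constraint
final_balance = conn.execute(
    "SELECT balance FROM account WHERE id = 1"
).fetchone()[0]
```

Changing the rule then means changing one table definition rather than hunting through program code.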
8. File Management System Problems
Atomicity of updates
Failures may leave database in an inconsistent state with
partial updates carried out
Example: Transfer of funds from one account to another
should either complete or not happen at all
Concurrent access needed for performance
Uncontrolled concurrent accesses can lead to inconsistencies
Example: Two people reading a balance (say 100) and
updating it by withdrawing money (say 50 each) at the same
time
Security problems
Hard to provide user access to some, but not all, data
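The funds-transfer example can be sketched with a database transaction, which is what a plain file system lacks. This is an illustration only, using Python's sqlite3 module; the account table, the non-negative balance rule, and the transfer function are assumptions made for the example. The credit is applied before the debit on purpose, so that a failed debit leaves a partial update for the rollback to undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "  id TEXT PRIMARY KEY,"
    "  balance INTEGER NOT NULL CHECK (balance >= 0)"  # assumed rule
    ")"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 20)])
conn.commit()


def transfer(src, dst, amount):
    """Move amount from src to dst; either both updates happen or neither."""
    try:
        with conn:  # one transaction: commit on success, rollback on error
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would overdraw src; the credit to dst is rolled back too.
        return False


def balances():
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


ok = transfer("A", "B", 50)       # succeeds: A has enough funds
failed = transfer("A", "B", 80)   # fails: A holds only 50 at this point
```

After the failed call, neither account has changed, even though the first of its two updates had already been executed.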
9. Problems with Data Redundancy
Waste of space to have duplicate data
Causes more maintenance headaches
The biggest problem:
Data changes in one file could cause
inconsistencies
Compromises in data integrity
10. Solution: The Database Approach
Central repository of shared data
Data is managed by a controlling agent
Stored in a standardized, convenient form
Requires a Database Management System (DBMS)
[Figure: the Order Filing, Invoicing, and Payroll systems all access a central database through the DBMS. The central database contains employee, order, inventory, pricing, and customer data.]
11. Database Management System (DBMS)
DBMS contains information about a particular enterprise
Collection of interrelated data
Set of programs to access the data
An environment that is both convenient and efficient to use
Database Applications:
Banking: transactions
Airlines: reservations, schedules
Universities: registration, grades
Sales: customers, products, purchases
Online retailers: order tracking, customized recommendations
Manufacturing: production, inventory, orders, supply chain
Human resources: employee records, salaries, tax deductions
12. Typical DBMS Functionality
Defining a database: specifying its data types,
structures, and constraints
Constructing or loading the database on a secondary
storage medium
Manipulating the database: querying,
generating reports, and inserting, deleting, and
modifying its content
Concurrent processing and sharing by a set of
users and programs, while keeping all data valid
and consistent
13. Advantages of Using Databases
Controlling redundancy in data storage and in
development and maintenance efforts.
Sharing of data among multiple users.
Restricting unauthorized access to data.
Providing persistent storage for program
objects (in object-oriented DBMSs).
Providing storage structures for efficient query
processing.
14. Advantages of Using Databases
Providing backup and recovery services.
Providing multiple interfaces to different classes
of users.
Representing complex relationships among
data.
Enforcing integrity constraints on the database.
Drawing inferences and triggering actions using rules.
15. Database Users
End users
Use the database system to achieve some goal. They query
the data, produce reports, and may update the database
content.
Application developers
Write software to allow end users to interface with the
database system
Database systems programmers
Write the database software itself
Database designers
Define the content, the structure, the constraints, and the
functions or transactions against the database.
They must communicate with the end users and understand
their needs.
16. DB Users: Database Administrator
Coordinates all the activities of the database system;
has a good understanding of the enterprise’s
information resources and needs.
Database administrator's duties include:
Schema definition
Storage structure and access method definition
Schema and physical organization modification
Granting user authority to access the database
Specifying integrity constraints
Acting as liaison with users
Monitoring performance and responding to changes in
requirements
17. Data Models
A collection of tools for describing
Data
Data relationships
Data semantics
Data constraints
Relational model: Information is stored as tuples or
records in relations or tables
Entity-Relationship data model (mainly for database
design)
Object-based data models (Object-oriented and
Object-relational)
Semi-structured data model (XML)
18. Data Manipulation Language (DML)
Language for accessing and manipulating the data
organized by the appropriate data model
DML is also known as a query language
Two classes of languages
Procedural – user specifies what data is required and
how to get the data
Declarative (nonprocedural) – user specifies what data
is required without specifying how to get the data
SQL is the most widely used query language
SQL Statements: SELECT, INSERT, UPDATE,
DELETE, MERGE
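The sketch below runs the basic DML statements through Python's sqlite3 module and contrasts the two classes of languages: a declarative SELECT that states only what is wanted, and a procedural loop that also spells out how to find it. The instructor table layout and the sample rows are invented for illustration. MERGE is left out because SQLite does not implement it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary INTEGER)"
)

# INSERT: add rows (sample data, not taken from any real system).
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    [
        ("10101", "Srinivasan", "Comp. Sci.", 65000),
        ("22222", "Einstein", "Physics", 95000),
        ("33456", "Gold", "Physics", 87000),
    ],
)

# UPDATE: change existing rows.
conn.execute("UPDATE instructor SET salary = salary + 1000 WHERE dept_name = 'Physics'")

# DELETE: remove rows.
conn.execute("DELETE FROM instructor WHERE ID = '10101'")

# Declarative: say WHAT is required; the DBMS decides how to get it.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM instructor WHERE salary > 90000 ORDER BY name"
    )
]

# Procedural: say WHAT is required and HOW to get it, step by step.
procedural = []
for _id, name, _dept, salary in conn.execute("SELECT * FROM instructor"):
    if salary > 90000:
        procedural.append(name)
procedural.sort()

row_count = conn.execute("SELECT COUNT(*) FROM instructor").fetchone()[0]
```

Both approaches return the same names; only the declarative form leaves the access strategy to the DBMS.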
19. Data Definition Language (DDL)
Specification notation for defining the database schema
Example: create table instructor (
ID char(5),
name varchar(20),
dept_name varchar(20),
salary numeric(8,2))
SQL Statements include:
CREATE
ALTER
DROP
RENAME
TRUNCATE
COMMENT
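A short sketch of several of these statements, run through Python's sqlite3 module. SQLite is only a convenient stand-in here: it has no TRUNCATE or COMMENT statement, and renaming is done with ALTER TABLE ... RENAME TO rather than a separate RENAME statement. The email column and the faculty table name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE: define the schema of a new table.
conn.execute(
    """
    CREATE TABLE instructor (
        ID        char(5),
        name      varchar(20),
        dept_name varchar(20),
        salary    numeric(8,2)
    )
    """
)

# ALTER: change the schema of an existing table.
conn.execute("ALTER TABLE instructor ADD COLUMN email varchar(50)")
conn.execute("ALTER TABLE instructor RENAME TO faculty")


def columns(table):
    # PRAGMA table_info is SQLite-specific; field 1 of each row is the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


faculty_columns = columns("faculty")

# DROP: remove the table and its definition.
conn.execute("DROP TABLE faculty")
remaining_tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]
```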
20. Data Definition Language (DDL)
DDL compiler generates a set of table templates
stored in a data dictionary
Data dictionary contains metadata (i.e., data about
data). It stores information about the database
itself
The dictionary holds
Descriptions of database objects (tables, users,
rules, views, indexes,…)
Information about who is using which data
(locks)
Schemas and mappings
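To make the idea of a data dictionary concrete, the sketch below queries SQLite's own catalog table, sqlite_master, through Python's sqlite3 module. Other systems expose their dictionaries differently, so this is one illustration rather than a general interface; the index and view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor ("
    "  ID char(5), name varchar(20), dept_name varchar(20), salary numeric(8,2))"
)
conn.execute("CREATE INDEX idx_instructor_dept ON instructor (dept_name)")
conn.execute(
    "CREATE VIEW physics_instructor AS "
    "SELECT ID, name FROM instructor WHERE dept_name = 'Physics'"
)

# The catalog is itself queried with SQL: it holds data about the data.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# The stored definition (the DDL text) of one object.
instructor_ddl = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'instructor'"
).fetchone()[0]
```

Here catalog lists each object with its kind (table, index, or view), and instructor_ddl holds the CREATE TABLE text that the system recorded.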
21. SQL
SQL: widely used non-procedural language
Example: Find the name of the instructor with ID 22222
select name
from instructor
where instructor.ID = '22222'
Example: Find the ID and building of instructors in the Physics
dept.
select instructor.ID, department.building
from instructor, department
where instructor.dept_name = department.dept_name and
department.dept_name = 'Physics'
Application programs generally access databases through one of:
Language extensions that allow embedded SQL
Application program interfaces (e.g., ODBC/JDBC), which allow
SQL queries to be sent to a database
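As a sketch of the application-program-interface route, the Physics query can be sent from a program through Python's sqlite3 module, which plays a role similar to ODBC or JDBC in this example. The department table layout, the sample rows, and the instructors_in function are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor ("
    "  ID char(5), name varchar(20), dept_name varchar(20), salary numeric(8,2))"
)
conn.execute("CREATE TABLE department (dept_name varchar(20), building varchar(15))")

# Sample rows invented for the example.
conn.executemany(
    "INSERT INTO department VALUES (?, ?)",
    [("Physics", "Watson"), ("Music", "Packard")],
)
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    [
        ("22222", "Einstein", "Physics", 95000),
        ("33456", "Gold", "Physics", 87000),
        ("15151", "Mozart", "Music", 40000),
    ],
)


def instructors_in(dept_name):
    """Return (ID, building) pairs for instructors in the given department."""
    # The ? placeholder passes the value separately from the SQL text,
    # which avoids building queries by string concatenation.
    cursor = conn.execute(
        "SELECT instructor.ID, department.building "
        "FROM instructor, department "
        "WHERE instructor.dept_name = department.dept_name "
        "AND department.dept_name = ? "
        "ORDER BY instructor.ID",
        (dept_name,),
    )
    return cursor.fetchall()


physics = instructors_in("Physics")
```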
22. Query Processor
Compiler – verifies whether a program or
query is written in accordance with DDL and
DML rules
Optimizer – Finds the most efficient way to
access the required data and supply it in the
form the user requested. Monitors query
execution and modifies the query evaluation
plan if necessary
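One way to watch an optimizer choose an access path is SQLite's EXPLAIN QUERY PLAN, called here through Python's sqlite3 module. This is a sketch only: the exact wording of the plan text differs between SQLite versions, and the index name and plan helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor ("
    "  ID char(5), name varchar(20), dept_name varchar(20), salary numeric(8,2))"
)

QUERY = "SELECT name FROM instructor WHERE dept_name = 'Physics'"


def plan(sql):
    """Return the optimizer's chosen plan as a single line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last field of each row is the human-readable description.
    return " | ".join(row[-1] for row in rows)


# Without an index, the only option is to scan the whole table.
plan_before = plan(QUERY)

# Once an index on dept_name exists, the optimizer can use it instead.
conn.execute("CREATE INDEX idx_instructor_dept ON instructor (dept_name)")
plan_after = plan(QUERY)

print(plan_before)
print(plan_after)
```

The query text is identical in both calls; only the available storage structures changed, and the optimizer picked a different plan accordingly.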
23. Transaction Management
A transaction is a collection of operations that
performs a single logical function in a database
application
Transaction-management component
ensures that the database remains in a consistent
(correct) state despite system failures (e.g., power
failures and operating system crashes) and
transaction failures.
Concurrency-control manager controls the
interaction among the concurrent transactions, to
ensure the consistency of the database.
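The need for concurrency control can be sketched with two withdrawals of 50 from a balance of 100. The first function replays an uncontrolled interleaving step by step; the second serializes the withdrawals with a lock. The Account class is hypothetical, and the lock is a deliberately simple stand-in, not a description of how a DBMS implements its concurrency-control manager.

```python
import threading


class Account:
    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def withdraw_locked(self, amount):
        """Read, check, and write as one indivisible step."""
        with self._lock:
            current = self.balance
            if current < amount:
                return False
            self.balance = current - amount
            return True


def uncontrolled_interleaving():
    """Both transactions read the balance before either one writes."""
    account = Account(100)
    seen_by_t1 = account.balance        # T1 reads 100
    seen_by_t2 = account.balance        # T2 reads 100
    account.balance = seen_by_t1 - 50   # T1 writes 50
    account.balance = seen_by_t2 - 50   # T2 overwrites with 50: T1's update is lost
    return account.balance


def controlled():
    """Two threads withdraw concurrently, but the lock serializes them."""
    account = Account(100)
    threads = [
        threading.Thread(target=account.withdraw_locked, args=(50,))
        for _ in range(2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance


lost_update_balance = uncontrolled_interleaving()
serialized_balance = controlled()
```

In the uncontrolled case 100 was withdrawn in total but the balance only dropped by 50; with the withdrawals serialized the balance reaches 0 as expected.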
24. Storage Management
Storage manager is a program module that
provides the interface between the low-level
data stored in the database and the application
programs and queries submitted to the system.
The storage manager is responsible for the
following tasks:
interaction with the file manager
efficient storing, retrieving and updating of
data.
25. File Manager
File Manager is responsible for mapping
logical database units (objects, relations,
etc.) into a set of low level files.
It is responsible for maintaining files and the
indexes on them. It should be able to create
and destroy indexes and reclaim unused
storage space to eliminate unneeded
gaps on disk.
26. Buffer Manager
Buffer Manager is responsible for the allocation
and maintenance of buffer space in memory to
facilitate processing of database data by several
concurrent applications.
Buffer Manager decides when to write data from
a buffer back to the database or discard it, and
under what conditions new data should be put
into a buffer
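A minimal sketch of these decisions, assuming a least-recently-used (LRU) replacement policy, which is one common choice and not something the slides specify. The BufferPool class, its method names, and the dictionary standing in for the disk are all hypothetical.

```python
from collections import OrderedDict


class BufferPool:
    """Fixed number of in-memory frames in front of a slower 'disk'."""

    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk              # dict: page_id -> contents (stand-in for disk)
        self.frames = OrderedDict()   # page_id -> [contents, dirty_flag]
        self.disk_reads = 0
        self.disk_writes = 0

    def _load(self, page_id):
        if page_id in self.frames:
            # Hit: mark the page as most recently used.
            self.frames.move_to_end(page_id)
        else:
            # Miss: make room if needed, then bring the page into a frame.
            if len(self.frames) >= self.capacity:
                self._evict()
            self.frames[page_id] = [self.disk[page_id], False]
            self.disk_reads += 1
        return self.frames[page_id]

    def _evict(self):
        # Remove the least recently used page.
        victim_id, (contents, dirty) = self.frames.popitem(last=False)
        if dirty:
            # Modified pages must be written back before being dropped.
            self.disk[victim_id] = contents
            self.disk_writes += 1
        # Clean pages are simply discarded.

    def read(self, page_id):
        return self._load(page_id)[0]

    def write(self, page_id, contents):
        frame = self._load(page_id)
        frame[0] = contents
        frame[1] = True  # dirty: memory now differs from disk


disk = {1: "a", 2: "b", 3: "c"}
pool = BufferPool(capacity=2, disk=disk)
pool.read(1)         # miss: page 1 loaded
pool.write(2, "B")   # miss: page 2 loaded, then modified in memory only
pool.read(1)         # hit: page 1 becomes most recently used
pool.read(3)         # miss: page 2 is evicted and written back to disk
```

The modified page reaches the disk only when it is evicted, which is the kind of timing decision the buffer manager controls.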
27. Database Architecture
The architecture of a database system is
greatly influenced by the underlying computer
system on which the database is running:
Centralized
Client-server
Parallel (multi-processor)
Distributed