2. Course Details
Textbook
◦ Database Systems: Concepts, Design and Applications by S. K.
Singh, latest edition 2011
Reference
◦ Database System by Catherine Ricardo
◦ Database Management Systems by Raghu Ramakrishnan and
Johannes Gehrke
◦ Database Systems: Design, Implementation, and
Management By Carlos Coronel, Steven Morris, Peter
Rob, 10th edition 2012
◦ An Introduction to Database Systems By Date, 2006
◦ Introduction To ORACLE: SQL and PL/SQL, Student
Guide, Production 1.1, Volume 1.2.
3. Course Outline
(subject to minor changes)
Basics: Introduction to Database Systems, The Entity Relationship
Model, The Relational Model, Relational Algebra and Calculus
SQL: Queries, Programming, Triggers, Query By Example (QBE)
Data Storage and Indexing: File Organization and Indexes, Tree
Structured Indexing, Hash Based Indexing
Query Evaluation: External Sorting, Evaluation Of Relational
Operators, Introduction to Query Optimization, A Typical Relational
Query Optimizer
Database Design: Schema Refinement and Normal Forms, Physical
Database Design and Tuning, Security
Transaction Management: Transaction Management Overview,
Concurrency Control, Crash Recovery
Advanced topics: Parallel and Distributed Databases, Internet
Databases, Decision Support, Data Mining, Object Database Systems,
Spatial Data Management, Deductive Databases
4. Introduction
In today’s competitive environment, effective data
management is one of the most critical business
objectives of an organization
The success of an organization now depends on its
ability to acquire accurate, reliable and timely data
about its business or operations for effective decision
making
A database system is a tool that simplifies the tasks of
managing data and extracting useful information, and
so helps analyze and guide the activities of an organization
It is the central repository of data in the organization's
information system: it maintains the data, supports
organizational functions and helps in decision making
5. What are Data?
Data may be defined as facts that can be recorded and
that have implicit meaning
Data are often viewed as the lowest level of abstraction
from which information and knowledge are derived.
Data can exist in a variety of forms -- as numbers or
text on pieces of paper, as bits and bytes stored in
electronic memory, or as facts stored in a person's mind.
Raw data refers to an unprocessed collection of numbers,
characters, images or other outputs from devices that
convert physical quantities into symbols.
6. What are Data?
Usually, there are many facts to describe something of
interest to us. (For example, employee data are used to
calculate payroll checks, send company greetings and inform
family in case of emergency.)
7. Data: Where can we find it?
Memories
Folders
Spreadsheets
Paper piles
Lists
Filing Cabinets
And many more …
8. Information
Information is processed, organized or summarized data
It may be defined as a collection of related data that,
when put together, communicates a meaningful and useful
message to the recipient, who uses it to make decisions
Data are processed to create information, which is
meaningful to the recipient
It can give warning signals before something
starts going wrong
It can help predict the future with a reasonable level of
accuracy and helps the organization make better decisions.
A database may contain data, information or
both.
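As a rough illustration of data being processed into information, the sketch below summarizes raw sales facts into totals per region. The sales figures and the summarize function are hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical raw data: one (region, amount) fact per sale.
raw_sales = [("North", 120), ("South", 80), ("North", 200), ("South", 40)]

def summarize(sales):
    """Process raw facts into a summary that a decision maker can read."""
    totals = defaultdict(int)
    for region, amount in sales:
        totals[region] += amount
    return dict(totals)

# The summary is "information": organized, summarized data.
information = summarize(raw_sales)
print(information)
```

The individual tuples say little on their own; the per-region totals are what a recipient would use to make a decision.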
13. Data Item or Field
A data item is the smallest unit of data that has meaning to
its user; it is also called a field or data element
It is an occurrence of the smallest unit of named data
It is represented in the database by a value
Example: name, telephone number, bill amount, address
and so on.
Data items are the building blocks of the database
A data item may be used to construct other, more complex
structures
14. Records
A record is a collection of logically related fields or
data items, with each field having a fixed data type
A record consists of values for each field
Data items are grouped together to form a record.
Records can be retrieved or updated using programs.
15. Files
A file is a sequence of related records
Fixed-length records: every record in the file has exactly
the same size
Variable-length records: different records in the file have
different sizes.
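The difference between the two record types can be sketched with Python's struct module. The layouts used here (a 10-byte name plus a 4-byte integer balance, and a length-prefixed name for the variable case) are assumptions made for illustration, not layouts taken from the course material.

```python
import struct

# Assumed fixed-length layout: 10-byte name followed by a 4-byte integer.
FIXED = struct.Struct("<10si")

def pack_fixed(name, balance):
    # struct pads short names with NUL bytes and truncates longer ones,
    # so every record occupies exactly FIXED.size bytes.
    return FIXED.pack(name.encode("ascii"), balance)

def pack_variable(name, balance):
    # Assumed variable-length layout: 1-byte length prefix, the name
    # itself, then the 4-byte integer. Size depends on the name.
    raw = name.encode("ascii")
    return struct.pack("<B", len(raw)) + raw + struct.pack("<i", balance)

names = ("Al", "Bartholomew")
fixed_sizes = [len(pack_fixed(n, 100)) for n in names]
variable_sizes = [len(pack_variable(n, 100)) for n in names]
print(fixed_sizes, variable_sizes)
```

With fixed-length records the position of record i is simply i times the record size, at the cost of padding or truncation. Variable-length records waste no space but need a length prefix or delimiter to find where each record ends.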
17. File Oriented Systems
Computer-based filing systems were initially used for
scientific and engineering calculations
Manual method of filing in an organization
◦ Holds internal and external correspondence relating to a project
◦ Files and folders were labeled and stored in cabinets under
lock for safety and security reasons
◦ Difficult to search for a specific entry in a specific file or folder
◦ Works well for small amounts of data
◦ Report generation from a manual file system could be slow
and cumbersome
18. File Oriented System
A file system is a method of storing and organizing
computer files and their data.
Basically, it organizes these files into a database for
storage, organization, manipulation and retrieval
by the computer's operating system.
Because file systems perform normal record-keeping functions,
they are also called data processing systems
File systems are used on data storage devices such as
hard disks or CD-ROMs to maintain the physical
location of the files.
22. File Systems
ASCII file
Accounts separated by new lines
Fields separated by #’s
Different files: account types, branches etc.
23. File Systems
What’s the balance in Homer Simpson’s account?
A simple script
Scan through the accounts file
Look for the line containing “Homer Simpson”
Print out the balance
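The simple script described above might look like the following sketch. The slides only say that accounts are separated by new lines and fields by #; the field order (name, branch, balance) and the sample rows are assumptions for illustration.

```python
# Hypothetical contents of the accounts file: one account per line,
# fields separated by '#'. The order name#branch#balance is assumed.
ACCOUNTS_FILE = """\
Homer Simpson#Springfield#250.75
Marge Simpson#Springfield#1200.00
Ned Flanders#Shelbyville#980.10
"""

def find_balance(file_text, customer_name):
    """Scan every line and return the balance of the first matching account."""
    for line in file_text.splitlines():
        fields = line.split("#")
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        name, _branch, balance = fields
        if name == customer_name:
            return float(balance)
    return None  # no account with that name

print(find_balance(ACCOUNTS_FILE, "Homer Simpson"))
```

Note that the script hard-codes the file layout and must scan the whole file; each new kind of question needs another script like it.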
25. Advantages of File System
Provides a useful historical perspective on how data
have been handled
Helps in understanding the design complexity
of the overall system
Understanding the problems and limitations of
file-based systems helps in avoiding the
same problems when designing a database system.
26. Drawbacks of File System
Data Redundancy (or duplication)
◦ Decentralized approach adopted
◦ Duplication of information in different files (For example,
cust_id data in both the CUSTOMER and SALES files)
◦ Wasteful (more storage space, extra time and money, more
effort to keep data up to date)
27. Drawbacks of File System
Data Inconsistency (or loss of data integrity)
◦ Multiple file formats, duplication of information in different
files (e.g. a name field is 15 characters in one file but
10 characters in another)
◦ Various copies of the same data may be different
◦ Results in maintenance overhead and storage costs
◦ Serious degradation in the quality and accuracy of
information
28. Drawbacks of File System
Difficulty in Accessing Data
◦ Need to write a new program to carry out each new task
Data Isolation
◦ Data scattered in various files, making retrieval difficult
Program Data Dependence
◦ A change in file structure requires a change in the file
description (physical structure, storage of the data files and
records) in each program to conform to the new file structure
◦ Difficult to locate all files affected by the change
◦ Time consuming and subject to error when making changes
29. Drawbacks of File System
Poor data control
◦ Multiple names used by various departments due to
decentralized nature
◦ The same field may carry different meanings in different
contexts, and different fields may carry the same meaning, which
leads to poor data control and confusion
Limited Data Sharing & Excessive Programming Effort
◦ Each application has its own private files
◦ Little opportunity to share data with other applications
◦ Obtaining data from several incompatible files in separate
systems requires a large programming effort
Inadequate data manipulation capabilities
◦ No connection between data in different files, so data
manipulation capability is limited
30. Drawbacks of File System
Integrity Problems
◦ Integrity constraints (e.g. account balance > 0) become part
of program code
◦ Hard to add new constraints or change existing ones
Atomicity Problems
◦ Failures may leave the database in an inconsistent state with
partial updates carried out
◦ E.g. transfer of funds from one account to another should
either complete or not happen at all
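The funds-transfer example can be sketched with Python's built-in sqlite3 module, where a transaction makes the two updates succeed or fail together. The account table, the sample balances and the fail_midway flag (which simulates a crash between the two updates) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move amount from src to dst; both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            if fail_midway:
                raise RuntimeError("simulated failure between the two updates")
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
        return True
    except RuntimeError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

failed = transfer(conn, "A", "B", 30, fail_midway=True)
after_failure = balances(conn)   # the debit was rolled back
ok = transfer(conn, "A", "B", 30)
after_success = balances(conn)   # both updates applied
print(after_failure, after_success)
```

In a plain file system the program would have to implement this undo logic itself; here the rollback is handled by the database engine.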
31. Drawbacks of File System
Concurrent access by multiple users
◦ Concurrent access is needed for performance
◦ Uncontrolled concurrent access can lead to
inconsistencies
For example, two people reading a balance and updating
it at the same time
Security Problems
◦ Access Control
Database systems offer solutions to all the above
problems
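The concurrent-access example on this slide (two people reading a balance and updating it at the same time) can be replayed step by step. The sketch below is a deterministic simulation of one bad interleaving, not real multi-user code; the amounts and helper names are hypothetical.

```python
# Shared data with no concurrency control, as in a plain file.
account = {"balance": 100}

def read_balance():
    return account["balance"]

def write_balance(value):
    account["balance"] = value

# User 1 deposits 50 and user 2 withdraws 30.
# Applied one after the other, the result would be 120.
seen_by_user1 = read_balance()      # user 1 reads 100
seen_by_user2 = read_balance()      # user 2 also reads 100
write_balance(seen_by_user1 + 50)   # user 1 writes 150
write_balance(seen_by_user2 - 30)   # user 2 writes 70, overwriting the deposit

lost_update_result = account["balance"]
print(lost_update_result)
```

The deposit is lost because the second write is based on a stale read. A DBMS avoids this by controlling how concurrent transactions interleave.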
32. What is a Database?
A database consists of an organized collection of data
for one or more uses.
An organized body of related information.
A collection of logically related data stored together
that is designed to meet the information needs of the
organization
A database is organized in such a way that a computer
program can quickly select desired pieces of data
33. Database Applications
Databases play a critical role in almost all areas
◦ Banking: all transactions
◦ Airline: reservation, schedules
◦ Universities: registration, grades
◦ Sales: customers, products, purchases
◦ Manufacturing: production, inventory, orders, supply
chain
◦ Human resources: employee records, salaries, tax
deductions
34. What is a Database?
A database can be of any size and of varying
complexity.
◦ For example, a list of the names and addresses of friends
◦ The book catalog of a large library may contain half a
million records
◦ A database of much greater size and complexity is
maintained to keep track of the tax information filed by
taxpayers.
A database management system is a software system that
facilitates the creation, maintenance and use of an
electronic database
35. What is Database Management?
Database management is an approach that provides
simplified access to information stored in databases.
It relies on a generalized software system for manipulating
the database
36. What is a Database Management System?
A DBMS is a collection of software programs that
enable users to create, maintain and utilize a database.
A DBMS is a generalized software system for
manipulating databases; it supports
◦ Defining (specifying the data types, structures and
constraints)
◦ Constructing (storing the data on storage media)
◦ Manipulating (querying to retrieve specific data, updating to
reflect changes and generating reports from the data)
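The three activities above can be sketched with Python's built-in sqlite3 module. SQLite stands in here for a full DBMS, and the employee table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Defining: data types, structure and constraints.
conn.execute("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        salary REAL NOT NULL CHECK (salary >= 0)
    )
""")

# Constructing: storing the data (in memory for this sketch).
conn.executemany(
    "INSERT INTO employee (emp_id, name, salary) VALUES (?, ?, ?)",
    [(1, "Asha", 52000.0), (2, "Bilal", 48000.0), (3, "Chen", 61000.0)],
)

# Manipulating: update to reflect a change, query, and a small report.
conn.execute("UPDATE employee SET salary = 50000.0 WHERE emp_id = 2")
high_earners = [row[0] for row in conn.execute(
    "SELECT name FROM employee WHERE salary > 50000 ORDER BY name")]
total_payroll = conn.execute("SELECT SUM(salary) FROM employee").fetchone()[0]
conn.commit()

print(high_earners, total_payroll)
```

The CREATE TABLE statement is data definition, while INSERT, UPDATE and SELECT are data manipulation; no file layout or scanning logic appears in the program.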
38. DBMS Components
Data Definition Language
◦ Allows users to define the database and the constraints on the
data to be stored in it
Data Manipulation Language and query facility
◦ Allows users to insert, update, delete and retrieve data from
the database
◦ Provides a general query facility through Structured Query
Language (SQL)
Software for controlled access to data
◦ Provides controlled access to the database
◦ For example, preventing unauthorized users from accessing the
database
◦ Provides a concurrency control system to allow shared access
to the database
40. What is a DBMS?
Functions of DBMS
◦ Insert records
◦ Delete records
◦ Update records
◦ Query records
◦ Add and Delete files from the database
In short, a DBMS comprises two main parts
◦ Data Management in the database
◦ User Management associated with the database
41. Database Approach
A database system consists of logically related data
The database approach represents a change in the way
end-user data are stored, accessed and managed
It emphasizes the integration and sharing of data throughout
the organization
It eliminates problems related to data redundancy and
data control by supporting an integrated and centralized
data structure
44. What is a DBMS?
Commercial DBMSs
Company        Product
Oracle         Oracle 8i, 9i, 10g, 11g
IBM            DB2, Universal Server (from System R, System R*, Starburst) & Informix
Microsoft      Access, SQL Server
Sybase         Adaptive Server
Informix       Dynamic Server
NCR            Teradata
UC Berkeley    INGRES, PostgreSQL
45. Advantages of DBMS
Minimal Data Redundancy
◦ Centralized database and control of data
◦ Eliminates extra processing to trace the required data
◦ Storage requirement also reduced
◦ If duplicate data exists, the DBMS is aware of it and ensures
that the multiple copies are consistent
Program Data Independence
◦ Separation of data description from the application
programs
◦ A change in the data description does not affect the
application programs that process the data
◦ Allow change at one level of the database without affecting
other levels
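Program data independence can be illustrated with a small sqlite3 sketch: the application function below keeps working after the data description changes. The customer table, the added phone column and the customer_name function are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Asha')")
conn.commit()

def customer_name(conn, cust_id):
    # The application names the columns it needs. It does not depend on
    # how many columns the table has or how rows are stored.
    row = conn.execute(
        "SELECT name FROM customer WHERE cust_id = ?", (cust_id,)
    ).fetchone()
    return row[0] if row else None

name_before = customer_name(conn, 1)

# The data description changes: a new column is added.
conn.execute("ALTER TABLE customer ADD COLUMN phone TEXT")

# The unchanged application function still works.
name_after = customer_name(conn, 1)
print(name_before, name_after)
```

Compare this with a file-based program that splits each line by position: adding a field to the file would require editing every such program.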
46. Advantages of DBMS
Efficient Data Access
◦ Uses sophisticated techniques to store and retrieve data
efficiently
Improved Data Sharing
◦ Centralized repository of data belonging to the entire
organization (For example, university data)
◦ Can be shared by all authorized users
◦ New application programs can be developed on the existing
data in the database, sharing the same data and adding only
the data that is not currently stored, rather than having to
define all data requirements again
47. Advantages of DBMS
Improved Data Consistency
◦ Inconsistency is the corollary to redundancy
◦ Where the same data is held in two entries, the DBMS ensures
that a change made to either one is automatically applied to the
other; this is known as propagating updates
Improved Data Integrity
◦ Ensures that the data is accurate and consistent
◦ Integrity is expressed as rules that the database should not violate
◦ Centralized control of the data in the database system
ensures that adequate checks are incorporated in the DBMS to
avoid data integrity problems
◦ For example, a month must be in the range 01 to 12, and a
money transfer below a specified minimum amount is not allowed
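The two example rules above can be stated once in the database instead of in every program. The sketch below uses sqlite3 CHECK constraints; the payment table and the minimum amount of 10.0 are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE payment (
        pay_id INTEGER PRIMARY KEY,
        month  INTEGER NOT NULL CHECK (month BETWEEN 1 AND 12),
        amount REAL    NOT NULL CHECK (amount >= 10.0)  -- assumed minimum
    )
""")

def try_insert(month, amount):
    """Return True if the database accepts the row, False if a rule rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO payment (month, amount) VALUES (?, ?)",
                (month, amount),
            )
        return True
    except sqlite3.IntegrityError:
        return False

# A valid row, an invalid month, and an amount below the minimum.
results = [try_insert(6, 25.0), try_insert(13, 25.0), try_insert(6, 5.0)]
print(results)
```

Because the rules live in the table definition, every application that writes to the table is checked in the same way, and changing a rule means changing it in one place.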
48. Advantages of DBMS
Improved Security
◦ Protection of the database from unauthorized users
◦ User names and passwords can be defined to authorize users, and
each user may be restricted to particular types of access
◦ Different levels of security can be implemented for
various types of data and operations
Increased Productivity of Application Development
◦ Provides many standard functions, such as forms and
report generators, that automate some of the activities of
database design
◦ Simplifies the development of database applications
49. Advantages of DBMS
Enforcement of Standards
Economy of Scale
Balance of Conflicting Requirements
Improved Data Accessibility
Improved Responsiveness
Increased Concurrency
Reduced Program Maintenance
Improved Backup and Recovery Services
Improved Data Quality
50. Disadvantages of DBMS
Increased Complexity
◦ A multi-user DBMS becomes an extremely complex piece of
software
◦ Necessary to understand the whole design to take advantage
of it
◦ Failure to understand it results in bad design decisions
Requirement of New and Specialized Manpower
◦ Need to hire, train and retrain manpower on a regular basis to
design and implement databases
◦ Need to maintain specialized skilled manpower
Large Size of DBMS
◦ Requires a large amount of memory to run efficiently due to
its complexity and wide functionality
51. Disadvantages of DBMS
Increased Installation & Management Cost
◦ Requires trained manpower to install and operate the DBMS,
and also requires upgrades to the hardware, software and data
communication systems
◦ Substantial training is required on an ongoing basis to keep up
with new releases and upgrades
Conversion Cost
◦ From legacy system to modern DBMS environment
◦ It includes the cost of the DBMS, hardware and employing
specialists
Need for Explicit Backup & Recovery
◦ Comprehensive procedures are required for making backup copies
of data and for restoring a database when damage occurs
53. References
Chapter 1, Database Systems, S K Singh
Chapter 1, Database System Concepts, Silberschatz, Korth, Sudarshan
Chapter 1, Database Management Systems, by Ramakrishnan and
Gehrke
Course material from:
◦ Introduction to database systems – Duke University
◦ Database Systems – MCS Fall 2009