4. Files and File Systems
The file system permits users to create data collections, called files, with desirable
properties, such as:
Long-term existence:
Files are stored on disk or other secondary storage and do not disappear when a
user logs off.
Sharable between processes:
Files have names and can have associated access permissions that permit
controlled sharing.
Structure:
Depending on the file system, a file can:
Have an internal structure that is convenient for particular applications.
Be organized into a hierarchical or more complex structure to reflect the
relationships among files.
5. Files and File Systems
The file system provides a collection of functions that can be performed on files. Typical
operations include:
Create:
A new file is defined and positioned within the structure of files.
Delete:
A file is removed from the file structure and subsequently destroyed.
Open:
An existing file is declared to be “opened” by a process, allowing the process to perform functions
on the file.
Close:
The file is closed with respect to a process, so the process no longer may perform functions on
the file, until the process opens the file again.
Read:
A process reads all or a portion of the data in a file.
Write:
A process updates a file, either by adding new data that expands the size of the file
or by changing the values of existing data items in the file.
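The operations listed above map fairly directly onto the file API of most languages. The following is a minimal sketch in Python, assuming a scratch directory and a hypothetical file name `example.dat`; it illustrates the operations and is not a description of how any particular file system implements them.

```python
import os
import tempfile

def demo_file_operations(directory):
    """Walk through create, open, write, read, close, and delete on one file."""
    path = os.path.join(directory, "example.dat")  # hypothetical file name

    # Create: mode "x" defines a new file and fails if it already exists.
    with open(path, "x") as f:
        f.write("first line\n")      # Write: add new data to the file

    # Open in append mode, then Write: expand the file with more data.
    with open(path, "a") as f:
        f.write("second line\n")

    # Open, then Read: read all of the data in the file.
    with open(path, "r") as f:
        contents = f.read()
    # Close: each "with" block closes the file when it exits.

    # Delete: remove the file from the file structure.
    os.remove(path)
    return contents, os.path.exists(path)

with tempfile.TemporaryDirectory() as tmp:
    contents, still_exists = demo_file_operations(tmp)
    print(repr(contents), still_exists)
```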
7. Field
The basic element of data.
Contains a single value (such as an employee’s last name, a date, or the value of a
sensor reading).
Characterized by its:
Length (may be fixed length or variable length).
Data type (e.g., ASCII string, decimal).
In the case of variable length, the field often consists of two or three subfields:
The actual value to be stored
The name of the field
In some cases, the length of the field
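One way to picture the subfields of a variable-length field (name, length, value) is a small byte encoding. The layout below is an assumption made for illustration only: a 1-byte name length, the name, a 2-byte value length, then the value, all ASCII. The function names are hypothetical.

```python
import struct

def encode_field(name: str, value: str) -> bytes:
    """Encode one variable-length field as name-length, name, value-length, value."""
    name_b = name.encode("ascii")
    value_b = value.encode("ascii")
    return (struct.pack("B", len(name_b)) + name_b +
            struct.pack(">H", len(value_b)) + value_b)

def decode_field(data: bytes, offset: int = 0):
    """Decode one field starting at offset; return (name, value, next_offset)."""
    name_len = data[offset]
    offset += 1
    name = data[offset:offset + name_len].decode("ascii")
    offset += name_len
    (value_len,) = struct.unpack_from(">H", data, offset)
    offset += 2
    value = data[offset:offset + value_len].decode("ascii")
    return name, value, offset + value_len

encoded = encode_field("last_name", "Smith")
print(decode_field(encoded))
```

Because each field carries its own length, a reader can step from one field to the next without knowing the field sizes in advance.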
8. Record
A collection of related fields that can be treated as a unit by some
application program. For example, an employee record would contain
such fields as name, social security number, job classification, and date of
hire.
Records may be of fixed length or variable length.
A record will be of variable length if
Some of its fields are of variable length, or
The number of fields may vary.
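A practical consequence of fixed-length records is that record i starts at byte offset i times the record size, so no scanning is needed to locate it. The sketch below assumes a hypothetical employee layout (20-byte name, 10-byte job classification, 8-byte hire date) chosen only for illustration.

```python
import struct

# Hypothetical fixed-length layout: name (20), job class (10), hire date (8).
RECORD = struct.Struct("<20s10s8s")

def pack_record(name, job_class, hire_date):
    """Pack three ASCII fields; short values are padded with null bytes.
    Note: struct silently truncates values longer than the field width."""
    return RECORD.pack(name.encode("ascii"),
                       job_class.encode("ascii"),
                       hire_date.encode("ascii"))

def unpack_record(data):
    """Unpack one record and strip the null padding from each field."""
    return tuple(part.rstrip(b"\x00").decode("ascii")
                 for part in RECORD.unpack(data))

def read_record(file_bytes, i):
    """Fetch record i directly by computing its byte offset."""
    start = i * RECORD.size
    return unpack_record(file_bytes[start:start + RECORD.size])

file_bytes = (pack_record("Smith", "ENG2", "20010315") +
              pack_record("Jones", "OPS1", "19990101"))
print(read_record(file_bytes, 1))
```

With variable-length records this offset arithmetic no longer works, which is one reason such files usually need length subfields or an index.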
9. File
A file is a collection of similar records.
Files have file names and may be created and deleted.
Access control restrictions usually apply at the file level.
In a shared system, users and programs are granted or denied access to entire
files.
10. Database
Collection of related data.
The essential aspects of a database are that:
The relationships that exist among elements of data are explicit.
The database is designed for use by a number of different applications.
A database may contain all of the information related to an organization or a
project.
Consists of one or more types of files.
Usually there is a separate database management system that is independent of
the operating system, although it may make use of some file management programs.
11. Operations performed on file
Retrieve_All:
Retrieve all the records of a file.
Required for an application that must process all of the information in
the file at one time.
This operation is often equated with the term sequential processing,
because all of the records are accessed in sequence.
Retrieve_One:
Requires the retrieval of just a single record.
Interactive, transaction-oriented applications need this operation.
Retrieve_Next:
Requires the retrieval of the record that is “next” in some logical
sequence to the most recently retrieved record.
12. Operations performed on file (continued)
Retrieve_Previous:
The record that is “previous” to the currently accessed record is
retrieved.
Insert_One:
Insert a new record into the file.
Delete_One:
Delete an existing record.
Update_One:
Retrieve a record, update one or more of its fields, and rewrite the
updated record back into the file.
Retrieve_Few:
Retrieve a number of records.
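The record-level operations above can be sketched as methods on a small in-memory "file" keyed by a unique field. The class name `RecordFile`, the use of dictionaries as records, and the choice of a predicate for Retrieve_Few are all assumptions for illustration.

```python
import bisect

class RecordFile:
    """In-memory stand-in for a file of records ordered by a unique key."""

    def __init__(self):
        self._keys = []       # keys kept in sorted (logical) order
        self._records = {}    # key -> record (a dict of fields)

    def insert_one(self, key, record):
        if key in self._records:
            raise KeyError(f"duplicate key: {key!r}")
        bisect.insort(self._keys, key)
        self._records[key] = record

    def delete_one(self, key):
        del self._records[key]
        self._keys.remove(key)

    def retrieve_one(self, key):
        return self._records[key]

    def retrieve_all(self):
        # Sequential processing: every record, in key order.
        return [self._records[k] for k in self._keys]

    def retrieve_next(self, key):
        i = bisect.bisect_right(self._keys, key)
        return self._records[self._keys[i]] if i < len(self._keys) else None

    def retrieve_previous(self, key):
        i = bisect.bisect_left(self._keys, key)
        return self._records[self._keys[i - 1]] if i > 0 else None

    def update_one(self, key, **changes):
        # Retrieve, change one or more fields, and write the record back.
        record = dict(self._records[key])
        record.update(changes)
        self._records[key] = record

    def retrieve_few(self, predicate):
        return [r for r in self.retrieve_all() if predicate(r)]

f = RecordFile()
for emp_id, name in [(30, "Cole"), (10, "Abel"), (20, "Baker")]:
    f.insert_one(emp_id, {"id": emp_id, "name": name})
print(f.retrieve_next(10), f.retrieve_previous(10))
```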
13. File Management Systems
Set of system software that provides services to users and applications
in the use of files.
Typically the only way that a user or application may access files is through the
file management system.
14. Objectives for a File Management System
1. Meet the data management needs and requirements of the user.
2. Guarantee that the data in the file are valid.
3. Optimize performance.
4. Provide I/O support for a variety of storage device types.
5. Minimize or eliminate the potential for lost or destroyed data.
6. Provide a standardized set of I/O interface routines to user processes.
7. Provide I/O support for multiple users.
15. Minimal set of requirements
Each user:
1. Should be able to create, delete, read, write, and modify files.
2. May have controlled access to other users’ files.
3. May control what types of accesses are allowed to the user’s files.
4. Should be able to move data between files.
5. Should be able to back up and recover the user’s files in case of damage.
6. Should be able to access his or her files by name rather than by numeric identifier.
17. Device drivers
At the lowest level.
Communicate directly with peripheral devices.
Responsible for starting I/O operations.
Responsible for processing the completion of an I/O request.
Considered to be part of the operating system.
For file operations, the typical devices controlled are disk and tape
drives.
18. Basic file system
Also referred to as the physical I/O level.
Primary interface with the environment outside of the computer system.
Deals with blocks of data that are exchanged with disk or
tape systems.
Concerned with the placement of blocks on the secondary storage
device.
Concerned with buffering blocks in main memory.
Does not understand the content of the data or the structure of the files
involved.
19. Basic I/O supervisor
Responsible for all file I/O initiation and termination.
At this level, control structures are maintained that deal with device I/O,
scheduling, and file status.
Selects the device on which file I/O is to be performed.
Concerned with scheduling disk and tape accesses to optimize performance.
I/O buffers are assigned and secondary memory is allocated at this level.
Part of the operating system.
21. Access method
Level of the file system closest to the user.
Provides a standard interface between applications and the file
systems and devices that hold the data.
Different access methods reflect different file structures and different
ways of accessing and processing the data.
23. File organization and access
File organization refers to the logical structuring of the records as
determined by the way in which they are accessed.
In choosing a file organization, several criteria are important:
Short access time
Ease of update
Economy of storage
Simple maintenance
Reliability
Priority of these criteria will depend on the applications that will use the
file.
24. File organization types
Five of the common file organizations are:
The pile
The sequential file
The indexed sequential file
The indexed file
The direct, or hashed, file
25. The Pile
Least complicated form of file organization.
Data are collected in the order in which they arrive.
Each record consists of one burst of data.
The purpose is simply to accumulate the mass of data and save it.
Record access is by exhaustive search.
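Exhaustive search over a pile can be sketched as a linear scan over records kept in arrival order. The representation of a record as a dictionary whose fields may differ from record to record is an assumption for illustration, as are the sample field names.

```python
def pile_search(pile, field_name, value):
    """Exhaustive search: examine every record, in arrival order."""
    matches = []
    for record in pile:
        # Records in a pile need not share fields, so test for presence first.
        if field_name in record and record[field_name] == value:
            matches.append(record)
    return matches

pile = []                                   # records are simply appended
pile.append({"name": "Ada", "dept": "R&D"})
pile.append({"sensor": "t1", "reading": 21.5})
pile.append({"name": "Grace", "dept": "R&D", "hired": "1952"})
print(pile_search(pile, "dept", "R&D"))
```

Every lookup touches every record, so the cost grows linearly with the size of the pile.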
26. The Sequential File
Most common form of file structure.
A fixed format is used for records.
The key field uniquely identifies the record.
The records are stored in key sequence:
Alphabetical order for a text key,
Numerical order for a numerical key.
Typically used in batch applications (e.g., a billing or payroll application).
27. The Indexed Sequential File
Records are organized in sequence based on a key field.
Two features are added:
An index to the file to support random access.
An overflow file.
The index in this case is a simple sequential file.
Greatly reduces the time required to access a single record.
To provide even greater efficiency in access, multiple levels of indexing can be used.
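A single-level indexed sequential lookup can be sketched as follows: the index holds every Nth key of the sorted main file, a lookup finds the closest index entry at or before the wanted key, and then scans the main file sequentially from there. The class name, the `stride` parameter, and the unsorted list used as the overflow file are assumptions for illustration.

```python
import bisect

class IndexedSequentialFile:
    def __init__(self, records, stride=4):
        # Main file: (key, value) pairs stored in key sequence.
        self.main = sorted(records, key=lambda r: r[0])
        # Index: the key and position of every stride-th record.
        self.index_pos = list(range(0, len(self.main), stride))
        self.index_keys = [self.main[p][0] for p in self.index_pos]
        # Overflow file: records added after the main file was built.
        self.overflow = []

    def insert(self, key, value):
        self.overflow.append((key, value))

    def lookup(self, key):
        # Find the last index entry whose key is <= the wanted key.
        i = bisect.bisect_right(self.index_keys, key) - 1
        if i >= 0:
            # Scan the main file sequentially from the indexed position.
            for k, v in self.main[self.index_pos[i]:]:
                if k == key:
                    return v
                if k > key:
                    break
        # Fall back to the overflow file.
        for k, v in self.overflow:
            if k == key:
                return v
        return None

isf = IndexedSequentialFile([(k, f"rec{k}") for k in range(0, 100, 5)])
isf.insert(42, "rec42")
print(isf.lookup(35), isf.lookup(42), isf.lookup(36))
```

With a stride of 4, a lookup in the main file examines at most 4 records after the index search, instead of up to the whole file.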
28. The Indexed File
Records are accessed only through their indexes.
Variable-length records can be employed.
An exhaustive index contains one entry for every record in the main file.
A partial index contains entries to records where the field of interest exists.
Used mostly in applications where timeliness of information is critical.
Examples are airline reservation
29. The Direct or Hashed File
Access directly any block of a known address.
Key field is required in each record.
Makes use of hashing on the key value.
Often used where:
Very rapid access is required.
Fixed-length records are used.
Records are always accessed one at a time.
Examples are directories, pricing tables, schedules, and name lists.
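Hashing on the key value can be sketched as computing a bucket address from the key and going straight to that bucket. The sketch below assumes string keys, uses CRC-32 from Python's `zlib` module as a stand-in hash function, and resolves collisions by keeping a short list per bucket; none of these choices come from the slides.

```python
import zlib

class HashedFile:
    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _address(self, key):
        # Hash the key value to a bucket address.
        return zlib.crc32(key.encode("utf-8")) % len(self.buckets)

    def put(self, key, record):
        bucket = self.buckets[self._address(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, record)   # overwrite an existing record
                return
        bucket.append((key, record))

    def get(self, key):
        # One address computation, then a short scan within the bucket.
        for k, record in self.buckets[self._address(key)]:
            if k == key:
                return record
        return None

prices = HashedFile()
prices.put("widget", 4.25)
prices.put("gadget", 9.99)
print(prices.get("widget"), prices.get("unknown"))
```

Access is one record at a time by key; there is no cheap way to ask for the "next" record in key order, which matches the use cases listed above.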
30. B-TREES
A balanced tree structure, with all branches of equal length.
The standard method of organizing indexes for databases.
Commonly used in OS file systems.
Provides for efficient searching, adding, and deleting of items.
31. A B-tree is a tree structure (no closed loops) with the following
characteristics (see Figure 12.4):
32. B-Tree Characteristics
A B-tree is characterized by its minimum degree d and satisfies the following properties:
1. Every node has at most 2d - 1 keys and 2d children or, equivalently, 2d pointers.
2. Every node, except for the root, has at least d - 1 keys and d pointers.
3. The root has at least 1 key and 2 children.
4. All leaves appear on the same level and contain no information. This is a logical construct to terminate the tree; the actual implementation may differ. For example, each bottom-level node may contain keys alternating with null pointers.
5. A nonleaf node with k pointers contains k - 1 keys.
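The minimum-degree properties of this slide can be turned into a small checker. This is a sketch under stated assumptions: the `Node` class is hypothetical, bottom-level nodes are represented with an empty `children` list (standing in for the null pointers mentioned in property 4), and a root that is itself a bottom-level node is accepted with no children.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    keys: list
    children: list = field(default_factory=list)

def check_btree(node, d, is_root=True):
    """Return the height of the subtree if the properties hold, else raise ValueError."""
    n = len(node.keys)
    if n > 2 * d - 1:
        raise ValueError("node has more than 2d - 1 keys")
    min_keys = 1 if is_root else d - 1
    if n < min_keys:
        raise ValueError("node has too few keys")
    if node.keys != sorted(node.keys):
        raise ValueError("keys are not in nondecreasing order")
    if not node.children:
        return 1                      # bottom-level node
    if len(node.children) != n + 1:
        raise ValueError("a node with k pointers must contain k - 1 keys")
    # Every path down must have the same length (balanced tree).
    heights = {check_btree(child, d, is_root=False) for child in node.children}
    if len(heights) != 1:
        raise ValueError("bottom-level nodes are not all on the same level")
    return 1 + heights.pop()

tree = Node([10], [Node([3, 5]), Node([12, 20, 30])])
print(check_btree(tree, d=2))
```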
33. Search for a key
To search for a key, you start at the root node.
If the key you want is in the node, you’re done.
If not, you go down one level. There are three cases:
1. The key you want is less than the smallest key in this node. Take the leftmost pointer down to the next level.
2. The key you want is greater than the largest key in this node. Take the rightmost pointer down to the next level.
3. The value of the key is between the values of two adjacent keys in this node. Take the pointer between these keys down to the next level.
Characteristics of a B-tree as a tree structure:
1. The tree consists of a number of nodes and leaves.
2. Each node contains at least one key which uniquely identifies a file record, and more than one pointer to child nodes or leaves.
3. Each node is limited to the same maximum number of keys.
4. The keys in a node are stored in nondecreasing order. Each key has an associated child that is the root of a subtree containing all nodes with keys less than or equal to the key but greater than the preceding key. A node also has an additional rightmost child that is the root for a subtree containing all keys greater than any keys in the node. Thus, each node has one more pointer than keys.
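The key search described earlier (start at the root, stop if the key is in the node, otherwise follow the leftmost, rightmost, or in-between pointer) relies on the key ordering in item 4 and can be sketched as below. The `Node` class is hypothetical, bottom-level nodes are assumed to have an empty `children` list, and keys are assumed unique so that each key appears in exactly one node.

```python
import bisect
from dataclasses import dataclass, field

@dataclass
class Node:
    keys: list
    children: list = field(default_factory=list)

def btree_search(root, key):
    """Return True if key is stored in the tree rooted at root."""
    node = root
    while node is not None:
        # Position of the first key in this node that is >= the wanted key.
        i = bisect.bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True               # the key is in this node: done
        if not node.children:
            return False              # bottom level reached without a match
        # i == 0: leftmost pointer; i == len(keys): rightmost pointer;
        # otherwise: the pointer between keys[i - 1] and keys[i].
        node = node.children[i]
    return False

root = Node([10, 20], [Node([3, 5]), Node([12, 15]), Node([25, 30])])
print(btree_search(root, 15), btree_search(root, 17))
```

Each step descends one level, so the number of nodes visited is bounded by the height of the tree.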