File-System Interface (Galvin Notes, 9th Ed.)
Chapter 11: File-System Interface
 FILE CONCEPT
 File Attributes (Name, Identifier, Type, Location, Size, Protection, Time & Date, User ID)
 File Operations
 File Types
 File Structure
 Internal File Structure
 ACCESS METHODS
 Sequential Access
 Direct Access
 Other Access Methods
 DIRECTORY AND DISK STRUCTURE
 Storage Structure
 Directory Overview
 Single-Level Directory
 Two-Level Directory
 Tree-Structured Directories
 Acyclic-Graph Directories
 General Graph Directory
 FILE-SYSTEM MOUNTING
 FILE SHARING
 Multiple Users
 Remote File Systems (The Client-Server Model, Distributed Information Systems, Failure Modes)
 Consistency Semantics (UNIX Semantics, Session Semantics, Immutable-Shared-Files Semantics)
 PROTECTION
 Types of Access
 Access Control
 Other Protection Approaches and Issues
Content
FILE CONCEPT
File Attributes
Different OSes keep track of different file attributes, including Name, Identifier (e.g. inode number), Type (text, executable, other binary, etc.),
Location (e.g. hard drive), Size, Protection, Time & Date, User ID. Some systems give special significance to names, and particularly extensions (.exe,
.txt, etc.), and some do not. Some extensions may be of significance to the OS (.exe), and others only to certain applications (.jpg).
File Operations
 The file ADT supports many common operations: creating a file, writing a file, reading a file, repositioning within a file, deleting a file, and
truncating a file.
 Information about currently open files is stored in an open file table, containing for example:
o File pointer - records the current position in the file, for the next read or write access.
o File-open count - How many times has the current file been opened (simultaneously by different processes) and not yet closed?
When this counter reaches zero the file can be removed from the table.
o Disk location of the file.
o Access rights
 Some systems provide support for file locking.
o A shared lock is for reading only.
o An exclusive lock is for writing as well as reading.
o An advisory lock is informational only, and not enforced. (A "Keep Out" sign, which may be ignored.)
o A mandatory lock is enforced. (A truly locked door.) UNIX uses advisory locks, and Windows uses mandatory locks.
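The basic file operations listed above map closely onto the POSIX-style calls exposed by Python's os module. The following is a minimal sketch, not part of the original notes: the file name example.dat and the helper demo_file_operations are made up for illustration, and the file lives in a throwaway temporary directory.

```python
import os
import tempfile

def demo_file_operations(directory):
    """Exercise create, write, reposition, read, truncate, and delete."""
    path = os.path.join(directory, "example.dat")  # hypothetical file name

    # Create: O_CREAT makes the file; O_RDWR allows both reads and writes.
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # Write: the file pointer advances past the bytes written.
        os.write(fd, b"hello world")

        # Reposition: move the file pointer back to the start of the file.
        os.lseek(fd, 0, os.SEEK_SET)

        # Read: returns up to 5 bytes starting at the current file pointer.
        first = os.read(fd, 5)

        # Truncate: keep the file but cut its length to 5 bytes.
        os.ftruncate(fd, 5)
        size_after_truncate = os.fstat(fd).st_size
    finally:
        # Close: the OS can now drop this entry from its open file table.
        os.close(fd)

    # Delete: remove the directory entry for the file.
    os.remove(path)
    return first, size_after_truncate, os.path.exists(path)

with tempfile.TemporaryDirectory() as tmp:
    print(demo_file_operations(tmp))
```

The file descriptor returned by os.open plays the role of an index into the per-process open file table; the file pointer that lseek moves is one of the fields that table keeps.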
File Types
 Windows (and some other systems) use special file extensions to indicate the
type of each file. Macintosh stores a creator attribute for each file, according to
the program that first created it with the create() system call.
File Structure
 Some files contain an internal structure, which may or may not be known to the OS. For the OS to support particular file formats increases
the size and complexity of the OS.
 UNIX treats all files as sequences of bytes, with no further consideration of the internal structure. (With the exception of executable
binary programs, which it must know how to load and find the first executable statement, etc.)
 Macintosh files have two forks - a resource fork, and a data fork. The resource fork contains information relating to the UI, such as icons
and button images, and can be modified independently of the data fork, which contains the code or data as appropriate.
Internal File Structure
 Disk files are accessed in units of physical blocks, typically 512 bytes or some power-of-two multiple thereof. (Larger physical disks use
larger block sizes, to keep the range of block numbers within the range of a 32-bit integer.)
 Internally files are organized in logical units, which may be as small as a single byte, or may be a larger size corresponding to some
data record or structure size. The number of logical units which fit into one physical block determines its packing, and has an impact on the
amount of internal fragmentation (wasted space) that occurs. As a general rule, half a physical block is wasted on average for each file, and the
larger the block size the more space is lost to internal fragmentation.
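The effect of block size on internal fragmentation can be checked with a little arithmetic. The sketch below is illustrative only: the function names and the sample sizes (a 1300-byte file, 100-byte records) are assumptions chosen for the example, not figures from the book.

```python
def blocks_and_waste(file_size, block_size=512):
    """Return (blocks allocated, bytes wasted in the final block)."""
    blocks = -(-file_size // block_size)      # ceiling division
    wasted = blocks * block_size - file_size  # unused tail of the last block
    return blocks, wasted

def records_per_block(record_size, block_size=512):
    """Return (whole records packed per block, bytes left over per block)."""
    count = block_size // record_size
    return count, block_size - count * record_size

# A 1300-byte file needs 3 blocks of 512 bytes and wastes 236 bytes.
print(blocks_and_waste(1300, 512))
# The same file in one 4096-byte block wastes 2796 bytes.
print(blocks_and_waste(1300, 4096))
# Five 100-byte records fit in a 512-byte block, leaving 12 bytes unused.
print(records_per_block(100, 512))
```

The waste in the last block ranges from zero up to almost a full block, which is where the "half a block per file" average comes from.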
ACCESS METHODS
 Sequential Access: A sequential access file emulates magnetic tape operation, and generally supports a few operations:
o read next - read a record and advance the tape to the next position.
o write next - write a record and advance the tape to the next position.
o rewind
o skip n records - may or may not be supported. n may be limited to positive numbers, or may be limited to +/- 1.
 Direct Access: Jump to any record and read that record. Operations supported include:
o read n - read record number n. (Note an argument is now required.)
o write n - write record number n. (Note an argument is now required.)
o jump to record n - could be 0 or the end of file.
o query current record - used to return back to this record later.
Sequential access can be easily emulated using direct access. The inverse is complicated and inefficient.
 Other Access Methods: An indexed access scheme can be easily built on top of a direct access system. Very large files may require a
multi-tiered indexing scheme, i.e. indexes of indexes. (The book has a lot more relevant content on this, as it does for all chapters.)
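The claim that sequential access is easy to emulate on top of direct access can be sketched in a few lines. This is an illustrative sketch under stated assumptions: records have a fixed length RECORD_SIZE, the backing store is an in-memory byte stream, and the class and method names (DirectFile, SequentialFile, read_n, and so on) are invented here to mirror the operations listed above.

```python
import io

RECORD_SIZE = 8  # assumed fixed record length, in bytes

class DirectFile:
    """Direct access: every operation names the record number it wants."""

    def __init__(self, stream):
        self.stream = stream

    def read_n(self, n):
        self.stream.seek(n * RECORD_SIZE)
        return self.stream.read(RECORD_SIZE)

    def write_n(self, n, record):
        if len(record) != RECORD_SIZE:
            raise ValueError("record must be exactly RECORD_SIZE bytes")
        self.stream.seek(n * RECORD_SIZE)
        self.stream.write(record)

class SequentialFile:
    """Sequential access emulated by keeping a current-position counter."""

    def __init__(self, direct):
        self.direct = direct
        self.position = 0  # the emulated tape position

    def read_next(self):
        record = self.direct.read_n(self.position)
        self.position += 1
        return record

    def write_next(self, record):
        self.direct.write_n(self.position, record)
        self.position += 1

    def rewind(self):
        self.position = 0

seq = SequentialFile(DirectFile(io.BytesIO()))
seq.write_next(b"record00")
seq.write_next(b"record01")
seq.rewind()
print(seq.read_next(), seq.read_next())
```

All the sequential wrapper adds is one integer. Going the other way, reaching record n on a tape-like device means skipping over every record before it, which is why the inverse emulation is inefficient.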
DIRECTORY AND DISK STRUCTURE
 Storage Structure: A disk can be used in its entirety for a file system.
Alternatively a physical disk can be broken up into multiple partitions, slices,
or mini-disks, each of which becomes a virtual disk and can have its own
filesystem (or be used for raw storage, swap space, etc.). Or, multiple physical
disks can be combined into one volume, i.e. a larger virtual disk, with its own
filesystem spanning the physical disks.
 Directory Overview: Directory operations to be supported include: a) Search
for a file, b) Create a file (add to the directory), c) Delete a file (erase from the
directory), d) List a directory (possibly ordered in different ways), e) Rename a file (may change sorting order), f) Traverse the file system.
 Single-Level Directory: Simple to implement, but each file must have a unique name.
 Two-Level Directory: Each user gets their own directory space. File names
only need to be unique within a given user's directory. A master file
directory is used to keep track of each user's directory, and must be
maintained when users are added to or removed from the system. A
separate directory is generally needed for system (executable) files.
Systems may or may not allow users to access other directories besides
their own. If access to other directories is allowed, then provision must be made to specify the directory being accessed. If access is denied,
then special consideration must be made for users to run programs located in system directories. A search path is the list of directories in
which to search for executable programs, and can be set uniquely for each user.
 Tree-Structured Directories: This is an obvious extension to
the two-tiered directory structure. Each user / process has
the concept of a current directory from which all (relative)
searches take place. Files may be accessed using either
absolute pathnames (relative to the root of the tree) or
relative pathnames (relative to the current directory).
Directories are stored the same as any other file in the
system, except there is a bit that identifies them as
directories, and they have some special structure that the
OS understands.
 Acyclic-Graph Directories: When the same files need to be accessed in more than one place in the directory structure (e.g. because they
are being shared by more than one user / process), it can be useful to provide an acyclic-graph structure.
UNIX provides two types of links for implementing the acyclic-graph structure. A hard link (usually just called a link) involves multiple
directory entries that all refer to the same file. Hard links are only valid for ordinary files in the same filesystem. A symbolic link
involves a special file, containing information about where to find the linked file. Symbolic links may be used to link directories and/or files
in other filesystems, as well as ordinary files in the current filesystem. Windows shortcuts play a role similar to symbolic links, and NTFS
also supports hard links and true symbolic links.
Hard links require a reference count, or link count, for each file, keeping track of how many directory entries are currently referring to
this file. Whenever one of the references is removed the link count is reduced, and when it reaches zero, the disk space can be reclaimed.
 General-Graph Directory: If cycles are allowed in the graphs, then several problems can arise: Search algorithms can go into infinite loops.
One solution is to not follow links in search algorithms. (Or not to follow symbolic links, and to only allow symbolic links to refer to
directories.) Sub-trees can become disconnected from the rest of the tree and still not have their reference counts reduced to zero.
Periodic garbage collection is required to detect and resolve this problem. (chkdsk in DOS and fsck in UNIX search for these problems,
among others, even though cycles are not supposed to be allowed in either system. Disconnected disk blocks that are not marked as free
are added back to the file systems with made-up file names, and can usually be safely deleted.) Refer to Figure 11.3.
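The difference between hard links, symbolic links, and the link count can be observed directly from Python. The following sketch is an illustration, not part of the notes: it assumes a POSIX-style system where os.link and os.symlink are permitted, and the file names and the helper link_demo are made up for the example.

```python
import os
import tempfile

def link_demo(directory):
    """Create one file with a hard link and a symbolic link, then delete it."""
    original = os.path.join(directory, "data.txt")  # hypothetical names
    hard = os.path.join(directory, "hard.txt")
    soft = os.path.join(directory, "soft.txt")

    with open(original, "w") as f:
        f.write("shared contents")

    os.link(original, hard)     # second directory entry for the same file
    os.symlink(original, soft)  # special file that stores the target path

    # The symbolic link does not count: only directory entries do.
    count_with_two_names = os.stat(original).st_nlink
    same_file = os.stat(original).st_ino == os.stat(hard).st_ino

    # Removing one name only decrements the link count.
    os.remove(original)
    count_after_remove = os.stat(hard).st_nlink
    with open(hard) as f:
        contents = f.read()

    # The symbolic link still exists but now points at nothing.
    dangling = os.path.islink(soft) and not os.path.exists(soft)
    return count_with_two_names, same_file, count_after_remove, contents, dangling

with tempfile.TemporaryDirectory() as tmp:
    print(link_demo(tmp))
```

The data stays reachable through the remaining hard link because the link count is still above zero, while the symbolic link is left dangling because it refers to a name rather than to the file itself.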
FILE SYSTEM MOUNTING
The basic idea behind mounting file systems is to combine multiple file
systems into one large tree structure. The mount command is given a
filesystem to mount and a mount point (directory) on which to attach it.
Once a file system is mounted onto a mount point, any further references
to that directory actually refer to the root of the mounted file system. Any
files (or sub-directories) that had been stored in the mount point directory
prior to mounting the new filesystem are now hidden by the mounted
filesystem, and are no longer available. For this reason some systems only
allow mounting onto empty directories.
Filesystems can only be mounted by root, unless root has previously
configured certain filesystems to be mountable onto certain pre-determined mount points. (E.g. root may allow users to mount floppy filesystems to
/mnt or something like it.) Anyone can run the mount command to see what filesystems are currently mounted. Filesystems may be mounted read-only,
or have other restrictions imposed.
The traditional Windows OS runs an extended two-tier directory structure, where the first tier of the structure separates volumes by drive letters,
and a tree structure is implemented below that level. Macintosh runs a similar system, where each new volume that is found is automatically
mounted and added to the desktop when it is found. More recent Windows systems allow filesystems to be mounted to any directory in the
filesystem, much like UNIX.
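One way to picture what mounting does to path lookup is a table that maps mount points to file systems, with each path handed to the file system whose mount point is the longest matching prefix. The sketch below is a toy model only, not how any real kernel implements mounting; the function resolve, the table layout, and the file-system names are all assumptions made for illustration.

```python
def resolve(path, mount_table):
    """Return (file system, path inside it) for an absolute path.

    mount_table maps mount points (no trailing slash, except "/") to
    file-system names, and is assumed to contain an entry for "/".
    """
    best = "/"
    for mount_point in mount_table:
        if mount_point == "/":
            continue
        # Match the mount point itself or anything underneath it.
        if path == mount_point or path.startswith(mount_point + "/"):
            if len(mount_point) > len(best):
                best = mount_point
    remainder = path if best == "/" else path[len(best):]
    return mount_table[best], remainder or "/"

# Hypothetical mount table: a root file system plus two mounted volumes.
mounts = {"/": "rootfs", "/mnt": "floppy", "/home": "homefs"}

print(resolve("/etc/passwd", mounts))     # handled by the root file system
print(resolve("/mnt/notes.txt", mounts))  # handled by the mounted floppy
print(resolve("/mnt", mounts))            # the mount point is the floppy's root
```

Because every path at or below /mnt is redirected to the mounted file system, anything the root file system itself stored under /mnt can no longer be reached, which is the hiding effect described above.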
FILE SHARING
 Multiple Users: On a multi-user system, more information needs to be stored for each file: The owner (user) who owns the file, and who
can control its access. The group of other user IDs that may have some special access to the file. What access rights are afforded to the
owner (User), the Group, and to the rest of the world (the universe, a.k.a. Others). Some systems have more complicated access control,
allowing or denying specific accesses to specifically named users or groups.
 Remote File Systems: The advent of the Internet introduces issues for accessing files stored on remote computers. The original method was
ftp, allowing individual files to be transported across systems as needed. Ftp can be either account and password controlled, or
anonymous, not requiring any user name or password. Various forms of distributed file systems allow remote filesystems to be mounted
onto a local directory structure, and accessed using normal file access commands. (The actual files are still transported across the network
as needed, possibly using ftp as the underlying transport mechanism.) The WWW has made it easy once again to access files on remote
systems without mounting their filesystems, generally using anonymous file exchange (HTTP, much as with anonymous ftp) as the underlying file
transport mechanism.
 The Client-Server Model: When one computer system remotely mounts a filesystem that is physically located on another system, the
system which physically owns the files acts as a server, and the system which mounts them is the client. User IDs and group IDs must be
consistent across both systems for the system to work properly. (I.e. this is most applicable across multiple computers managed by the
same organization, shared by a common group of users.) The same computer can be both a client and a server. (E.g. cross-linked file
systems.) The NFS (Network File System) is a classic example of such a system.
 Distributed Information Systems: The Domain Name System, DNS, provides for a unique naming system across all of the Internet.
The Network Information Service, NIS, centralizes storage of user names, host names, and similar information for a group of machines.
Microsoft's Common Internet File System, CIFS, establishes a network login for each user on a networked system with shared file access.
Older Windows systems used domains, and newer systems (2000, XP) use Active Directory. User names must match across the network for this
system to be valid. A newer approach is the Lightweight Directory-Access Protocol, LDAP, which can provide a secure single sign-on for all
users to access all resources on a network. This is a secure system which is gaining in popularity, and which has the maintenance advantage
of combining authorization information in one central location.
 Consistency Semantics: Consistency semantics deals with the consistency between the views of shared files on a networked system. When
one user changes the file, when do other users see the changes?
PROTECTION
Files must be kept safe for reliability (against accidental damage), and protection (against deliberate malicious access). The former is usually
managed with backup copies. This section discusses the latter.
 Types of Access: The following low-level operations are often controlled:
o Read - View the contents of the file.
o Write - Change the contents of the file.
o Execute - Load the file onto the CPU and follow the instructions contained therein.
o Append - Add to the end of an existing file.
o Delete - Remove a file from the system.
o List - View the name and other attributes of files on the system.
Higher-level operations, such as copy, can generally be performed through combinations of the above.
 Access Control: One approach is to have complicated Access Control Lists, ACL, which specify exactly what access is allowed or denied for
specific users or groups. The AFS uses this system for distributed access. Control is very finely adjustable, but may be complicated,
particularly when the specific users involved are unknown. (AFS allows some wild cards, so for example all users on a certain remote
system may be trusted, or a given username may be trusted when accessing from any remote system.)
UNIX uses a set of 9 access control bits, in three groups of three. These correspond to R, W, and X permissions for each of the Owner,
Group, and Others. (See "man chmod" for full details.) The RWX bits control the following privileges for ordinary files and directories:
bit  Files                                 Directories
R    Read (view) file contents.            Read directory contents. Required to get a listing of the directory.
W    Write (change) file contents.         Change directory contents. Required to create or delete files.
X    Execute file contents as a program.   Access detailed directory information. Required to get a long listing, or to access any specific file in the directory.
Note that if a user has X but not R permissions on a directory, they can still access specific files, but only if they already know the
name of the file they are trying to access.
In addition there are some special bits that can also be
applied: The set user ID (SUID) bit and/or the set group
ID (SGID) bit applied to executable files temporarily
change the identity of whoever runs the program to
match that of the owner / group of the executable
program. This allows users running specific programs
to have access to files (while running that program)
which they would normally be unable to access. Setting
of these two bits is usually restricted to root, and must
be done with caution, as it introduces a potential
security leak.
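The nine permission bits and the two special bits can be decoded from a numeric mode with the constants in Python's stat module. This is a small illustrative sketch: the helper names describe_mode and special_bits and the sample modes 0o754 and 0o4755 are chosen for the example and do not come from the notes.

```python
import stat

# Mask constants for each class of user, in (read, write, execute) order.
CLASSES = {
    "owner": (stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR),
    "group": (stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP),
    "others": (stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH),
}

def describe_mode(mode):
    """Return an 'rwx'-style string for each of owner, group, and others."""
    result = {}
    for name, masks in CLASSES.items():
        letters = ""
        for letter, mask in zip("rwx", masks):
            letters += letter if mode & mask else "-"
        result[name] = letters
    return result

def special_bits(mode):
    """Report whether the set-user-ID and set-group-ID bits are set."""
    return {
        "suid": bool(mode & stat.S_ISUID),
        "sgid": bool(mode & stat.S_ISGID),
    }

# 0o754: owner may read/write/execute, group read/execute, others read only.
print(describe_mode(0o754))
# 0o4755: the leading 4 is the SUID bit on an otherwise rwxr-xr-x program.
print(special_bits(0o4755))
```

For a real file the mode would come from os.stat(path).st_mode, which carries these same bits alongside the file-type bits.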
Windows adjusts file access through a simple GUI.
 Other Protection Approaches and Issues:
o Older systems which did not originally have multi-user file access permissions (DOS and older versions of Mac) must now be
retrofitted if they are to share files on a network.
o Access to a file requires access to all the directories along its path as well. In an acyclic-graph or general-graph directory structure,
users may have different access to the same file accessed through different paths.
o Sometimes just the knowledge of the existence of a file of a certain name is a security (or privacy) concern. Hence the distinction
between the R and X bits on UNIX directories.
SUMMARY
 A file is an abstract data type defined and implemented by the operating system. It is a sequence of logical records. A logical record may be
a byte, a line (of fixed or variable length), or a more complex data item. The operating system may specifically support various record types
or may leave that support to the application program.
 The major task for the operating system is to map the logical file concept onto physical storage devices such as magnetic disk or tape. Since
the physical record size of the device may not be the same as the logical record size, it may be necessary to order logical records into
physical records. Again, this task may be supported by the operating system or left for the application program.
 Each device in a file system keeps a volume table of contents or a device directory listing the location of the files on the device. In addition,
it is useful to create directories to allow files to be organized. A single-level directory in a multiuser system causes naming problems, since
each file must have a unique name. A two-level directory solves this problem by creating a separate directory for each user’s files. The
directory lists the files by name and includes the file’s location on the disk, length, type, owner, time of creation, time of last use, and so
on.
 The natural generalization of a two-level directory is a tree-structured directory. A tree-structured directory allows a user to create
subdirectories to organize files. Acyclic-graph directory structures enable users to share subdirectories and files but complicate searching
and deletion. A general graph structure allows complete flexibility in the sharing of files and directories but sometimes requires garbage
collection to recover unused disk space.
 Disks are segmented into one or more volumes, each containing a file system or left “raw.” File systems may be mounted into the system’s
naming structures to make them available. The naming scheme varies by operating system. Once mounted, the files within the volume are
available for use. File systems may be unmounted to disable access or for maintenance.
 File sharing depends on the semantics provided by the system. Files may have multiple readers, multiple writers, or limits on sharing.
Distributed file systems allow client hosts to mount volumes or directories from servers, as long as they can access each other across a
network. Remote file systems present challenges in reliability, performance, and security. Distributed information systems maintain user,
host, and access information so that clients and servers can share state information to manage use and access.
 Since files are the main information-storage mechanism in most computer systems, file protection is needed. Access to files can be
controlled separately for each type of access: read, write, execute, append, delete, list directory, and so on. File protection can be
provided by access lists, passwords, or other techniques.

File system interfacefinal

  • 1.
    1 File-System Interface(Galvin Notes,9th Ed.) Chapter 11: File-SystemInterface  FILE CONCEPT  File Attributes (Name, Identifier, Type, Location, Size, Protection, Time & Date, User ID)  File Operations  File Types  File Structure  Internal File Structure  ACCESS METHODS  Sequential Access  Direct Access  Other Access Methods  DIRECTORY STRUCTURE  Storage Structure  Directory Overview  Single-Level Directory  Two-Level Directory  Tree-Structured Directories  Acyclic-Graph Directories  General Graph Directory  FILE-SYSTEM MOUNTING  FILE SHARING  Multiple Users  Remote File Systems (The Client-Server Model, Distributed Information Systems, Failure Modes)  Consistency Semantics (UNIX Semantics, Session Semantics, Immutable-Shared-Files Semantics)  PROTECTION  Types of Access  Access Control  Other Protection Approaches and Issues Content FILE CONCEPT File Attributes Different OSes keeptrack ofdifferent file attributes, includingName, Identifier (e.g. inode number), Type (Text, executable, other binary, etc.), Location (E.g., Hard drive), Size, Protection, Time & Date, User ID. Some systems give special significance to names, and particularlyextensions (.exe, .txt, etc.), and some do not. Some extensions may be of significance to the OS (.exe), and others only to certain application s (.jpg). File Operations  The file ADT supports manycommonoperations:Creating a file, Writing a file, Readinga file, Repositioning withina file, Deleting a file, Truncating a file.  Information about currently open files is stored in an open file table, containing for example: o File pointer - records the current position in the file, for the next read or write access.
  • 2.
    2 File-System Interface(Galvin Notes,9th Ed.) o File-open count - How manytimes has the current file beenopened(simultaneouslybydifferent processes)andnot yet closed? When this counter reacheszerothe file canbe removed from the table. o Disk location of the file. o Access rights  Some systems provide support for file locking. o A shared lock is for readingonly. o An exclusive lock is for writing as well as reading. o An advisory lock is informationalonly, andnot enforced. (A "Keep Out" sign, whichmaybe ignored.) o A mandatory lock is enforced. (A trulylockeddoor.) UNIXused advisorylocks, andWindows uses mandatorylocks. File Types  Windows (andsome other systems) use specialfile extensions to indicate the type of eachfile. Macintoshstores a creator attribute for eachfile, according to the program that first createdit with the create() system call. Macintosh stores a creator attribute for each file, according to the program that first created it with the create() system call. File Structure  Some files containaninternal structure, whichmayor maynot be knownto the OS. For the OS to support particular file formats increases the size and complexity of the OS.  UNIXtreats all files as sequences of bytes, withnofurther considerationof the internal structure. (With the exception of executable binary programs, which it must know how to load and find the first executable stateme nt, etc.)  Macintosh files have two forks - a resource fork, and a datafork. The resource forkcontains informationrelatingto the UI, such as icons and button images, and can be modified independently of the data fork, which contains the code or data as appropriate. Internal File Structure  Diskfiles are accessed in units of physical blocks, typically512 bytes or some power-of-twomultiple thereof. (Larger physical disks use larger block sizes, to keep the range of block numbers within the range of a 32-bit integer.) 
 Internallyfiles are organizedinunits oflogical units, which maybe as small as a single byte, or maybe a larger size corresponding to some data record or structure size. The number of logical units which fit into one physical block determines its packing, and has animpact on the amount of internal fragmentation(wasted space) that occurs. As a general rule, half a physicalblockis wastedfor eachfile, and the larger the block sizes the more space is lost to internal fragmentation. ACCESS METHODS  Sequential Access: A sequentialaccessfile emulates magnetic tape operation, andgenerallysupports a few operations: a)readnext - read a record and advance the tape to the next position. b) write next - write a record andadvance the tape to the next position. c) rewind d) skipn records - Mayor maynot be supported. N may be limited to positive numbers, or may be limited to +/- 1.  Direct Access: Jump to anyrecord andread that record. Operations supportedinclude: read n - readrecord number n. (Note an argument is nowrequired.) write n - write recordnumber n. (Note an argument is now required.) jump to record n - could be 0 or the end of file. Query current record - used to return back to this record later. Sequential access can be easily emulated using direct access. The inverse is complicated and inefficient.  Other Access Methods: An indexed access scheme canbe easilybuilt on top of a direct accesssystem. Verylarge files mayrequire a multi-tieredindexingscheme, i.e. indexes of indexes. (Lot of cool and relevant content is there in the book for all chapters) DIRECTORY AND DISK STRUCTURE  Storage Structure: A disk can be used in its entirety for a file system. Alternativelya physical diskcanbe broken upinto multiple partitions, slices, or mini-disks, each of which becomes a virtual disk and can have its own filesystem. (or be usedfor raw storage, swapspace, etc.) Or, multiple physical disks can be combinedintoone volume, i.e. 
a larger virtual disk, with its own filesystem spanning the physical disks.  Directory Overview: Directoryoperations to be supported include: a) Search for a file, b)Create a file (addto the directory) C) Delete a file (erase from the
  • 3.
    3 File-System Interface(Galvin Notes,9th Ed.) directory) d) List a directory(possiblyorderedin different ways) e)Rename a file (maychange sorting order) f) Traverse the file system.  Single-Level Directory: Simple to implement, but each file must have a unique name.  Two-Level Directory: Each user gets their own directoryspace. File names only need to be unique within a given user's directory. A master file directory is used to keep track of each users directory, and must be maintained when users are added to or removed from the system. A separate directory is generally needed for system (executable) files. Systems mayor maynot allowusers to access other directories besides their ownIf access to other directories is allowed, then provision must be made to specifythe directorybeing accessed. Ifaccessis denied, then special considerationmust be made for users to run programs locatedinsystemdirectories. A searchpath is the list of directories in which to search for executable programs, and can be set uniquely for each user.  Tree-Structured Directories: This is an obvious extension to the two-tiereddirectorystructure. Eachuser / process has the concept of a current directoryfrom which all (relative) searches take place. Files may be accessed using either absolute pathnames (relative to the root of the tree) or relative pathnames (relative to the current directory.) Directories are storedthe same as any other file in the system, except there is a bit that identifies them as directories, andtheyhave some special structure that the OS understands.  Acyclic-Graph Directories: When the same files needto be accessed in more thanone place inthe directorystructure (e.g. because they are being shared by more than one user / process), it can be useful to provide an acyclic-graph structure. UNIXprovidestwo types of links for implementing the acyclic-graph structure. A hardlink (usuallyjust called a link) involves multiple directoryentries that bothrefer to the same file. 
Hardlinks are onlyvalid for ordinaryfiles in the same filesystem. A symbolic link, that involves a special file, containing information about where to find the linkedfile. Symbolic l inks maybe used to link directories and/or files in other filesystems, as well as ordinary files in the current filesystem. Windows only supports symbolic links, termed short cuts. Hard links require a reference count, or linkcount for each file, keeping track of howmanydirectory entries are currently referring to this file. Whenever one ofthe references is removed the link count is reduced, and whenit reaches zero, the diskspace can be reclaimed.  General-Graph Directory: If cycles are allowedinthe graphs, thenseveralproblems canarise: Search algorithms cango intoinfinite loops. One solutionis to not followlinks in searchalgorithms. (Or not to follow symbolic links, and to only allow symbolic links to refer to directories.) Sub-treescanbecome disconnectedfromthe rest of the tree and still not have their reference counts reduced to zero. Periodic garbage collection is required to detect and resolve thisproblem. (chkdsk in DOS and fsck in UNIX search for these problems, among others, eventhoughcycles are not supposedto be allowedineither system. Disconnecteddiskblocks that are not marked as free are added back to the file systems with made -up file names, and can usually be safely deleted.). Refer Figure 11.3 FILE SYSTEM MOUNTING The basic idea behind mounting file systems is to combine multiple file systems intoone large tree structure. The mount command is given a filesystem to mount anda mount point (directory) on which to attach it. Once a file system is mountedontoa mount point, anyfurther references to that directoryactuallyrefer to the root of the mountedfile system. Any files (or sub-directories)that hadbeenstored inthe mount point directory prior to mounting the newfilesystem are now hidden by the mounted filesystem, and are no longer available. 
For this reason some systems only allow mounting onto empty directories. Filesystems canonlybe mounted by root, unless root has previously
  • 4.
    4 File-System Interface(Galvin Notes,9th Ed.) configured certainfilesystems to be mountable ontocertainpre-determinedmount points. (E.g. root mayallow users to mount floppyfilesystems to /mnt or something like it.) Anyone canrunthe mount commandto see what filesystems are currentlymounted. Filesystems mayb e mounted read- only, or have other restrictions imposed. The traditional Windows OS runs anextendedtwo-tier directorystructure, where the first tier of the structure separatesvolumesbydrive letters, and a tree structure is implemented belowthat level. Macintoshruns a similar system, where each new volume that is found is automatically mountedandaddedto the desktopwhenit is found. More recent Windows systems allow filesystems to be mounted to any directo ry in the filesystem, much like UNIX. FILE SHARING  Multiple Users: On a multi-user system, more informationneeds to be stored for eachfile: The owner (user)whoowns the file, and who can control its access. The groupof other user IDs that mayhave some special access to the file. What access rights are afforded to the owner (User), the Group, andto the rest of the world (the universe, a.k.a. Others.) Some systems have more complicated acces s control, allowing or denying specific accesses to specifically named users or groups.  Remote File Systems: The advent ofthe Internet introduces issuesfor accessing files storedonremote computers The original methodwas ftp, allowing individual files to be transported across systems as needed. Ftp can be either account and password controlled, or anonymous, not requiring anyuser name or password. Various forms of distributedfile systems allow remote filesystems to be mounted onto a local directorystructure, andaccessed using normal file access commands. (The actual filesare still transported across the network as needed, possiblyusingftp as the underlying transport mechanism.) 
o The WWW has made it easy once again to access files on remote systems without mounting their filesystems, generally using anonymous file transfer (essentially a wrapper for ftp-style exchange) as the underlying transport mechanism.
• The Client-Server Model: When one computer system remotely mounts a filesystem that is physically located on another system, the system which physically owns the files acts as a server, and the system which mounts them is the client. User IDs and group IDs must be consistent across both systems for the system to work properly. (I.e. this is most applicable across multiple computers managed by the same organization and shared by a common group of users.) The same computer can be both a client and a server (e.g. cross-linked file systems). NFS (the Network File System) is a classic example of such a system.
• Distributed Information Systems:
o The Domain Name System, DNS, provides host-name-to-network-address translation and a unique naming system across all of the Internet.
o Sun's Network Information Service, NIS, centralizes user names, host names, and similar information for the machines on a network.
o Microsoft's Common Internet File System, CIFS, establishes a network login for each user on a networked system with shared file access. Older Windows systems used domains, and newer systems (Windows 2000, XP) use Active Directory. User names must match across the network for this system to be valid.
o A newer approach is the Lightweight Directory-Access Protocol, LDAP, which can provide a secure single sign-on for all users to access all resources on a network. It is gaining in popularity, and has the maintenance advantage of keeping authorization information in one central location.
• Consistency Semantics: Consistency semantics deal with the consistency between the views of shared files on a networked system. When one user changes a file, when do other users see the changes?
PROTECTION
Files must be kept safe for reliability (against accidental damage) and protection (against deliberate malicious access). The former is usually managed with backup copies. This section discusses the latter.
• Types of Access: The following low-level operations are often controlled:
o Read - View the contents of the file.
o Write - Change the contents of the file.
o Execute - Load the file into memory and execute the instructions it contains.
o Append - Add to the end of an existing file.
o Delete - Remove a file from the system.
o List - View the name and other attributes of files on the system.
Higher-level operations, such as copy, can generally be performed through combinations of the above.
• Access Control: One approach is to have complicated Access Control Lists, ACL, which specify exactly what access is allowed or denied for specific users or groups. AFS (the Andrew File System) uses this approach for distributed access. Control is very finely adjustable, but may be complicated, particularly when the specific users involved are unknown. (AFS allows some wild cards, so for example all users on a certain remote system may be trusted, or a given username may be trusted when accessing from any remote system.)
UNIX uses a set of 9 access control bits, in three groups of three. These correspond to R, W, and X permissions for each of the Owner, Group, and Others. (See "man chmod" for full details.) The RWX bits control the following privileges for ordinary files and directories:
Bit   Files                                  Directories
R     Read (view) file contents.             Read directory contents. Required to get a listing of the directory.
W     Write (change) file contents.          Change directory contents. Required to create or delete files.
X     Execute file contents as a program.    Access detailed directory information. Required to get a long listing, or to access any specific file in the directory.
Note that if a user has X but not R permission on a directory, they can still access specific files, but only if they already know the name of the file they are trying to access.
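The nine permission bits and the directory rules described above can be sketched in a few lines of Python. This is only an illustrative model, not how any particular kernel implements the checks: the function names permission_string and directory_abilities are made up for this example, and the directory rules are a simplification of the table (real systems also require X on a directory to create or delete entries in it, which the sketch assumes).

```python
import stat

# Masks for Owner, Group, and Others, in the order they are displayed.
PERMISSION_BITS = [
    (stat.S_IRUSR, "r"), (stat.S_IWUSR, "w"), (stat.S_IXUSR, "x"),
    (stat.S_IRGRP, "r"), (stat.S_IWGRP, "w"), (stat.S_IXGRP, "x"),
    (stat.S_IROTH, "r"), (stat.S_IWOTH, "w"), (stat.S_IXOTH, "x"),
]


def permission_string(mode: int) -> str:
    """Render the nine access-control bits of `mode` as 'rwxrwxrwx' text."""
    return "".join(letter if mode & mask else "-"
                   for mask, letter in PERMISSION_BITS)


def directory_abilities(r: bool, w: bool, x: bool) -> dict:
    """Hypothetical, simplified model of one R/W/X triple on a directory."""
    return {
        "list_names": r,              # plain listing needs R
        "open_known_name": x,         # reaching a named file needs X
        "long_listing": r and x,      # names (R) plus per-file details (X)
        "create_or_delete": w and x,  # assumption: W alone is not enough
    }


if __name__ == "__main__":
    print(permission_string(0o754))
    print(directory_abilities(r=False, w=False, x=True))
```

Calling directory_abilities(r=False, w=False, x=True) models the case in the note above: the user cannot list the directory, but can still open a file whose name they already know.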
In addition there are some special bits that can also be applied: The set user ID (SUID) bit and/or the set group ID (SGID) bit, applied to an executable file, temporarily change the identity of whoever runs the program to match that of the owner / group of the executable. This allows users running specific programs to have access to files (while running that program) which they would normally be unable to access. These bits can be set only by the file's owner or by root, and must be used with caution, as a SUID program owned by a privileged user is a potential security hole.
Windows lets users adjust file access through a simple GUI.
• Other Protection Approaches and Issues:
o Older systems which did not originally have multi-user file access permissions (DOS and older versions of Mac) must now be retrofitted if they are to share files on a network.
o Access to a file requires access to all the directories along its path as well. In a directory structure where a file can be reached by more than one path (acyclic-graph or general-graph), users may have different access to the same file accessed through different paths.
o Sometimes just the knowledge of the existence of a file of a certain name is a security (or privacy) concern. Hence the distinction between the R and X bits on UNIX directories.
SUMMARY
• A file is an abstract data type defined and implemented by the operating system. It is a sequence of logical records. A logical record may be a byte, a line (of fixed or variable length), or a more complex data item. The operating system may specifically support various record types or may leave that support to the application program.
• The major task for the operating system is to map the logical file concept onto physical storage devices such as magnetic disk or tape. Since the physical record size of the device may not be the same as the logical record size, it may be necessary to order logical records into physical records. Again, this task may be supported by the operating system or left for the application program.
• Each device in a file system keeps a volume table of contents or a device directory listing the location of the files on the device. In addition, it is useful to create directories to allow files to be organized. A single-level directory in a multiuser system causes naming problems, since each file must have a unique name. A two-level directory solves this problem by creating a separate directory for each user's files. The directory lists the files by name and includes the file's location on the disk, length, type, owner, time of creation, time of last use, and so on.
• The natural generalization of a two-level directory is a tree-structured directory. A tree-structured directory allows a user to create subdirectories to organize files. Acyclic-graph directory structures enable users to share subdirectories and files but complicate searching and deletion. A general graph structure allows complete flexibility in the sharing of files and directories but sometimes requires garbage collection to recover unused disk space.
• Disks are segmented into one or more volumes, each containing a file system or left "raw." File systems may be mounted into the system's naming structures to make them available. The naming scheme varies by operating system. Once mounted, the files within the volume are available for use. File systems may be unmounted to disable access or for maintenance.
• File sharing depends on the semantics provided by the system. Files may have multiple readers, multiple writers, or limits on sharing. Distributed file systems allow client hosts to mount volumes or directories from servers, as long as they can access each other across a network. Remote file systems present challenges in reliability, performance, and security. Distributed information systems maintain user, host, and access information so that clients and servers can share state information to manage use and access.
• Since files are the main information-storage mechanism in most computer systems, file protection is needed. Access to files can be controlled separately for each type of access: read, write, execute, append, delete, list directory, and so on. File protection can be provided by access lists, passwords, or other techniques.
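The access-list technique named above can be sketched as a small lookup structure. This is a minimal model under stated assumptions, not the ACL format of AFS or any real system: the class AccessControlList, the "*" wildcard, the first-match-wins rule, the default-deny rule, and the single shared namespace for user and group names are all choices made for this example.

```python
from dataclasses import dataclass, field

# The per-type operations listed in the notes.
OPERATIONS = {"read", "write", "execute", "append", "delete", "list"}


@dataclass
class AccessControlList:
    # Each entry is (principal, operation, allowed).
    # A principal is a user name, a group name, or "*" for anyone.
    entries: list = field(default_factory=list)

    def add(self, principal: str, operation: str, allowed: bool) -> None:
        if operation not in OPERATIONS:
            raise ValueError(f"unknown operation: {operation}")
        self.entries.append((principal, operation, allowed))

    def is_allowed(self, user: str, groups: set, operation: str) -> bool:
        for principal, op, allowed in self.entries:
            matches = principal in ("*", user) or principal in groups
            if op == operation and matches:
                return allowed   # assumption: first matching entry wins
        return False             # assumption: no entry means access denied


def example_acl() -> AccessControlList:
    """Build a small ACL: deny one named user, allow a group, allow one writer."""
    acl = AccessControlList()
    acl.add("mallory", "read", False)   # explicit deny for a named user
    acl.add("staff", "read", True)      # group-wide allow
    acl.add("alice", "write", True)     # per-user allow
    return acl
```

With example_acl(), alice (a member of staff) may read and write, mallory is refused read access even as a member of staff because the deny entry is listed first, and a user with no matching entry is refused by default.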