The Unix File System
Transcript

  • 1. The Unix File System http://www.ucblueash.edu/thomas/Intro_Unix_Text/File_System.html

    Introduction to the Unix File System

    The Unix file system is a methodology for logically organizing and storing large quantities of data so that the system is easy to manage. A file can be informally defined as a collection of (typically related) data, which can be logically viewed as a stream of bytes (i.e. characters). A file is the smallest unit of storage in the Unix file system.

    By contrast, a file system consists of files, their relationships to other files, and the attributes of each file. File attributes are information relating to the file, but do not include the data contained within the file. File attributes for a generic operating system might include (but are not limited to):

      • a file type (i.e. what kind of data is in the file)
      • a file name (which may or may not include an extension)
      • a physical file size
      • a file owner
      • file protection/privacy capability
      • file time stamp (time and date created/modified)

    Additionally, file systems provide tools which allow the manipulation of files, provide a logical organization, and provide services which map the logical organization of files to physical devices.

    From the beginner's perspective, the Unix file system is essentially composed of files and directories. Directories are special files that may contain other files.

    The Unix file system has a hierarchical (or tree-like) structure with its highest level directory called root (denoted by /, pronounced slash). Immediately below the root level directory are several subdirectories, most of which contain system files. Below these can exist system files, application files, and/or user data files. Similar to the concept of the process parent-child relationship, all files on a Unix system are related to one another. That is, files also have a parent-child existence. Thus, all files (except one) share a common parental link, the top-most file (i.e. /) being the exception.

    Below is a diagram (slice) of a "typical" Unix file system. As you can see, the top-most directory is / (slash), with the directories directly beneath being system directories. Note that as Unix implementations and vendors vary, so will this file system hierarchy. However, the organization of most file systems is similar.

    While this diagram is not all inclusive, the following system files (i.e. directories) are present in most Unix file systems:

      • bin - short for binaries, this is the directory where many commonly used executable commands reside
      • dev - contains device specific files
      • etc - contains system configuration files
      • home - contains user directories and files
      • lib - contains library files
      • mnt - serves as a mount point for mounted devices
      • proc - contains files related to system processes
      • root - the root user's home directory (note this is different from /)
      • sbin - system binary files reside here. If there is no sbin directory on your system, these files most likely reside in etc
      • tmp - storage for temporary files which are periodically removed from the file system
      • usr - also contains executable commands

    File Types

    From a user perspective in a Unix system, everything is treated as a file, even devices such as printers and disk drives.

    1 of 6 10/21/2011 9:20 PM
  • 2.
    How can this be, you ask? Since all data is essentially a stream of bytes, each device can be viewed logically as a file.

    All files in the Unix file system can be loosely categorized into 3 types, specifically:

      1. ordinary files
      2. directory files
      3. device files [1]

    While the latter 2 may not intuitively seem like files, they are considered "special" files.

    The first type of file listed above is an ordinary file, that is, a file with no "special-ness". Ordinary files consist of streams of data (bytes) stored on some physical device. Examples of ordinary files include simple text files, application data files, files containing high-level source code, executable text files, and binary image files. Note that unlike some other OS implementations, files do not have to be binary images to be executable (more on this to come).

    The second type of file listed above is a special file called a directory (please don't call it a folder). Directory files act as a container for other files, of any category. Thus we can have a directory file contained within a directory file (this is commonly referred to as a subdirectory). Directory files don't contain data in the user sense of data; they merely contain references to the files contained within them.

    It is perhaps noteworthy at this point to mention that any "file" that has files directly below (contained within) it in the hierarchy must be a directory, and any "file" that does not have files below it in the hierarchy can be an ordinary file or a directory, albeit an empty one.

    The third category of file mentioned above is a device file. This is another special file that is used to describe a physical device, such as a printer or a zip drive. This file contains no data whatsoever; it merely maps any data coming its way to the physical device it describes.
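The three categories above can be checked from the shell with the standard `test` operators. The following is only a sketch: the `classify` function, the scratch directory, and the file name `notes.txt` are made up for illustration, and it assumes a POSIX-style shell with `mktemp` available and a `/dev/null` device file present.

```shell
# classify: print the broad category (as used in the text) of one file spec.
classify() {
    if [ -d "$1" ]; then
        echo "directory"
    elif [ -c "$1" ] || [ -b "$1" ]; then
        echo "device"            # character or block device file
    elif [ -f "$1" ]; then
        echo "ordinary"
    else
        echo "other"
    fi
}

scratch=$(mktemp -d)             # temporary directory used only for this demonstration
: > "$scratch/notes.txt"         # create an empty ordinary file

kind_dir=$(classify "$scratch")
kind_file=$(classify "$scratch/notes.txt")
kind_dev=$(classify /dev/null)

echo "$scratch is a $kind_dir file"
echo "$scratch/notes.txt is an $kind_file file"
echo "/dev/null is a $kind_dev file"

rm -rf "$scratch"                # clean up the scratch directory
```

Note that `test -d` and `test -f` follow symbolic links, so a link is reported according to the file it points to.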
    [1] Device file types typically include: character device files, block device files, Unix domain sockets, named pipes and symbolic links. However, not all of these file types may be present across various Unix implementations.

    File System Navigation

    To begin our discussion of navigating or moving around the file system, the concept of file names must be introduced. It is, after all, the name of a file that allows for its manipulation. In simplest terms, a file name is a named descriptor for a file. It must consist of one or more characters (with a maximum of 255), where the characters can be almost any character that can be interpreted by the shell (except for / since this has a special meaning). However, it is strongly advised that you use file names that are descriptive of the function of the file in question.

    By rule, Unix file names do not have to have ending extensions (such as .txt or .exe) as do some other operating systems. However, certain applications with which you interact may require extensions, such as Adobe's Acrobat Reader (.pdf) or a web browser (.html). And as always, character case matters (don't tell me you have forgotten this already?). Thus the following are all valid Unix file names (note these may be any file type):

      My_Stuff      a file called My_Stuff
      my_stuff      a different file than above
      mortgage.c    a C language program...can you guess what it does?
      a.out         a C language binary executable
      .profile      ksh default startup file

    While file names are certainly important, there is another important related concept, and that is the concept of a file specification (or file spec for short) [1]. A file spec may simply consist of a file name, or it might also include more information about a file, such as where it resides in the overall file system. There are 2 techniques for describing file specifications, absolute and relative.

    With absolute file specifications, the file specification always begins from the root directory, complete and unambiguous.
    Thus, absolute file specs always begin with /. For example, the following are all absolute file specs from the diagram above:

      /etc/passwd
      /bin
      /usr/bin
      /home/mthomas/bin
      /home/mthomas/class_stuff/foo

    Note that the first slash indicates the top of the tree (root), but each succeeding slash in the file spec acts merely as a separator. Also note the files named bin in the file specifications of /bin, /usr/bin, and /home/mthomas/bin are different bin files, due to the differing locations in the file system hierarchy.

    With relative file specifications, the file specification is always related to the user's current position or location in the file system. Thus, the beginning (left-most part) of a relative file spec describes either:

      • an ordinary file, which implies the file is contained within the current directory
      • a directory, which implies a child of the current directory (i.e. one level down)
      • a reference to the parent of the current directory (i.e. one level up)

    What this means then is that a relative file specification that is valid from one file system position is probably not valid from another location. Beginning users often ask "How do I know where I am?" The command to use to find this is the pwd (print working directory) command, which will indicate the user's current position (in absolute form) in the file system.

    As mentioned above, part of a relative file specification can be a reference to a parent directory. The way one references a parent (of the current) directory is with the characters .. (pronounced dot dot). These characters (with no separating spaces) describe the parent directory relative to the current directory, again, one directory level up in the file system.

    The following are examples referencing the diagram above. To identify where we are, we type and the system returns the following:

      $ pwd [Enter]
      /home/mthomas/class_stuff

    Thus the parent of this directory is:

  • 3.
      /home/mthomas    # in absolute form
      ..               # in relative form

    Looking at another example:

      $ pwd [Enter]
      /home/mthomas

    Thus the parent of this directory is:

      /home            # in absolute form
      ..               # in relative form

    And one (note there could be many) child of the /home/mthomas directory is:

      /home/mthomas/bin    # in absolute form
      bin                  # in relative form

    So you ask "How the heck do we use this?" One uses this to navigate or move about the file system. Moving about the file system is accomplished by using the cd command, which allows a user to change directories. In the simplest usage of this command, entering

      $ cd [Enter]

    will move the user to their "home" or login directory. If the user wishes to change to another directory, the user enters

      $ cd file_spec [Enter]

    and assuming file_spec is a valid directory, the user's current working directory will now be this directory location. Remember, the file specification can always be a relative or an absolute specification.

    As before, we type and the system returns the following:

      $ pwd [Enter]
      /home/mthomas/class_stuff

    If we wish to change directories to the /home/mthomas/bin directory, we can type

      $ cd /home/mthomas/bin [Enter]    # absolute, will work from anywhere

    or

      $ cd .. [Enter]                   # relative, move up one directory, i.e. to the parent
      $ cd bin [Enter]                  # relative, move down to the bin directory

    or

      $ cd ../bin [Enter]               # relative, both steps in one file spec

    Novice users sometimes ask which file specification method they should use, or which one is better. The simple and open-ended answer is "it depends." That is, it depends upon how long the file specification is; how easy it is to type, including any special characters; how familiar one is with the current location in the file system hierarchy; and so on.

    Take some time to navigate the file system of your Unix implementation and see how it compares to the diagram above.
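The three ways of reaching the bin directory described above can be tried safely in a scratch copy of the example tree. This is a sketch under stated assumptions: the real /home/mthomas is not assumed to exist, so the tree is recreated under a temporary directory made with `mktemp`, and the variable names are illustrative only.

```shell
# Recreate the relevant slice of the example tree under a scratch directory.
base=$(mktemp -d)
base=$(cd "$base" && pwd -P)     # resolve any symbolic links in the temporary path
mkdir -p "$base/home/mthomas/class_stuff" "$base/home/mthomas/bin"

# Relative, in two steps: up to the parent, then down into bin.
cd "$base/home/mthomas/class_stuff"
cd ..
parent=$(pwd -P)
cd bin
via_two_steps=$(pwd -P)

# Relative, both steps in one file spec.
cd "$base/home/mthomas/class_stuff"
cd ../bin
via_relative=$(pwd -P)

# Absolute: works from anywhere.
cd /
cd "$base/home/mthomas/bin"
via_absolute=$(pwd -P)

echo "parent of class_stuff: $parent"
echo "two relative steps:    $via_two_steps"
echo "one relative spec:     $via_relative"
echo "absolute spec:         $via_absolute"

cd /
rm -rf "$base"                   # clean up the scratch tree
```

All three approaches should report the same directory, which is the point of the "it depends" answer: the choice is about convenience, not about the result.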
    [1] Some might choose to call this a path name or path specifier, but I prefer to call this a file specification since all the individual components are files.

    Additional File Attributes

    As was learned in the Getting Started section, to determine (or see) what files are in our current working directory, we use the ls command. This command allows the user to simply list their files. Using this command with the -l (that's ell) option results in output something like the following from root (/) [1]:

    From the above output, we can observe 7 attribute fields listed for each file. From right to left, the attribute fields are:

      • file name: the name associated with the file (recall, this can be any type of file)
      • modification date: the date the file was last modified, i.e. a "time-stamp". If the file has not been modified within the last year (or six months for Linux), the year of last modification is displayed.

  • 4.
      • size [2]: the size of the file in bytes (i.e. characters)
      • group: associated group for the file
      • owner: the owner of the file
      • number of links: the number of links associated with this file
      • permission modes: the permissions assigned to the file for the owner, the group and all others

    [1] Note this listing is for example purposes and not necessarily an accurate or complete representation of the root (/) directory.

    [2] This is the number of characters in the file, not necessarily the size on disk, since files are written to disk in whole blocks (1024 bytes in this example; the block size varies by file system). Note also that if the file is a directory, this is the size of the structure needed to manage the directory hierarchy.

    Understanding and Modifying File Permissions

    If you look closely at the permission field above, you will notice the permission field for each file consists of 10 characters as described by the diagram below:

    The first (leftmost) character indicates the "type" of the file. Another way to describe this is whether the file has any special attributes associated with it. If it is an ordinary file (i.e. no special attributes), it will have a dash in this first position. If it is a directory file, it will have the letter d in this position. Or, if it is a link to another file, it will have the letter l (ell) in this first position. You can see examples of an ordinary file and a directory in the ls -l output above. Other special attributes exist but do not merit discussion here.

    The next 9 characters are arranged in 3 groups of 3 characters each; that is, 3 characters to describe the permissions for the owner of the file, 3 characters to describe the permissions for the group, and 3 characters for all other users' permissions. The 3 characters indicate whether the particular user has read (denoted by r), write (denoted by w), or execute (denoted by x) permissions on that particular file.
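To make the 10-character layout concrete, the permission field can be sliced into its four parts with `cut`. This is a minimal sketch, not part of the original handout: the scratch file `demo` and the variable names are hypothetical, and the `chmod 754` line simply forces a known set of permissions.

```shell
scratch=$(mktemp -d)             # temporary directory used only for this demonstration
: > "$scratch/demo"              # create an empty ordinary file
chmod 754 "$scratch/demo"        # force a known mode so the output is predictable

# The permission field is the first 10 characters of the ls -l line.
mode=$(ls -ld "$scratch/demo" | cut -c1-10)

type_char=$(echo "$mode" | cut -c1)       # file type: - ordinary, d directory, l link
owner_perms=$(echo "$mode" | cut -c2-4)   # permissions for the owner
group_perms=$(echo "$mode" | cut -c5-7)   # permissions for the group
other_perms=$(echo "$mode" | cut -c8-10)  # permissions for all others

echo "field: $mode"
echo "type:  $type_char"
echo "owner: $owner_perms"
echo "group: $group_perms"
echo "other: $other_perms"

rm -rf "$scratch"                # clean up
```

Some systems print an extra character (such as . or +) after the tenth position; restricting `cut` to columns 1-10 leaves it out.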
    Thus in the diagram above, the owner of the file has all available permissions (indicated by rwx), a user belonging to this group has read and execute (indicated by r-x) permissions, and everyone else also has read and execute permissions. Observe that if a user does not have a particular permission, a dash will appear instead of the corresponding letter.

    Changing the permission of a file is accomplished using the chmod command, that is, to change file protection modes. Changing the permission for a file (or files) can only be done by the owner of the file (or the root user). The usage of this command (using octal mode [1]) is

      chmod ijk file(s)

    where ijk represent 3 octal digits (0-7); i selects the user permissions, j selects the group permissions and k selects the permissions for all others. To set read permission for any grouping (i.e. user, group or other) you add the value of 4 to the respective i, j, or k value; to set write permission, you add 2; and to set execute permission, you add 1. The values of 4, 2, and 1 are derived from the first three powers of two, i.e. 2^2, 2^1, 2^0 respectively.

    Thus to set the user permissions to rwx, you set i to 7 (4 for read + 2 for write + 1 for execute); to set the group and other permissions to r-x, you set j to 5 (4 + 1) and likewise for k. The command to select this would then look as follows:

      chmod 755 file_spec(s)

    To illustrate with another example, if the owner of a file wanted to set read and execute for the user permissions, read only for group permissions and no privileges for all others, the command to set this would be

      chmod 540 file_spec(s)

    which would result in protection modes displayed as -r-xr----- (assuming an ordinary file).

    We can see from these two examples that to allow all permissions, you use the maximum value of 7; to prohibit all permissions, you use the minimum value of 0.
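The 4 + 2 + 1 arithmetic above can be written out as a small shell function. This is an illustrative sketch only: `octal_digit` is a made-up helper, not a standard command, and the scratch file exists just to confirm the result with ls.

```shell
# octal_digit: convert one 3-character group such as r-x into its octal digit.
octal_digit() {
    value=0
    case "$1" in r??) value=$((value + 4)) ;; esac   # read    = 4 (2^2)
    case "$1" in ?w?) value=$((value + 2)) ;; esac   # write   = 2 (2^1)
    case "$1" in ??x) value=$((value + 1)) ;; esac   # execute = 1 (2^0)
    echo "$value"
}

# Read and execute for the user, read only for the group, nothing for others.
mode="$(octal_digit r-x)$(octal_digit r--)$(octal_digit ---)"
echo "octal mode: $mode"

scratch=$(mktemp -d)             # temporary directory used only for this demonstration
: > "$scratch/demo"
chmod "$mode" "$scratch/demo"
shown=$(ls -ld "$scratch/demo" | cut -c1-10)
echo "ls -l shows: $shown"

rm -rf "$scratch"                # clean up
```

Running it should reproduce the second example in the text, mode 540 displayed as -r-xr-----.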
    As a user, you will not have to remember every combination of 4, 2, and 1; typically you will use standard combinations such as 755, 644, 700, etc.

    The permissions of read, write and execute take on a slightly different meaning with respect to directory files. Thus if a file is a directory:

      • read permission determines if a user can view the files contained in a directory, i.e. list the files in it
      • write permission determines if a user can create new files or delete files in the directory. This allows a user with write permission to a directory to delete files in the directory even if they don't have write permission for the file (see Note: on rm below). Watch out for this gotcha!
      • execute permission determines if the user can move (i.e. cd) into the directory

    [1] There is another way to use the chmod command called symbolic mode, which uses symbolic characters instead of numbers. The author prefers octal mode in agreement with [Kernighan & Pike], "the octal mode is easier to use." See the chmod man page for information on using symbolic mode.

    Additional File System Manipulation

    Making changes to the file system can be done in many ways, such as creating files (including directories), making copies of files, moving files about the file system, and deleting files.
  • 5.
    Creating new ordinary files is typically done using application programs. These application programs can be as sophisticated as a large computer-aided design (CAD) program or as simple as a text editing program. The mechanism for opening and saving files to the Unix file system depends upon the application program.

    Creating directory files is done with the command mkdir as follows:

      mkdir file_spec

    where the file_spec is any valid file specification. Keep in mind that to create ordinary or directory files, the user must have write permission for the target location.

    Once files are created and saved, users typically find the need to make copies of various files. Files are copied in Unix using the cp command as follows:

      cp source_file_spec(s) destination_file_spec(s)

    where the source and destination file specifications can be any valid file specification, absolute or relative, as described above. What is important to note here is that there must be 2 valid arguments, a source and a destination. Examples of invalid arguments include:

      • syntactically incorrect file specifications
      • the source or destination argument is missing
      • the source does not exist, or is not readable (see permissions above)
      • the destination is not writable

    Since the cp command mandates 2 distinct arguments, now is a good time to introduce notation to make things a little simpler. This notation uses a single character named "dot", represented by the . (period) character (not to be confused with "dot-dot" (..) above). The dot character has several uses based upon context. In this context, dot will denote your current working directory, in other words, where you are. Thus if you wish to copy a file and have the destination be your current working directory and maintain the same file name, you can simply do:

      cp source_file_spec(s) .
    The dot is the placeholder for the 2nd argument and results in all source files being copied to the current working directory with the exact same file names. Note that without the dot, there is only one argument, which will result in an error from cp because the destination is missing.

    Moving files is very similar to copying files, the difference being that with copy, the source files remain intact, while with move, the source files no longer exist in their original location. The command to move files is mv and the syntax is analogous to copy:

      mv source_file_spec(s) destination_file_spec(s)

    Some examples using the cp command, based upon the diagram above and given the working directory as follows:

      $ pwd [Enter]
      /home/stu1
      $ cp /home/mthomas/class_stuff/foo /home/stu1/foo    # pure absolute form
      $ cp ../mthomas/class_stuff/foo foo                  # relative form, same file name
      $ cp ../mthomas/class_stuff/foo new_foo              # relative, new file name
      $ cp ../mthomas/class_stuff/foo /home/stu1           # no file specified, original name assumed
      $ cp ../mthomas/class_stuff/foo .                    # dot replaces 2nd argument

    Users can also remove files (including directories) from the file system. The command to delete ordinary files from the file system is rm and its syntax is

      rm file_spec(s)

    Note: if a file has no write permission and the standard input comes from a terminal, most newer versions of rm will prompt the user for whether to remove the file (assuming no -f flag). If any yes (or y or Y) response is given, the file will be removed even though write permission is disabled. This behavior can be circumvented by disabling the write permission on the parent directory. Refer to the rm manual pages for additional details.

    In similar fashion, users can remove directories from the file system using the rmdir command as follows:

      rmdir file_spec(s)

    Keep in mind that both of these commands require valid file specifications as well as sufficient permissions.
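The copy, move and remove commands above can be exercised end to end in a scratch tree. This is a sketch under assumptions: the directory names mirror the examples (mthomas, stu1, foo) but live under a temporary directory created with `mktemp`, and old_foo is a made-up name used to show a rename.

```shell
top=$(mktemp -d)                                 # scratch stand-in for /home
mkdir -p "$top/mthomas/class_stuff" "$top/stu1"
echo "sample" > "$top/mthomas/class_stuff/foo"

cd "$top/stu1"
cp ../mthomas/class_stuff/foo .                  # dot: copy here, keep the name foo
cp ../mthomas/class_stuff/foo new_foo            # relative, new file name
mv new_foo old_foo                               # move within a directory, i.e. rename

# The source of a copy remains intact; the source of a move does not.
if [ -f ../mthomas/class_stuff/foo ]; then source_intact=yes; else source_intact=no; fi
if [ -e new_foo ]; then moved_source_left=yes; else moved_source_left=no; fi

# The gotcha: a file without write permission can still be removed,
# because removal is governed by write permission on the directory.
chmod 444 foo
rm -f foo                                        # -f suppresses the confirmation prompt
remaining=$(ls)

rm old_foo
cd "$top"
rmdir stu1                                       # rmdir only removes empty directories
if [ -d "$top/stu1" ]; then stu1_removed=no; else stu1_removed=yes; fi

echo "source intact after cp: $source_intact"
echo "old name left after mv: $moved_source_left"
echo "left in stu1 after rm:  $remaining"
echo "stu1 removed by rmdir:  $stu1_removed"

cd /
rm -rf "$top"                                    # clean up the scratch tree
```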
    File System Miscellany

    In working with the Unix file system, understanding a few miscellaneous concepts can be helpful. The first of these is the capability to specify multiple files. This is accomplished using something called a metacharacter, that is, a character that may have special meaning other than the "literal" meaning of the character. The first of these to discuss is called the wildcard character, represented by the * (asterisk) character. Using the * character alone represents (matches) all files in the specified location. For example, the command:

      $ cp /home/mthomas/class_stuff/* .

    would copy all files in the class_stuff directory to your current location in the file system tree, keeping all names the same. Note that with cp the wildcard character should only be used for the source specification. Similarly, the command:

      $ chmod 755 *

    would change the file protection modes to -rwxr-xr-x for all files in your current directory.

    You can also use the wildcard character to selectively match the names of files. For example, a* would match (or select) all files that start with the letter a. Looking at an example:

      $ ls net*
      $ chmod 755 net*

    the first command would list all files starting with the three characters "net", while the second would change the mode (as mentioned above) for all files starting with these three characters.

  • 6.
    Note that when using wildcards as part of a file name, no spaces can be between the literal characters and the wildcard character.

    One should be careful using the wildcard character though, as this can have dangerous results. Can you tell what the following command does?

      $ rm *

    Another topic which can aid in working within the file system is using the copy (cp) command when working with directories. If one wishes to copy all files within a directory to another location, one could simply use the command:

      $ cp * destination_file_spec

    However, if any of the files to be copied are directories themselves, the cp program will report an error, looking something like:

      cp: omitting directory `foo'

    The workaround to this is to use the -r (recursive copy) option with cp, as follows:

      $ cp -r * destination_file_spec

    Command Summary

      • cat - display entire file(s) in the terminal window (also see more command below)
      • cd - change the current directory
      • cp - copy file(s)
      • chmod - change file(s) protection modes
      • lpr - send file(s) to the line printer
      • mkdir - create a new directory
      • more - nicely view the contents of file(s)
      • mv - move (and/or rename) file(s)
      • pwd - print absolute pathname of current working directory
      • rm - remove file(s)
      • rmdir - remove an empty directory

    ©2003, Mark A. Thomas. All Rights Reserved.
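The wildcard matching and recursive copy described in the File System Miscellany section can be tried without risk in two scratch directories. This is only a sketch: the file names (net1, net2.conf, other, sub/inner) are invented for the demonstration, and `mktemp` is assumed to be available.

```shell
src=$(mktemp -d)                 # scratch source directory
dest=$(mktemp -d)                # scratch destination, kept outside the source
cd "$src"

: > net1                         # two files whose names start with "net"
: > net2.conf
: > other                        # one file that does not
mkdir sub
: > sub/inner                    # a file inside a subdirectory

# The shell expands net* into the matching names before the command runs.
matches=$(echo net*)
echo "net* matches: $matches"

# Plain "cp * dest" would skip the directory sub; -r copies it recursively.
cp -r * "$dest"
if [ -f "$dest/sub/inner" ]; then nested_copied=yes; else nested_copied=no; fi
if [ -f "$dest/other" ]; then plain_copied=yes; else plain_copied=no; fi

echo "nested file copied: $nested_copied"
echo "plain file copied:  $plain_copied"

cd /
rm -rf "$src" "$dest"            # clean up both scratch directories
```

Previewing a pattern with echo or ls, as done here with net*, is a cautious habit before handing the same pattern to rm.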