Operating Systems Slides 7 - File Systems
handouts version:
http://cs2.swfu.edu.cn/~wx672/lecture_notes/os/slides/fs-a.pdf

Presentation Transcript

  • File Systems
    Wang Xiaolin
    June 19, 2013
    wx672ster+os@gmail.com
  • Long-term Information Storage Requirements
    ▶ Must store large amounts of data
    ▶ Stored information must survive the termination of the process using it
    ▶ Multiple processes must be able to access the information concurrently
  • File-System Structure
    File-system design addresses two problems:
    1. Defining how the FS should look to the user
       ▶ defining a file and its attributes
       ▶ the operations allowed on a file
       ▶ directory structure
    2. Creating algorithms and data structures to map the logical FS onto the physical disk
  • File System: A Layered Design
    Layers, top to bottom: APPs, Logical FS, File-org module, Basic FS, I/O ctrl, Devices
    ▶ Logical file system: manages metadata information
      - maintains all of the file-system structure (directory structure, FCB)
      - responsible for protection and security
    ▶ File-organization module
      - translates logical block addresses into physical block addresses
      - keeps track of free blocks
    ▶ Basic file system: issues generic commands to the device driver, e.g.
      - "read drive 1, cylinder 72, track 2, sector 10"
    ▶ I/O control: device drivers and interrupt handlers
      - device driver: translates high-level commands into hardware-specific instructions
  • The Operating Structure
    Layers, top to bottom: APPs, Logical FS, File-org module, Basic FS, I/O ctrl, Devices
    Example: to create a file
    1. APP calls creat()
    2. Logical FS
       2.1 allocates a new FCB
       2.2 updates the in-memory directory structure
       2.3 writes it back to disk
       2.4 calls the file-organization module
    3. File-organization module
       3.1 maps the directory I/O into disk-block numbers
       3.2 allocates blocks for storing the file's data
    Benefit of layered design:
    The I/O control code, and sometimes the basic file system code, can be used by multiple file systems.
  • File: A Logical View of Information Storage
    User's view:
    A file is the smallest storage unit on disk.
    ▶ Data cannot be written to disk unless they are within a file.
    UNIX view:
    Each file is a sequence of 8-bit bytes.
    ▶ It's up to the application program to interpret this byte stream.
  • File: What Is Stored in a File?
    Source code, object files, executable files, shell scripts, PostScript, ...
    Different types of files have different structures:
    ▶ UNIX looks at the contents to determine the type
      - shell scripts start with "#!"
      - PDF files start with "%PDF..."
      - executables start with a magic number
    ▶ Windows uses file naming conventions
      - executables end with ".exe" and ".com"
      - MS-Word files end with ".doc"
      - MS-Excel files end with ".xls"
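The content-based check described for UNIX can be sketched in a few lines of C. This is only an illustration: the function name `guess_type` and the returned labels are made up here, and the ELF signature (0x7f 'E' 'L' 'F') is used as one common example of an executable magic number, since the slide does not name a specific format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Guess a file's type from its leading bytes (hypothetical helper).
 * Only the three cases named on the slide are recognised. */
const char *guess_type(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len >= 2 && memcmp(buf, "#!", 2) == 0)
        return "script";                 /* shell scripts start with "#!" */
    if (len >= 4 && memcmp(buf, "%PDF", 4) == 0)
        return "pdf";                    /* PDF files start with "%PDF" */
    if (len >= 4 && memcmp(buf, "\x7f" "ELF", 4) == 0)
        return "elf-executable";         /* ELF is one executable format */
    return "unknown";
}
```

A caller would read the first few bytes of a file with read() and pass them in; the file name is never consulted, in contrast to the extension-based convention.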
  • File Naming
    Varies from system to system:
    ▶ Name length?
    ▶ Characters? Digits? Special characters?
    ▶ Extension?
    ▶ Case sensitive?
  • File Types
    Regular files: ASCII, binary
    Directories: maintain the structure of the FS
    In UNIX, everything is a file:
    Character special files: I/O related, such as terminals, printers, ...
    Block special files: devices that can contain file systems, i.e. disks
    Disks are, logically, linear collections of blocks; the disk driver translates them into physical block addresses.
  • Binary files
    [Figure: (a) A UNIX executable file: a header (magic number, text size, data size, BSS size, symbol table size, entry point, flags) followed by text, data, relocation bits, and the symbol table. (b) A UNIX archive: a sequence of object modules, each preceded by a header (module name, date, owner, protection, size).]
  • File Attributes: Metadata
    ▶ Name: the only information kept in human-readable form
    ▶ Identifier: unique tag (number) that identifies the file within the file system
    ▶ Type: needed for systems that support different types
    ▶ Location: pointer to the file's location on the device
    ▶ Size: current file size
    ▶ Protection: controls who can do reading, writing, executing
    ▶ Time, date, and user identification: data for protection, security, and usage monitoring
  • File Operations
    POSIX file system calls
    1. fd = creat(name, mode)
    2. fd = open(name, flags)
    3. status = close(fd)
    4. byte_count = read(fd, buffer, byte_count)
    5. byte_count = write(fd, buffer, byte_count)
    6. offset = lseek(fd, offset, whence)
    7. status = link(oldname, newname)
    8. status = unlink(name)
    9. status = truncate(name, size)
    10. status = ftruncate(fd, size)
    11. status = stat(name, buffer)
    12. status = fstat(fd, buffer)
    13. status = utimes(name, times)
    14. status = chown(name, owner, group)
    15. status = fchown(fd, owner, group)
    16. status = chmod(name, mode)
    17. status = fchmod(fd, mode)
  • An Example Program Using File System Calls
        /* File copy program. Error checking and reporting is minimal. */
        #include <sys/types.h>                  /* include necessary header files */
        #include <fcntl.h>
        #include <stdlib.h>
        #include <unistd.h>

        int main(int argc, char *argv[]);       /* ANSI prototype */

        #define BUF_SIZE 4096                   /* use a buffer size of 4096 bytes */
        #define OUTPUT_MODE 0700                /* protection bits for output file */

        int main(int argc, char *argv[])
        {
            int in_fd, out_fd, rd_count, wt_count;
            char buffer[BUF_SIZE];

            if (argc != 3) exit(1);             /* syntax error if argc is not 3 */

            /* Open the input file and create the output file */
            in_fd = open(argv[1], O_RDONLY);    /* open the source file */
            if (in_fd < 0) exit(2);             /* if it cannot be opened, exit */
            out_fd = creat(argv[2], OUTPUT_MODE); /* create the destination file */
            if (out_fd < 0) exit(3);            /* if it cannot be created, exit */

            /* Copy loop; the headers above do not define TRUE, so use 1 */
            while (1) {
                rd_count = read(in_fd, buffer, BUF_SIZE);   /* read a block of data */
                if (rd_count <= 0) break;       /* if end of file or error, exit loop */
                wt_count = write(out_fd, buffer, rd_count); /* write data */
                if (wt_count <= 0) exit(4);     /* wt_count <= 0 is an error */
            }

            /* Close the files */
            close(in_fd);
            close(out_fd);
            if (rd_count == 0)                  /* no error on last read */
                exit(0);
            else
                exit(5);                        /* error on last read */
        }
  • open()
    fd = open(pathname, flags)
    A per-process open-file table is kept in the OS
    ▶ upon a successful open() syscall, a new entry is added to this table
    ▶ the table is indexed by file descriptor (fd)
    To see the files opened by a process, e.g. init:
        ~$ lsof -p 1
    Why is open() needed?
    To avoid constant searching
    ▶ Without open(), every file operation would involve searching the directory for the file.
  • Directories: Single-Level Directory Systems
    All files are contained in the same directory.
    [Fig. 6-7. A single-level directory system containing four files, owned by three different people, A, B, and C.]
    Limitations:
    - name collision
    - file searching
    Often used on simple embedded devices, such as telephones, digital cameras, ...
  • Directories: Two-Level Directory Systems
    A separate directory for each user.
    [Fig. 6-8. A two-level directory system. The letters indicate the owners of the directories and files.]
    Limitation: hard to access other users' files
  • Directories: Hierarchical Directory Systems
    [Fig. 6-9. A hierarchical directory system: a root directory, user directories, user subdirectories, and user files.]
  • Directories: Path Names
    [Figure: a directory tree rooted at ROOT with top-level directories bin, boot, dev, etc, home, var; deeper levels include grub, passwd, staff, stud, mail, wx672, 20081152001, dir, file.]
  • Directories: Directory Operations
    Create, Delete, Rename, Link, Opendir, Closedir, Readdir, Unlink
  • File System Implementation
    A typical file system layout:

    |<---------------------- Entire disk ------------------------>|
    +-----+-------------+-------------+-------------+-------------+
    | MBR | Partition 1 | Partition 2 | Partition 3 | Partition 4 |
    +-----+-------------+-------------+-------------+-------------+

    One partition, expanded:
    +---------------+-----------------+--------------------+---//--+
    | Boot Ctrl Blk | Volume Ctrl Blk | Dir Structure      | Files |
    | (MBR copy)    | (Super Blk)     | (inodes, root dir) | dirs  |
    +---------------+-----------------+--------------------+---//--+

    |<-------------Master Boot Record (512 Bytes)------------>|
    0         439        443    445               509     511
    +----//-----+----------+------+------//---------+---------+
    | code area | disk-sig | null | partition table | MBR-sig |
    |    440    |    4     |  2   |    16x4=64      | 0xAA55  |
    +----//-----+----------+------+------//---------+---------+
  • On-Disk Information Structure
    Boot control block: an MBR copy
      UFS: boot block
      NTFS: partition boot sector
    Volume control block: contains volume details
      number of blocks, size of blocks
      free-block count, free-block pointers
      free FCB count, free FCB pointers
      UFS: superblock
      NTFS: Master File Table
    Directory structure: organizes the files
    File control block (FCB): contains file details (metadata)
      UFS: i-node
      NTFS: stored in the MFT using a relational database structure, with one row per file
  • Each File System Has a Superblock
    The superblock keeps information about the file system:
    ▶ Type: ext2, ext3, ext4, ...
    ▶ Size
    ▶ Status: how it's mounted, free blocks, free inodes, ...
    ▶ Information about other metadata structures
        ~# dumpe2fs /dev/sda1 | grep -i superblock
  • Implementing Files: Contiguous Allocation
    ▶ It is easy to retrieve a single block: if a file starts at block b, the i-th block of the file is at block b + i − 1.
    ▶ External fragmentation will occur, making it difficult to find contiguous blocks of space of sufficient length. From time to time, a compaction algorithm has to be run.
    [Figure 12.7 Contiguous File Allocation, on a disk of 35 blocks numbered 0 to 34]
    File Allocation Table
    File Name   Start Block   Length
    File A      2             3
    File B      9             5
    File C      18            8
    File D      30            2
    File E      26            3
    Summary:
    - simple
    - good for read-only use
    - fragmentation
  • Linked List (Chained) Allocation
    A pointer in each disk block.
    ▶ The file allocation table needs just a single entry for each file, showing the starting block and the length of the file.
    ▶ Although preallocation is possible, it is more common simply to allocate blocks as needed.
    ▶ The selection of blocks is a simple matter: any free block can be added to a chain. There is no external fragmentation to worry about.
    [Figure 12.9 Chained Allocation, on a disk of 35 blocks numbered 0 to 34]
    File Allocation Table
    File Name   Start Block   Length
    File B      1             5
    Summary:
    - no wasted blocks
    - slow random access
    - the data area of a block is no longer 2^n bytes, because the pointer takes up part of the block
  • Linked List (Chained) Allocation
    Though there is no external fragmentation, consolidation is still preferred.
    ▶ This type of physical organization is best suited to sequential files that are to be processed sequentially.
    [Figure 12.10 Chained Allocation (After Consolidation), on a disk of 35 blocks numbered 0 to 34]
    File Allocation Table
    File Name   Start Block   Length
    File B      0             5
  • FAT: Linked List Allocation with a Table in RAM
    ▶ Take the pointer out of each disk block and put it into a table in memory
    ▶ fast random access (the chain is in RAM)
    ▶ the data area of a block is 2^n bytes again
    ▶ the entire table must be in RAM: larger disk ⇒ larger FAT ⇒ more RAM used
    [Fig. 6-14. Linked list allocation using a file allocation table in main memory.]
    Physical block   Next block
    0                (unused)
    1                (unused)
    2                10
    3                11
    4                7     ← File A starts here
    5                (unused)
    6                3     ← File B starts here
    7                2
    8                (unused)
    9                (unused)
    10               12
    11               14
    12               -1
    13               (unused)
    14               -1
    15               (unused)
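Random access through an in-memory FAT amounts to following the chain in the table instead of on disk. The sketch below hard-codes the table from Fig. 6-14; the names `fat`, `fat_nth_block`, and the `FAT_FREE` marker are assumptions made for this illustration and do not correspond to any real FAT driver.

```c
#include <assert.h>
#include <stddef.h>

#define FAT_EOF  (-1)   /* end-of-chain marker, as in the figure */
#define FAT_FREE (-2)   /* hypothetical marker for unused entries */

/* fat[i] is the block that follows block i in its file's chain. */
static const int fat[16] = {
    FAT_FREE, FAT_FREE, 10, 11, 7, FAT_FREE, 3, 2,
    FAT_FREE, FAT_FREE, 12, 14, FAT_EOF, FAT_FREE, FAT_EOF, FAT_FREE
};

/* Return the physical block holding logical block n (0-based) of the
 * file whose chain starts at `start`, or -1 if the file is shorter. */
int fat_nth_block(int start, int n)
{
    int blk = start;
    assert(start >= 0 && start < 16 && n >= 0);
    while (n-- > 0) {
        if (blk < 0)
            return -1;          /* ran off the end of the chain */
        blk = fat[blk];         /* one table lookup, no disk access */
    }
    return blk < 0 ? -1 : blk;
}
```

With this table, file A (starting at block 4) occupies blocks 4, 7, 2, 10, 12 and file B (starting at block 6) occupies blocks 6, 3, 11, 14. Each step is a memory lookup, which is why random access is fast as long as the whole table fits in RAM.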
  • Indexed Allocation
    [Figure 12.11 Indexed Allocation with Block Portions, on a disk of 35 blocks numbered 0 to 34]
    File Allocation Table
    File Name   Index Block
    File B      24
    Index block 24 lists the file's blocks: 1, 8, 3, 14, 28
    Bit tables: this method uses a vector containing one bit for each block on the disk. Each 0 entry corresponds to a free block, and each 1 corresponds to a block in use. For example, for the disk layout of Figure 12.7, a vector of length 35 is needed and would have the following value:
        00111000011111000011111111111011000
    ▶ i-node: a data structure for each file
    ▶ an i-node is in memory only if the file is open: more files opened ⇒ more RAM used
  • I-node: FCB in UNIX
    File inode (128 B):
      Type, Mode
      User ID, Group ID
      File size, # blocks
      # links, Flags
      Timestamps (×3)
      Triple indirect
      Double indirect
      Single indirect
      Direct blocks (×12)
    The block pointers lead to the file data blocks.
    File type   Description
    0           Unknown
    1           Regular file
    2           Directory
    3           Character device
    4           Block device
    5           Named pipe
    6           Socket
    7           Symbolic link
    Mode: 9-bit pattern
  • Inode Quiz
    Given:
      block size is 1 KB
      pointer size is 4 B
    Addressing:
      byte offset 9000
      byte offset 350,000
    Inode block pointers:
      0: 4096     1: 228     2: 45423     3: 0       4: 0
      5: 11111    6: 0       7: 101       8: 367     9: 0
      S (single indirect): 428   (covers 10K up to 10K+256K)
      D (double indirect): 9156
      T (triple indirect): 824
    Byte 9000 in a file:
      8th block, 808th byte ⇒ pointer 8 ⇒ data block 367
    Byte 350,000 in a file:
      double indirect block 9156, entry 0 ⇒ single indirect block 331, entry 75 ⇒ data block 3333, 816th byte
    What about the ZEROs?
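The quiz arithmetic can be checked with a small C sketch that turns a byte offset into the chain of indices to follow. It assumes the quiz parameters (1 KB blocks, 4-byte pointers, 10 direct pointers); the type `struct blkpath` and the function `resolve` are names invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

enum level { DIRECT, SINGLE, DOUBLE, TRIPLE };

struct blkpath {
    enum level level;   /* which inode pointer the lookup starts from */
    uint32_t idx[3];    /* index used at each step, outermost first */
    uint32_t byte;      /* byte offset inside the data block */
};

#define BLOCK_SIZE 1024u
#define NDIRECT    10u
#define PER_BLOCK  (BLOCK_SIZE / 4u)   /* 256 pointers per indirect block */

struct blkpath resolve(uint64_t offset)
{
    struct blkpath p = { DIRECT, {0, 0, 0}, (uint32_t)(offset % BLOCK_SIZE) };
    uint64_t n = offset / BLOCK_SIZE;          /* logical block number */

    if (n < NDIRECT) { p.idx[0] = (uint32_t)n; return p; }
    n -= NDIRECT;
    if (n < PER_BLOCK) { p.level = SINGLE; p.idx[0] = (uint32_t)n; return p; }
    n -= PER_BLOCK;
    if (n < (uint64_t)PER_BLOCK * PER_BLOCK) {
        p.level = DOUBLE;
        p.idx[0] = (uint32_t)(n / PER_BLOCK);  /* entry in the double indirect block */
        p.idx[1] = (uint32_t)(n % PER_BLOCK);  /* entry in the single indirect block */
        return p;
    }
    n -= (uint64_t)PER_BLOCK * PER_BLOCK;
    assert(n < (uint64_t)PER_BLOCK * PER_BLOCK * PER_BLOCK);
    p.level = TRIPLE;
    p.idx[0] = (uint32_t)(n / ((uint64_t)PER_BLOCK * PER_BLOCK));
    p.idx[1] = (uint32_t)((n / PER_BLOCK) % PER_BLOCK);
    p.idx[2] = (uint32_t)(n % PER_BLOCK);
    return p;
}
```

For offset 9000 this yields direct pointer 8 and byte 808; for offset 350,000 it yields the double indirect pointer with entries 0 and 75 and byte 816, matching the quiz.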
  • UNIX In-Core Data Structure
    ▶ mount table: info about each mounted FS
    ▶ directory-structure cache: holds the directory info of recently accessed directories
    ▶ inode table: an in-core version of the on-disk inode table
    ▶ file table
      ▶ global
      ▶ keeps the inode of each open file
      ▶ keeps track of
        ▶ how many processes are associated with each open file
        ▶ where the next read and write will start
        ▶ access rights
    ▶ user file descriptor table
      ▶ per process
      ▶ identifies all open files for a process
  • UNIX In-Core Data Structure
    open()/creat():
    1. adds an entry to each table
    2. returns a file descriptor, an index into the user file descriptor table
  • A File Is Opened by Multiple Processes?
    Two levels of internal tables in the OS:
    A per-process table (a.k.a. the file descriptor table) tracks all files that a process has open. It stores
    ▶ the current-file-position pointer (not really)
    ▶ access rights
    ▶ more...
    A system-wide table keeps process-independent information, such as
    ▶ the location of the file on disk
    ▶ access dates
    ▶ file size
    ▶ file open count: the number of processes that have this file open
  • Per-process FDT and System-wide Open-file Table

    Per-process FDT                 System-wide open-file table
    Process 1
    +------------------+            +------------------+
    | ...              |            | ...              |
    +------------------+            +------------------+
    | Position pointer |            | ...              |
    | Access rights    |----------->+------------------+
    | ...              |            | Location on disk |
    +------------------+            | R/W              |
    | ...              |            | Access dates     |
    +------------------+            | File size        |
                                    | Pointer to inode |
    Process 2                       | File-open count  |
    +------------------+            | ...              |
    | Position pointer |----------->+------------------+
    | Access rights    |            | ...              |
    | ...              |            +------------------+
    +------------------+
    | ...              |
    +------------------+
  • A process executes the following code:
        fd1 = open("/etc/passwd", O_RDONLY);
        fd2 = open("local", O_RDWR);
        fd3 = open("/etc/passwd", O_WRONLY);
    Resulting tables:
    user FDT        file table          inode table
    0  STDIN
    1  STDOUT
    2  STDERR
    3         --->  count 1, R    --->  (/etc/passwd), count 2
    4         --->  count 1, RW   --->  (local), count 1
    5         --->  count 1, W    --->  (/etc/passwd), count 2 (the same inode entry as above)
  • One more process B:
        fd1 = open("/etc/passwd", O_RDONLY);
        fd2 = open("private", O_RDONLY);
    Resulting tables:
    user FDT (proc A)   file table          inode table
    0  STDIN
    1  STDOUT
    2  STDERR
    3             --->  count 1, R    --->  (/etc/passwd), count 3
    4             --->  count 1, RW   --->  (local), count 1
    5             --->  count 1, W    --->  (/etc/passwd), count 3
    user FDT (proc B)
    0  STDIN
    1  STDOUT
    2  STDERR
    3             --->  count 1, R    --->  (/etc/passwd), count 3
    4             --->  count 1, R    --->  (private), count 1
  • Why File Table?
    To allow a parent and child to share a file position, but to provide unrelated processes with their own values.
    [Figure: the parent's and the child's file descriptor tables point to the same open file description (file position, R/W, pointer to i-node); an unrelated process's file descriptor table points to a separate open file description. Both descriptions point to the same i-node (mode, link count, uid, gid, file size, times, addresses of first 10 disk blocks, single indirect, double indirect, triple indirect).]
  • Why File Table?
    Where to put the file position info?
    Inode table? No. Multiple processes can open the same file, and each one has its own file position.
    User file descriptor table? No. That causes trouble in file sharing.
    Example:
        #!/bin/bash
        echo hello
        echo world
    Where should the "world" be?
        ~$ ./hello.sh > A
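The sharing rule can be observed from user space. In the sketch below, dup() stands in for the parent/child case (both descriptors refer to one file-table entry, as after fork()), while a second open() creates an independent entry. The helper names `make_scratch` and `position_seen` and the scratch-file contents are assumptions made for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Create a scratch file containing "hello world\n". `path` must be a
 * writable mkstemp() template; returns 0 on success, -1 on failure. */
int make_scratch(char *path)
{
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, "hello world\n", 12);
    close(fd);
    return n == 12 ? 0 : -1;
}

/* Read 5 bytes through fd_a, then report the position seen through fd_b. */
off_t position_seen(int fd_a, int fd_b)
{
    char buf[5];
    ssize_t n = read(fd_a, buf, sizeof buf);
    assert(n == 5);
    return lseek(fd_b, 0, SEEK_CUR);
}
```

If `a` comes from open(), `b` from dup(a), and `c` from a second open() of the same file, then `position_seen(a, b)` reports 5 because `a` and `b` share one file-table entry, while `position_seen(a, c)` reports 0 because `c` has its own entry. The same mechanism lets the two echo commands in the script append to A one after the other instead of overwriting each other.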
  • Implementing Directories
    [Fig. 6-16. (a) A simple directory containing fixed-size entries (games, mail, news, work) with the disk addresses and attributes in the directory entry. (b) A directory in which each entry just refers to an i-node, a data structure containing the attributes.]
    (a) A simple directory (Windows)
      ▶ fixed-size entries
      ▶ disk addresses and attributes in the directory entry
    (b) A directory in which each entry just refers to an i-node (UNIX)
  • How Long Can a File Name Be?
    [Figure: two ways of storing long file names in a directory. (a) Each entry holds its entry length, the file's attributes, and the name itself. (b) Each entry holds a pointer to the file's name and the file's attributes, with the names kept together in a heap.]
  • UNIX Treats a Directory as a File
    [Figure: a directory inode (128 B) and a file inode (128 B) have the same layout: type, mode, user ID, group ID, file size, # blocks, # links, flags, timestamps (×3), direct blocks (×12), single indirect, double indirect, triple indirect. The directory inode's pointers lead to directory blocks, which hold (name, inode #) entries such as "passwd" and "fstab"; further directory blocks are reached through an indirect block. The file inode's pointers lead to file data blocks; the double and triple indirect pointers each lead to a block with 512 single or double indirect entries.]
    Example:
    Name     Inode #
    .        2
    ..       2
    bin      11116545
    boot     2
    cdrom    12
    dev      3
    ...
  • The steps in looking up /usr/ast/mbox
    1. Root directory: looking up usr yields i-node 6
         1 .    1 ..    4 bin    7 dev    14 lib    9 etc    6 usr    8 tmp
    2. I-node 6 is for /usr (mode, size, times): it says that /usr is in block 132
    3. Block 132 is the /usr directory: /usr/ast is i-node 26
         6 .    1 ..    19 dick    30 erik    51 jim    26 ast    45 bal
    4. I-node 26 is for /usr/ast (mode, size, times): it says that /usr/ast is in block 406
    5. Block 406 is the /usr/ast directory: /usr/ast/mbox is i-node 60
         26 .    6 ..    64 grants    92 books    60 mbox    81 minix    17 src
  • File Sharing: Multiple Users
    User IDs identify users, allowing permissions and protections to be per-user.
    Group IDs allow users to be in groups, permitting group access rights.
    Example: 9-bit pattern
                     octal   bits    meaning
    owner access     7       1 1 1   rwx
    group access     5       1 0 1   r-x
    public access    0       0 0 0   ---
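The 9-bit pattern maps mechanically onto the familiar "rwx" notation. The helper below is a minimal sketch; the name `mode_to_string` is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Render the low 9 permission bits of `mode` as "rwxrwxrwx"-style text.
 * `out` must have room for 10 bytes (9 characters plus the terminator). */
void mode_to_string(unsigned mode, char out[10])
{
    static const char letters[] = "rwxrwxrwx";
    assert(out != NULL);
    for (int i = 0; i < 9; i++) {
        unsigned bit = 1u << (8 - i);   /* bit 8 is owner-read, bit 0 is public-execute */
        out[i] = (mode & bit) ? letters[i] : '-';
    }
    out[9] = '\0';
}
```

Calling it with the octal mode 0750 from the example fills the buffer with "rwxr-x---".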
  • File Sharing: Remote File Systems
    Uses networking to allow file system access between systems
    ▶ Manually, via programs like FTP
    ▶ Automatically and seamlessly, using distributed file systems
    ▶ Semi-automatically, via the World Wide Web
    The client-server model allows clients to mount remote file systems from servers
    ▶ NFS: standard UNIX client-server file sharing protocol
    ▶ CIFS: standard Windows protocol
    ▶ Standard system calls are translated into remote calls
    Distributed information systems (distributed naming services)
    ▶ such as LDAP, DNS, NIS, and Active Directory, implement unified access to the information needed for remote computing
  • File Sharing: Protection
    ▶ The file owner/creator should be able to control:
      ▶ what can be done
      ▶ by whom
    ▶ Types of access
      ▶ Read
      ▶ Write
      ▶ Execute
      ▶ Append
      ▶ Delete
      ▶ List
  • Shared Files: Hard Links vs. Soft Links
    [Fig. 6-18. File system containing a shared file.]
  • Hard Links
    Hard links refer to the same inode.
  • Drawback
    [Figure: (a) C's directory points to a file with Owner = C, Count = 1. (b) B's directory and C's directory both point to the file: Owner = C, Count = 2. (c) Only B's directory points to the file: Owner = C, Count = 1.]
  • Symbolic Links
    A symbolic link has its own inode and its own directory entry.
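The difference between the two kinds of link shows up in the inode numbers and the link count reported by lstat(). The sketch below builds a small scenario in a fresh temporary directory; the function names and the file names "orig", "hard", and "soft" are assumptions chosen for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create "orig", a hard link "hard", and a symbolic link "soft" in a new
 * temporary directory and chdir() into it. Returns 0 on success. */
int setup_links(void)
{
    char dir[] = "/tmp/links-XXXXXX";
    if (mkdtemp(dir) == NULL || chdir(dir) != 0)
        return -1;
    int fd = open("orig", O_CREAT | O_WRONLY, 0600);
    if (fd < 0)
        return -1;
    close(fd);
    if (link("orig", "hard") != 0)
        return -1;
    return symlink("orig", "soft");
}

/* Inode number of `path` itself (a symbolic link is not followed). */
ino_t inode_of(const char *path)
{
    struct stat st;
    int rc = lstat(path, &st);
    assert(rc == 0);
    return st.st_ino;
}

/* Link count stored in the inode of `path`. */
nlink_t links_of(const char *path)
{
    struct stat st;
    int rc = lstat(path, &st);
    assert(rc == 0);
    return st.st_nlink;
}
```

After `setup_links()`, "orig" and "hard" report the same inode number with a link count of 2, while "soft" reports a different inode. Removing "orig" with unlink() drops the count to 1 and leaves the data reachable through "hard", whereas "soft" is left dangling.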
  • Disk Space Management: Statistics
    [Figure not included in the transcript.]
  • Disk Space Management: Block Size
    ▶ The block size is chosen when the FS is created
    ▶ Disk I/O performance conflicts with space utilization
      ▶ smaller block size ⇒ better space utilization
      ▶ larger block size ⇒ better disk I/O performance
        ~$ dumpe2fs /dev/sda1 | grep "Block size"
  • Keeping Track of Free Blocks
    1. Linked list
       [Figure: linked free-space list on disk, on a disk of 32 blocks numbered 0 to 31; a free-space list head points to the first free block, and each free block points to the next one.]
    2. Bit map (n blocks)
        0 1 2 3 4 5 6 7 8  ..  n-1
       +-+-+-+-+-+-+-+-+-+-//-+-+
       |0|0|1|0|1|1|1|0|1| .. |0|
       +-+-+-+-+-+-+-+-+-+-//-+-+
       bit[i] = 0 ⇒ block[i] is free
       bit[i] = 1 ⇒ block[i] is occupied
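A bit map needs only a handful of bit operations. The following is a minimal sketch using the slide's convention (0 = free, 1 = occupied); the function names and the choice to number bits from the least significant bit of each byte are assumptions of this example, not something the slide specifies.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bit i of the map describes block i: 0 = free, 1 = occupied. */
int bit_test(const unsigned char *map, size_t i)
{
    return (map[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1;
}

void bit_set(unsigned char *map, size_t i)
{
    map[i / CHAR_BIT] |= (unsigned char)(1u << (i % CHAR_BIT));
}

void bit_clear(unsigned char *map, size_t i)
{
    map[i / CHAR_BIT] &= (unsigned char)~(1u << (i % CHAR_BIT));
}

/* Allocate the first free block among n blocks: mark it occupied and
 * return its number, or return -1 if every block is in use. */
long alloc_block(unsigned char *map, size_t n)
{
    assert(map != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!bit_test(map, i)) {
            bit_set(map, i);
            return (long)i;
        }
    }
    return -1;
}
```

Starting from the pattern on the slide (blocks 2, 4, 5, 6, and 8 occupied), successive allocations over the first nine blocks return 0, 1, 3, and 7, and then fail; freeing a block is a single `bit_clear()`.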
  • Journaling File Systems
    Operations required to remove a file in UNIX:
    1. Remove the file from its directory
       - set the inode number to 0
    2. Release the i-node to the pool of free i-nodes
       - clear the bit in the inode bitmap
    3. Return all the disk blocks to the pool of free disk blocks
       - clear the bits in the block bitmap
    What if a crash occurs between 1 and 2, or between 2 and 3?
  • Journaling File Systems
    Keep a log of what the file system is going to do before it does it
    ▶ so that if the system crashes before it can do its planned work, upon rebooting the system can look in the log to see what was going on at the time of the crash and finish the job.
    ▶ NTFS, EXT3, and ReiserFS, among others, use journaling
  • Ext2 File System
    Physical layout:

    +------------+---------------+---------------+--//--+---------------+
    | Boot Block | Block Group 0 | Block Group 1 |      | Block Group n |
    +------------+---------------+---------------+--//--+---------------+

    One block group, expanded:
    +-------+-------------+------------+--------+--------+--------+
    | Super | Group       | Data Block | inode  | inode  | Data   |
    | Block | Descriptors | Bitmap     | Bitmap | Table  | Blocks |
    +-------+-------------+------------+--------+--------+--------+
      1 blk     n blks        1 blk      1 blk    n blks   n blks
  • Ext2 Block Groups
    The partition is divided into block groups:
    ▶ Block groups are the same size, which makes locating them easy
    ▶ The kernel tries to keep a file's data blocks in the same block group, to reduce fragmentation
    ▶ Critical info is backed up in each block group
    ▶ The Ext2 inodes for each block group are kept in the inode table
    ▶ The inode bitmap keeps track of allocated and unallocated inodes
  • Group Descriptor
    ▶ Each block group has a group descriptor
    ▶ All the group descriptors together make up the group descriptor table
    ▶ The table is stored along with the superblock
    ▶ Block Bitmap: tracks free blocks
    ▶ Inode Bitmap: tracks free inodes
    ▶ Inode Table: all inodes in this block group
    ▶ Free blocks count, free inodes count, used directory count: counters
    ▶ see more: ~# dumpe2fs /dev/sda1
  • Ext2 Block Allocation Policies
    [Figure 15.9 ext2fs block-allocation policies: allocating scattered free blocks versus allocating continuous free blocks. Legend: block in use, free block, block selected by allocator, bit boundary, byte boundary, bitmap search.]
  • Maths
    Given block size = 4K and block bitmap = 1 block, then
        blocks per group = 8 bits × 4K = 32K
    How large is a group?
        group size = 32K × 4K = 128 MB
    How many block groups are there?
        ≈ partition size / group size = partition size / 128 MB
    How many files can I have at most?
        ≈ partition size / block size = partition size / 4K
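The same arithmetic, written as a C sketch so that other block sizes can be tried. The function names are invented for this example, and rounding the group count up is an assumption (the slide only gives an approximation).

```c
#include <assert.h>
#include <stdint.h>

/* One bitmap block holds 8 bits per byte, one bit per data block. */
uint64_t blocks_per_group(uint64_t block_size)
{
    return 8 * block_size;
}

/* Bytes covered by one block group. */
uint64_t group_size(uint64_t block_size)
{
    return blocks_per_group(block_size) * block_size;
}

/* Number of block groups needed to cover a partition, rounded up. */
uint64_t group_count(uint64_t partition_size, uint64_t block_size)
{
    uint64_t g = group_size(block_size);
    assert(g > 0);
    return (partition_size + g - 1) / g;
}
```

With 4 KB blocks this gives 32,768 blocks per group and 128 MB per group, so a 1 GB partition is covered by 8 block groups.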
  • Ext2 inode
    [Figure not included in the transcript.]
  • Ext2 inode
    Mode: holds two pieces of information
      1. Is it a {file|dir|sym-link|blk-dev|char-dev|FIFO}?
      2. Permissions
    Owner info: owner's ID of this file or directory
    Size: the size of the file in bytes
    Timestamps: accessed, created, last modified time
    Datablocks: 15 pointers to data blocks (12 + S + D + T)
  • Max File Size
    Given: block size = 4K, pointer size = 4 B
    Pointers per indirect block = 4K / 4 B = 1K
    We get:
    Max file size = number of pointers × block size
                  = (12 [direct] + 1K [single indirect] + 1K × 1K [double indirect] + 1K × 1K × 1K [triple indirect]) × 4K
                  = 48K + 4M + 4G + 4T
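The sum can be reproduced with a short C function. This sketch covers only the addressing limit derived above (other limits may apply in a real implementation), and the name `max_file_size` and its parameters are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Largest file addressable through `ndirect` direct pointers plus one
 * single, one double, and one triple indirect pointer. */
uint64_t max_file_size(uint64_t block_size, uint64_t ptr_size, unsigned ndirect)
{
    assert(ptr_size > 0 && block_size >= ptr_size);
    uint64_t k = block_size / ptr_size;      /* pointers per indirect block */
    uint64_t blocks = ndirect + k + k * k + k * k * k;
    return blocks * block_size;
}
```

For 4 KB blocks, 4-byte pointers, and 12 direct pointers the result is 48 KB + 4 MB + 4 GB + 4 TB, that is 4,402,345,721,856 bytes.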
  • Ext2 Superblock
    ▶ Magic Number: 0xEF53
    ▶ Revision Level: determines what new features are available
    ▶ Mount Count and Maximum Mount Count: determine whether the system should be fully checked
    ▶ Block Group Number: indicates the block group holding this superblock
    ▶ Block Size: usually 4K
    ▶ Blocks per Group: 8 bits × block size
    ▶ Free Blocks: system-wide free blocks
    ▶ Free Inodes: system-wide free inodes
    ▶ First Inode: first inode number in the file system
    ▶ see more: ~# dumpe2fs /dev/sda1
  • Ext2 File Types
    File type   Description
    0           Unknown
    1           Regular file
    2           Directory
    3           Character device
    4           Block device
    5           Named pipe
    6           Socket
    7           Symbolic link
    Device file, pipe, and socket: no data blocks are required. All info is stored in the inode.
    Fast symbolic link: a short path name (< 60 chars) needs no data block. It can be stored in the 15 pointer fields.
  • Ext2 Directories
    A directory block, drawn as one row:

    0           11|12          23|24               39|40
    +----+--+-+-+----+----+--+-+-+----+----+--+-+-+----+----+--//--+
    | 21 |12|1|2|.   | 22 |12|2|2|..  | 53 |16|5|2|hell|o   |      |
    +----+--+-+-+----+----+--+-+-+----+----+--+-+-+----+----+--//--+

    The same block, one entry per row:

        ,-----------> inode number
        |   ,-------> record length
        |   |  ,----> name length
        |   |  | ,--> file type
        |   |  | |  ,-> name
      +----+--+-+-+----+
     0| 21 |12|1|2|.   |
      +----+--+-+-+----+
    12| 22 |12|2|2|..  |
      +----+--+-+-+----+----+
    24| 53 |16|5|2|hell|o   |
      +----+--+-+-+----+----+
    40| 67 |28|3|2|usr |
      +----+--+-+-+----+----+
    52| 0  |16|7|1|oldf|ile |
      +----+--+-+-+----+----+
    68| 34 |12|4|2|sbin|
      +----+--+-+-+----+

    ▶ Directories are special files
    ▶ "." and ".." come first
    ▶ Entries are padded to a multiple of 4 bytes
    ▶ inode number 0 means a deleted file
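The record lengths in the figure follow from an 8-byte fixed header (4-byte inode number, 2-byte record length, 1-byte name length, 1-byte file type) plus the name, padded to a multiple of 4. A minimal sketch, with `DIRENT_HEADER` and `min_rec_len` as made-up names:

```c
#include <assert.h>
#include <stddef.h>

/* Fixed part of an entry as drawn on the slide: inode number (4 bytes),
 * record length (2), name length (1), file type (1). */
#define DIRENT_HEADER 8u

/* Smallest record length for a name of `name_len` bytes, padded to 4. */
unsigned min_rec_len(unsigned name_len)
{
    assert(name_len >= 1 && name_len <= 255);
    return (DIRENT_HEADER + name_len + 3u) & ~3u;
}
```

This gives 12 for "." and "..", 16 for "hello", and 12 for "usr". The figure shows 28 for "usr" because its record has been extended by the 16 bytes of the deleted "oldfile" entry that follows it, so a reader of the block skips from offset 40 straight to offset 68.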
  • Many Different FS Are in Use
    Windows:
    uses a drive letter (C:, D:, ...) to identify each FS
    UNIX:
    integrates multiple FS into a single structure
    ▶ From the user's view, there is only one FS hierarchy
        ~$ man fs
  • Virtual File Systems
    Put the common parts of all FS in a separate layer
    ▶ It's a layer in the kernel
    ▶ It's a common interface to several kinds of file systems
    ▶ It calls the underlying concrete FS to actually manage the data
    [Figure: a user process makes POSIX calls into the virtual file system, which reaches the concrete file systems FS 1, FS 2, and FS 3 through the VFS interface; the file systems sit on top of the buffer cache.]
  • Virtual File System
    ▶ Manages kernel-level file abstractions in one format for all file systems
    ▶ Receives system call requests from user level (e.g. write, open, stat, link)
    ▶ Interacts with a specific file system based on mount point traversal
    ▶ Receives requests from other parts of the kernel, mostly from memory management
    Real File Systems
    ▶ manage file and directory data
    ▶ manage metadata: timestamps, owners, protection, etc.
    ▶ translate between disk data, NFS data, ... and VFS data
  • File System Mounting
    [Fig. 10-26. (a) Separate file systems, one on a hard disk and one on a diskette. (b) After mounting.]
  • A FS Must Be Mounted Before It Can Be Used
    Mount: the file system is registered with the VFS
    ▶ The superblock is read into the VFS superblock
    ▶ The table of addresses of functions the VFS requires is read into the VFS superblock
    ▶ The FS's topology info is mapped onto the VFS superblock data structure
    The VFS keeps a list of the mounted file systems together with their superblocks.
    The VFS superblock contains:
    ▶ Device, block size
    ▶ Pointer to the root inode
    ▶ Pointer to a set of superblock routines
    ▶ Pointer to the file_system_type data structure
    ▶ more...
  • V-node
    ▶ Every file/directory in the VFS has a VFS inode, kept in the VFS inode cache
    ▶ The real FS builds the VFS inode from its own info
    Like the EXT2 inodes, the VFS inodes describe
    ▶ files and directories within the system
    ▶ the contents and topology of the Virtual File System
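  • VFS Operation
    [Figure: an entry in the process table leads to the process's file descriptors; a file descriptor leads to a v-node in the VFS; the v-node holds function pointers (open, read, write) that lead to the read function in FS 1. The last step is the call from the VFS into FS 1.]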
  • Linux VFS
    The Common File Model
    All other filesystems must map their own concepts into the common file model. For example, FAT filesystems do not have inodes.
    ▶ The main components of the common file model are
      - superblock: information about a mounted filesystem
      - inode: information about a specific file
      - file: information about an open file
      - dentry: information about a directory entry
    ▶ Geared toward Unix FS
  • The Superblock Object
    ▶ is implemented by each FS and is used to store information describing that specific FS
    ▶ usually corresponds to the filesystem superblock or the filesystem control block
    ▶ Filesystems that are not disk-based (such as sysfs, proc) generate the superblock on-the-fly and store it in memory
    ▶ struct super_block in <linux/fs.h>
    ▶ s_op in struct super_block points to struct super_operations, the superblock operations table
    ▶ Each item in this table is a pointer to a function that operates on a superblock object
  • The Inode Object
    ▶ For Unix-style filesystems, this information is simply read from the on-disk inode
    ▶ For others, the inode object is constructed in memory in whatever manner is applicable to the filesystem
    ▶ struct inode in <linux/fs.h>
    ▶ An inode represents each file on a FS, but the inode object is constructed in memory only as files are accessed
    ▶ includes special files, such as device files or pipes
    ▶ i_op points to struct inode_operations
  • The Dentry Object
    ▶ represents the components in a path
    ▶ makes path name lookup easier
    ▶ struct dentry in <linux/dcache.h>
    ▶ created on-the-fly from a string representation of a path name
  • Dentry State
    ▶ used
    ▶ unused
    ▶ negative
    Dentry Cache
    consists of three parts:
    1. Lists of "used" dentries
    2. A doubly linked "least recently used" list of unused and negative dentry objects
    3. A hash table and hashing function used to quickly resolve a given path into the associated dentry object
  • The File Object
    ▶ is the in-memory representation of an open file
    ▶ open() ⇒ create; close() ⇒ destroy
    ▶ there can be multiple file objects in existence for the same file
      ▶ because multiple processes can open and manipulate a file at the same time
    ▶ struct file in <linux/fs.h>