  1. 1. Operating System Allen C.-H. Wu Department of Computer Science Tsing Hua University
  2. 2. Part I: Overview Ch. 1 Introduction <ul><li>Operating system: a program that acts as an intermediary between a user and the computer hardware. The goals are to make the computer system convenient to use and to run it in an efficient manner. </li></ul><ul><li>Why, what and how? </li></ul><ul><li>DOS, Windows, UNIX, Linux </li></ul><ul><li>Single-user, multi-user </li></ul>
  3. 3. 1.1 What Is an Operating System <ul><li>OS as government: resource allocator => CPU, memory, I/O, storage </li></ul><ul><li>OS as control program: controls the execution of user programs to prevent errors and improper use of the computer. </li></ul><ul><li>Convenience for the user and efficient operation of the computer system </li></ul>(Figure: layered view — users, system and application programs, operating system, hardware)
  4. 4. 1.2 Mainframe Systems <ul><li>Batch systems </li></ul><ul><li>Multiprogrammed systems </li></ul><ul><li>Time-sharing systems </li></ul>
  5. 5. Batch Systems <ul><li>In the early days (before the PC era), computers were extremely expensive; only a few institutions could afford one. </li></ul><ul><li>The common I/O devices included card readers, tape drives, and line printers. </li></ul><ul><li>To speed up processing, operators batched together jobs with similar needs and ran them through the computer as a group. </li></ul><ul><li>The OS was simple: it only needed to transfer control automatically from one job to the next. </li></ul>
  6. 6. Batch Systems <ul><li>Speed(CPU) >> speed(I/O: card readers) => the CPU is constantly idle. </li></ul><ul><li>After disk technology was introduced, the OS could keep all jobs on a disk instead of a serial card reader, and could perform job scheduling (Ch. 6) to run tasks more efficiently. </li></ul>
  7. 7. Multiprogrammed Systems <ul><li>Multiprogramming: the OS keeps several jobs in memory simultaneously, interleaving CPU and I/O operations among different jobs to maximize CPU utilization. </li></ul><ul><li>Real-life example: a lawyer handles multiple cases for many clients. </li></ul><ul><li>Multiprogramming is the first instance where the OS must make decisions for the users: job scheduling and CPU scheduling. </li></ul>
  8. 8. Time-Sharing Systems <ul><li>Time sharing or multitasking: the CPU executes multiple jobs by switching among them, but the switches are so quick and so frequent that the users can interact with each program while it is running (each user thinks that he/she is the only user). </li></ul><ul><li>A time-sharing OS uses CPU scheduling and multiprogramming to provide each user with a small portion of a time-shared computer. </li></ul><ul><li>Process: a program that has been loaded into memory and is executing. </li></ul>
  9. 9. Time-Sharing Systems <ul><li>Need memory management and protection methods (Ch. 9) </li></ul><ul><li>Virtual memory (Ch. 10) </li></ul><ul><li>File systems (Ch. 11) </li></ul><ul><li>Disk management (Ch. 13) </li></ul><ul><li>CPU scheduling (Ch. 6) </li></ul><ul><li>Synchronization and communication (Ch. 7) </li></ul>
  10. 10. 1.3 Desktop Systems <ul><li>MS-DOS, Microsoft Windows, Linux, IBM OS/2, Macintosh OS </li></ul><ul><li>Mainframes (MULTICS: MIT) => minicomputers (DEC: VMS, Bell Labs: UNIX) => microcomputers => network computers </li></ul><ul><li>Personal workstation: a large PC (Sun, HP, IBM: Windows NT, UNIX) </li></ul><ul><li>PCs are mainly single-user systems: no resource sharing is needed; but with Internet access, security and protection are needed </li></ul><ul><li>Worms and viruses </li></ul>
  11. 11. 1.4 Multiprocessor Systems <ul><li>Multiprocessor systems: tightly coupled systems </li></ul><ul><li>Why? 1) improved throughput, 2) cost savings from sharing resources (peripherals, storage, and power), and 3) increased reliability (graceful degradation, fault tolerance) </li></ul><ul><li>Symmetric multiprocessing: each processor runs an identical copy of the OS; requires communication between processors </li></ul><ul><li>Asymmetric multiprocessing: one master control processor, master-slave </li></ul>
  12. 12. Multiprocessor Systems <ul><li>Back-ends </li></ul><ul><li>=> microprocessors have become inexpensive </li></ul><ul><li>=> additional microprocessors can off-load some OS functions (e.g., using a microprocessor system to control disk management) </li></ul><ul><li>a kind of master-slave multiprocessing </li></ul>
  13. 13. 1.5 Distributed Systems <ul><li>Network, TCP/IP, ATM protocols </li></ul><ul><li>Local-area network (LAN) </li></ul><ul><li>Wide-area network (WAN) </li></ul><ul><li>Metropolitan-area network (MAN) </li></ul><ul><li>Client-server systems (computer-server, file server) </li></ul><ul><li>Peer-to-peer systems (WWW) </li></ul><ul><li>Network operating systems </li></ul>
  14. 14. 1.6 Clustered Systems <ul><li>High availability: one can monitor one or more of the others (over the LAN). If the monitored one fails, the monitoring machine will take ownership of its storage, and restart the applications that were running on the failed machine. </li></ul><ul><li>Asymmetric and symmetric modes </li></ul>
  15. 15. 1.7 Real-Time Systems <ul><li>There are rigid time requirements on the operation of a processor or control/data flow </li></ul><ul><li>Hard real-time systems: the critical tasks must be guaranteed to be completed on time </li></ul><ul><li>Soft real-time systems: a critical real-time task gets priority over other tasks </li></ul>
  16. 16. 1.8 Handheld Systems <ul><li>PDAs (personal digital assistants) - Palm Pilots and cellular phones. </li></ul><ul><li>Considerations: small memory size, slow processor speed, and low power consumption. </li></ul><ul><li>Web clipping </li></ul>
  17. 17. 1.9 Feature Migration <ul><li>MULTICS (MULTIplexed Information and Computing Services) operating system: MIT -> GE 645 </li></ul><ul><li>UNIX: Bell Labs -> PDP-11 </li></ul><ul><li>Microsoft Windows NT, IBM OS/2, Macintosh OS </li></ul>
  18. 18. 1.10 Computing Environments <ul><li>Traditional computing: network, firewalls </li></ul><ul><li>Web-based computing </li></ul><ul><li>Embedded computing </li></ul>
  19. 19. Ch. 2 Computer-System Structures (Figure: system organization — CPU, disk controller, printer controller, tape-drive controller, and memory controller, connecting memory, disks, printers, and tape drives over a system bus)
  20. 20. 2.1 Computer-System Operation <ul><li>Bootstrap program </li></ul><ul><li>Modern OSs are interrupt driven </li></ul><ul><li>Interrupt vector: a table of addresses of the service routines for the interrupting devices </li></ul><ul><li>System call (e.g., performing an I/O operation) </li></ul><ul><li>Trap </li></ul>
  21. 21. 2.2 I/O Structure <ul><li>SCSI (small computer-systems interface): can attach up to seven devices per controller </li></ul><ul><li>Synchronous I/O: I/O requested => I/O started => I/O completed => control returned to the user program </li></ul><ul><li>Asynchronous I/O: I/O requested => I/O started => control returned to the user program without waiting for the completion of the I/O operation </li></ul><ul><li>Device-status table: indicates the device’s type, address, and state (busy, idle, not functioning) </li></ul>
  22. 22. I/O Structure <ul><li>DMA (Direct Memory Access) </li></ul><ul><li>Data transfer between high-speed I/O devices and main memory </li></ul><ul><li>Block transfer with a single interrupt, without CPU intervention (versus CPU-driven transfer of 1 byte/word at a time) </li></ul><ul><li>Cycle stealing </li></ul><ul><li>A back-end microprocessor? </li></ul>
  23. 23. 2.3 Storage Structure <ul><li>Main memory: RAM (SRAM and DRAM) </li></ul><ul><li>von Neumann architecture: instruction register </li></ul><ul><li>Memory-mapped I/O, programmed I/O (PIO) </li></ul><ul><li>Secondary memory </li></ul><ul><li>Magnetic disks, floppy disks </li></ul><ul><li>Magnetic tapes </li></ul>
  24. 24. 2.4 Storage Hierarchy <ul><li>Bridging speed gap </li></ul><ul><li>registers=>cache=>main memory=>electronic disk=>magnetic disk=>optical disk=>magnetic tapes </li></ul><ul><li>Volatile storage: data lost when power is off </li></ul><ul><li>Nonvolatile storage: storage systems below electronic disk are nonvolatile </li></ul><ul><li>Cache: small size but fast (cache management: hit and miss) </li></ul><ul><li>Coherency and consistency </li></ul>(FIG)
  25. 25. 2.5 Hardware Protection <ul><li>Resource sharing (multiprogramming) improves utilization but also increases problems </li></ul><ul><li>Many programming errors are detected by the hardware and reported to the OS (e.g., memory fault) </li></ul><ul><li>Dual-mode operation: user mode and monitor mode (also called supervisor, system, or privileged mode: privileged instructions), indicated by a mode bit. </li></ul><ul><li>Whenever a trap occurs, the hardware switches from user mode to monitor mode </li></ul>
  26. 26. Hardware Protection <ul><li>I/O protection: all I/O instructions are privileged instructions. The user can perform I/O operations only through the OS. </li></ul><ul><li>Memory protection: protect the OS from access by user programs, and protect user programs from each other: base and limit registers. </li></ul><ul><li>CPU protection: a timer prevents a user program from getting stuck in an infinite loop. </li></ul>
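The base-and-limit check above can be sketched in a few lines of C. This is an illustrative model, not real hardware: the register values are made-up examples, and on a real CPU the comparison happens in hardware on every user-mode memory access, trapping to the OS on failure.

```c
#include <stdbool.h>
#include <stdint.h>

/* Memory protection with base and limit registers: an address issued in
   user mode is legal only if it lies in [base, base + limit). The OS
   loads these registers with privileged instructions in monitor mode. */
bool address_ok(uint32_t addr, uint32_t base, uint32_t limit) {
    return addr >= base && addr < base + limit;
}
```

An out-of-range address would cause the hardware to trap to the OS, which typically terminates the offending program with a memory-fault error.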
  27. 27. 2.6 Network Structure <ul><li>LAN: covers a small geographical area; twisted-pair and fiber-optic cabling; high speed; Ethernet. </li></ul><ul><li>WAN: ARPANET (academic research), routers, modems. </li></ul>
  28. 28. CH. 3 OS Structure <ul><li>Examining the services that an OS provides </li></ul><ul><li>Examining the interface between the OS and users </li></ul><ul><li>Disassembling the system into components and their interconnections </li></ul><ul><li>OS components: </li></ul><ul><li>=> Process management </li></ul><ul><li>=> Main-memory management </li></ul><ul><li>=> File management </li></ul><ul><li>=> I/O-system management </li></ul><ul><li>=> Secondary-storage management </li></ul><ul><li>=> Networking </li></ul><ul><li>=> Protection system </li></ul><ul><li>=> Command-interpreter </li></ul>
  29. 29. 3.1 System Components Process Management <ul><li>Process: a program in execution (e.g., a compiler, a word-processing program) </li></ul><ul><li>A process needs certain resources (e.g., CPU, memory, files and I/O devices) to complete its task. When the process terminates, the OS will reclaim any reusable resources. </li></ul><ul><li>OS processes and user processes: The execution of each process must be sequential. All the processes can potentially execute concurrently, by multiplexing the CPU among them. </li></ul>
  30. 30. Process Management <ul><li>The OS should perform the following tasks: </li></ul><ul><li>Creating and deleting processes </li></ul><ul><li>Suspending and resuming processes </li></ul><ul><li>Providing mechanisms for process synchronization </li></ul><ul><li>Providing mechanisms for process communication </li></ul><ul><li>Providing mechanisms for deadlock handling </li></ul><ul><li>=> Ch. 4- Ch. 7 </li></ul>
  31. 31. Main-Memory Management <ul><li>Main memory is a repository of quickly accessible data shared by the CPU and I/O devices (stores data as well as programs) </li></ul><ul><li>Data in main memory is accessed by absolute address </li></ul><ul><li>Each memory-management scheme requires its own hardware support </li></ul><ul><li>The OS should be responsible for the following tasks: </li></ul><ul><li>=> Tracking which parts of memory are currently used and by whom </li></ul><ul><li>=> Deciding which processes should be loaded into memory </li></ul><ul><li>=> Allocating and deallocating memory as needed </li></ul>
  32. 32. File Management <ul><li>Different I/O devices have different characteristics (e.g., access speed, capacity, access method) - physical properties </li></ul><ul><li>File: a collection of related information defined by its creator. The OS provides a logical view of information storage (the FILE) regardless of its physical properties </li></ul><ul><li>Directories => files (organizer) => access rights for multiple users </li></ul>
  33. 33. File Management <ul><li>The OS should be responsible for: </li></ul><ul><li>Creating and deleting files </li></ul><ul><li>Creating and deleting directories </li></ul><ul><li>Supporting primitives for manipulating files and directories </li></ul><ul><li>Mapping files onto secondary storage </li></ul><ul><li>Backing up files on nonvolatile storage </li></ul><ul><li>=> Ch. 11 </li></ul>
  34. 34. I/O-System Management <ul><li>An OS should hide the peculiarities of specific hardware devices from the user </li></ul><ul><li>The I/O subsystem consists of: </li></ul><ul><li>A memory-management component including buffering, caching, and spooling </li></ul><ul><li>A general device-driver interface </li></ul><ul><li>Drivers for specific hardware devices </li></ul>
  35. 35. Secondary-Storage Management <ul><li>Most modern computer systems use disks as the principal on-line storage medium, for both programs and data </li></ul><ul><li>Most programs are stored on disk and loaded into main memory whenever they are needed </li></ul><ul><li>The OS should be responsible for: </li></ul><ul><li>=> Free-space management </li></ul><ul><li>=> Storage allocation </li></ul><ul><li>=> Disk scheduling </li></ul><ul><li>=> Ch. 13 </li></ul>
  36. 36. Networking <ul><li>Distributed system: a collection of independent processors connected through a communication network </li></ul><ul><li>FTP: file transfer protocol </li></ul><ul><li>WWW: NFS (network file system protocol) </li></ul><ul><li>HTTP </li></ul><ul><li>=> Ch. 14- Ch. 17 </li></ul>
  37. 37. Protection System <ul><li>For a multi-user/multi-process system: process execution must be protected </li></ul><ul><li>Any mechanism for controlling access to programs, data, and resources </li></ul><ul><li>Authorized and unauthorized access and usage </li></ul>
  38. 38. Command-Interpreter System <ul><li>OS (kernel) <=> command interpreter (shell) <=> user </li></ul><ul><li>Control statements </li></ul><ul><li>A mouse-based windowing OS: </li></ul><ul><li>Click an icon; depending on the mouse pointer’s location, the OS can invoke a program, or select a file or a directory ( folder ). </li></ul>
  39. 39. 3.2 OS Services <ul><li>Program execution </li></ul><ul><li>I/O operation </li></ul><ul><li>File-system manipulation </li></ul><ul><li>Communications </li></ul><ul><li>Error detection </li></ul><ul><li>Resource allocation </li></ul><ul><li>Accounting </li></ul><ul><li>Protection </li></ul>
  40. 40. 3.3 System Calls <ul><li>System calls: the interface between a process and the OS </li></ul><ul><li>Traditionally available as assembly-language instructions. </li></ul><ul><li>Can also be invoked from a higher-level language program (C, C++ for UNIX; Java via C/C++ methods) </li></ul><ul><li>Ex. Copy one file to another: how to use system calls to perform this task? </li></ul><ul><li>Three common ways to pass parameters to the OS: registers, a block (table) in memory, the stack (push/pop). </li></ul>
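The file-copy exercise above can be sketched with the POSIX system calls open, read, write, and close. The function name copy_file, the 4 KB buffer size, and the 0644 file mode are arbitrary choices for illustration; the system calls themselves are the real UNIX interface.

```c
#include <fcntl.h>
#include <unistd.h>

/* Copy src to dst using only system calls. Returns 0 on success,
   -1 on any failure. Each open/read/write/close below crosses the
   user/kernel boundary via the system-call interface. */
int copy_file(const char *src, const char *dst) {
    char buf[4096];
    ssize_t n;
    int in_fd = open(src, O_RDONLY);                      /* open input file */
    if (in_fd < 0) return -1;
    int out_fd = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644); /* create output */
    if (out_fd < 0) { close(in_fd); return -1; }
    while ((n = read(in_fd, buf, sizeof buf)) > 0)        /* read a block */
        if (write(out_fd, buf, (size_t)n) != n) { n = -1; break; } /* write it */
    close(in_fd);
    close(out_fd);
    return n < 0 ? -1 : 0;                                /* n == 0 means EOF */
}
```

Note how many system calls even this simple task needs, and how each one can fail (file does not exist, no permission, disk full), which the program must check.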
  41. 41. System Calls <ul><li>Five major categories: </li></ul><ul><li>Process control </li></ul><ul><li>File manipulation </li></ul><ul><li>Device manipulation </li></ul><ul><li>Information maintenance </li></ul><ul><li>Communications </li></ul>
  42. 42. Process Control <ul><li>End, abort: </li></ul><ul><li>=> Halt the execution normally (end) or abnormally (abort) </li></ul><ul><li>=> Core dump file: debugger </li></ul><ul><li>=> Error level and possible recovery </li></ul><ul><li>Load, execute </li></ul><ul><li>=> When to load/execute? Where to return control after it’s done? </li></ul><ul><li>Create/terminate process </li></ul><ul><li>=> When? (wait time/event) </li></ul>
  43. 43. Process Control <ul><li>Get/set process attributes </li></ul><ul><li>=> Core dump file for debugging </li></ul><ul><li>=> A time profile of a program </li></ul><ul><li>Wait for time; wait event, signal event </li></ul><ul><li>Allocate and free memory </li></ul><ul><li>MS-DOS: a single-tasking system </li></ul><ul><li>Berkeley UNIX: a multitasking system (using fork to start a new process) </li></ul>
  44. 44. File Management <ul><li>Create/delete file </li></ul><ul><li>Open, close </li></ul><ul><li>Read, write, reposition (e.g., to the end of the file) </li></ul><ul><li>Get/set file attributes </li></ul>
  45. 45. Device Management <ul><li>Request/release device </li></ul><ul><li>Read, write, reposition </li></ul><ul><li>Get/set device attributes </li></ul><ul><li>Logically attach and detach devices </li></ul>
  46. 46. Information Maintenance <ul><li>Get/set time or date </li></ul><ul><li>Get/set system data (e.g., OS version, free memory space) </li></ul><ul><li>Get/set process, file, or device attributes (e.g., current users and processes) </li></ul>
  47. 47. Communications <ul><li>Create, delete communication connection: message-passing and shared-memory model </li></ul><ul><li>Send, receive messages: host name (IP name), process name </li></ul><ul><li>Daemons: source (client)<->connection<->the receiving daemon (server) </li></ul><ul><li>Transfer status information </li></ul><ul><li>Attach or detach remote devices </li></ul>
  48. 48. 3.4 System Programs <ul><li>System programs include file management, status information, file modification, programming-language support, program loading and execution, and communications. </li></ul><ul><li>The OS is supplied with system utilities or application programs (e.g., web browsers, compilers, word processors) </li></ul><ul><li>Command interpreter: the most important system program </li></ul><ul><li>=> contains code to execute the command </li></ul><ul><li>=> UNIX: command -> a file; load the file into memory and execute it </li></ul><ul><li>rm G => search for the file rm => load the file => execute it with the parameter G </li></ul>
  49. 49. 3.5 System Structure (Simple Structure) <ul><li>MS-DOS: application programs are able to access the basic I/O routines directly (the 8088 has no dual mode and no hardware protection) => errant programs may crash the entire system </li></ul><ul><li>UNIX: the kernel and the system programs. </li></ul><ul><li>System calls define the application programmer interface (API) to UNIX </li></ul>FIG3.6 FIG3.7
  50. 50. Layered Approach <ul><li>Layer 0 (the bottom): the hardware; layer N (the top): the user interface </li></ul><ul><li>The main advantage of the layered approach: modularity </li></ul><ul><li>Pro: simplifies design and implementation </li></ul><ul><li>Con: not easy to appropriately define the layers </li></ul><ul><li> less efficient </li></ul><ul><li>Windows NT: a highly layer-oriented organization => lower performance compared to Windows 95 => Windows NT 4.0 moved layers from user space to kernel space to improve performance </li></ul>
  51. 51. Microkernels <ul><li>Carnegie Mellon Univ. (1980s): Mach </li></ul><ul><li>Idea: remove all nonessential components from the kernel and implement them as system- and user-level programs. </li></ul><ul><li>Main function: the microkernel provides a communication facility (message passing) between the client program and the various services (running in user space) </li></ul><ul><li>Ease of extending the OS: new services are added in user space, with no changes to the kernel </li></ul>
  52. 52. Microkernels <ul><li>Easy to port; more security and reliability (most services run as user processes; if a service fails, the rest of the OS remains intact) </li></ul><ul><li>Digital UNIX </li></ul><ul><li>Apple MacOS Server OS </li></ul><ul><li>Windows NT: a hybrid structure </li></ul>FIG 3.10
  53. 53. Virtual Machines <ul><li>VM: IBM </li></ul><ul><li>Each process is provided with a (virtual) copy of the underlying computer </li></ul><ul><li>Major difficulty: disk systems => minidisks </li></ul><ul><li>Implementation: </li></ul><ul><li>Difficult to implement: switch between a virtual user and a virtual monitor mode </li></ul><ul><li>Less efficient in run time </li></ul>FIG 3.11
  54. 54. Virtual Machines <ul><li>Benefits: </li></ul><ul><li>The environment provides complete protection of the various system resources (but no direct sharing of resources) </li></ul><ul><li>A perfect vehicle for OS research and development </li></ul><ul><li>No system-development downtime is needed: system programmers can work on their own virtual machines to develop the system </li></ul><ul><li>MS-DOS (Intel) <=> UNIX (SUN) </li></ul><ul><li>Apple Macintosh (68000) <=> Mac (old 68000) </li></ul><ul><li>Java </li></ul>
  55. 55. Java <ul><li>Java: a technology rather than just a programming language (Sun, late 1995) </li></ul><ul><li>Three essential components: </li></ul><ul><li>=> Programming-language specification </li></ul><ul><li>=> Application-programming interface (API) </li></ul><ul><li>=> Virtual-machine specification </li></ul>
  56. 56. Java <ul><li>Programming language </li></ul><ul><li>Object-oriented, architecture-neutral, distributed, and multithreaded programming language </li></ul><ul><li>Applets: programs with limited resource access that run within a web browser </li></ul><ul><li>A secure language (runs on distributed networks) </li></ul><ul><li>Performs automatic garbage collection </li></ul>
  57. 57. Java <ul><li>API </li></ul><ul><li>Basic language: support for graphics, I/O, utilities and networking </li></ul><ul><li>Extended language: support for enterprise, commerce, security and media </li></ul><ul><li>Virtual machine </li></ul><ul><li>JVM: a class loader and a Java interpreter </li></ul><ul><li>Just-in-time compiler: turns the architecture-neutral bytecodes into native machine language for the host computer </li></ul>
  58. 58. Java <ul><li>The Java platforms: JVM and Java API => make it possible to develop programs that are architecture neutral and portable </li></ul><ul><li>Java development environment: a compile-time and a run-time environment </li></ul>
  59. 59. 3.8 System Design and Implementation <ul><li>Define the goals and specification </li></ul><ul><li>User goals (wish list) and system goals (implementation concerns) </li></ul><ul><li>The separation of policy (what should be done) and mechanism (how to do it) </li></ul><ul><li>Microkernel: implementing a basic set of policy-free primitive building blocks </li></ul><ul><li>Traditionally, the OS was implemented in assembly language (better performance, but portability is the problem) </li></ul>
  60. 60. System Design and Implementation <ul><li>High-level language implementation </li></ul><ul><li>Easier porting but slower speed and more storage </li></ul><ul><li>Needs better data structures and algorithms </li></ul><ul><li>MULTICS (ALGOL); UNIX, OS/2, Windows (C) </li></ul><ul><li>Noncritical code (HLL); critical code (assembly language) </li></ul><ul><li>System generation (SYSGEN): creating an OS for a particular machine configuration (e.g., CPU? Memory? Devices? Options?) </li></ul>
  61. 61. Part II: Process Management Ch. 4 Processes 4.1 Process Concept <ul><li>Process (job) is a program in execution </li></ul><ul><li>Ex. For a single-user system (PC), the user can run multiple processes (jobs), such as web, word-processor, and CD-player, simultaneously </li></ul><ul><li>Two processes may be associated with the same program. Ex. You can invoke an editor twice to edit two files (two processes) simultaneously </li></ul>
  62. 62. Process Concept <ul><li>Process state: </li></ul><ul><li>Each process may be in one of the 5 states: new, running, waiting, ready, and terminated </li></ul>(Figure: process state diagram — admitted: new -> ready; scheduler dispatch: ready -> running; interrupt: running -> ready; I/O or event wait: running -> waiting; I/O or event completion: waiting -> ready; exit: running -> terminated)
  63. 63. Process Concept <ul><li>Process Control Block (PCB): represents a process </li></ul><ul><li>Process state: new, ready, running, waiting, or terminated </li></ul><ul><li>Program counter: points to the next instruction to be executed for the process </li></ul><ul><li>CPU registers: when an interrupt occurs, this data must be saved to allow the process to be continued correctly afterwards </li></ul><ul><li>CPU-scheduling information: process priority (Ch.6) </li></ul><ul><li>Memory-management information: the values of the base and limit registers, the page tables... </li></ul>FIG 4.2
  64. 64. Process Concept <ul><li>Accounting information: account number, process number, time limits… </li></ul><ul><li>I/O status information: a list of I/O devices allocated to the process, a list of open files…. </li></ul><ul><li>Threads </li></ul><ul><li>Single thread: a process executes with one flow of control </li></ul><ul><li>Multi-thread: a process executes with multiple flows of control (e.g., in an editor, a process can run typing and spelling check at the same time) </li></ul>FIG 4.3
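The PCB fields listed on the last two slides can be pictured as a C struct. This is a hypothetical layout for illustration only — field names, sizes, and the fixed-size arrays are made up, not the PCB of any real kernel:

```c
#include <stdint.h>

/* The five process states from the state diagram. */
enum proc_state { P_NEW, P_READY, P_RUNNING, P_WAITING, P_TERMINATED };

/* A sketch of a Process Control Block: one field per category of
   information the OS keeps about a process. */
struct pcb {
    int             pid;             /* process identifier */
    enum proc_state state;           /* current process state */
    uint64_t        program_counter; /* next instruction to execute */
    uint64_t        registers[16];   /* saved CPU registers for resumption */
    int             priority;        /* CPU-scheduling information (Ch. 6) */
    uint64_t        base, limit;     /* memory-management registers */
    long            cpu_time_used;   /* accounting information */
    int             open_files[16];  /* I/O status: open file descriptors */
};
```

On a context switch, the kernel saves the running process's registers and program counter into its PCB and reloads those of the next process from its PCB.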
  65. 65. 4.2 Process Scheduling <ul><li>The objective of multiprogramming: maximize the CPU utilization (keep the CPU running all the time) </li></ul><ul><li>Scheduling queues </li></ul><ul><li>Ready queue (usually a linked list): the processes that are in the main memory and ready to be executed </li></ul><ul><li>Device queue: the list of processes waiting for a particular IO device </li></ul>FIG 4.4
  66. 66. Process Scheduling <ul><li>Queueing diagram </li></ul>(Figure: queueing diagram — processes move from the ready queue to the CPU; on an I/O request they wait in an I/O queue until the I/O completes; they return to the ready queue when the time slice expires, when a forked child executes, or when a waited-for interrupt occurs)
  67. 67. Process Scheduling <ul><li>Scheduler </li></ul><ul><li>Long-term scheduler (job scheduler): selects processes from a pool and loads them into main memory for execution (runs less frequently, so it has more time to make a careful selection decision) </li></ul><ul><li>Short-term scheduler (CPU scheduler): selects among ready processes for execution (runs more frequently and must be fast) </li></ul><ul><li>The long-term scheduler controls the degree of multiprogramming (the # of processes in memory) </li></ul>
  68. 68. Process Scheduling <ul><li>I/O-bound process </li></ul><ul><li>CPU-bound process </li></ul><ul><li>If all processes are I/O-bound => the ready queue will almost always be empty => the short-term scheduler has nothing to do </li></ul><ul><li>If all processes are CPU-bound => the I/O waiting queues will almost always be empty => devices will go unused </li></ul><ul><li>Balanced system performance = a good mix of I/O-bound and CPU-bound processes </li></ul>
  69. 69. Process Scheduling <ul><li>The medium-term scheduler: uses swapping to improve the process mix </li></ul><ul><li>Context switching: switching the CPU to a new process => saving the state of the suspended process AND loading the saved state of the new process </li></ul><ul><li>Context-switching time is pure overhead and depends heavily on hardware support </li></ul>FIG 4.6
  70. 70. 4.3 Operations on Processes <ul><li>Process creation </li></ul><ul><li>A process may create several new processes: parent process => children processes (a tree) </li></ul><ul><li>Subprocesses may obtain resources from their parent (which may overload the parent) or from the OS </li></ul><ul><li>When a process creates a new one, the execution options are: </li></ul><ul><li>1. The parent and the child run concurrently </li></ul><ul><li>2. The parent waits until some or all of its children have terminated </li></ul>
  71. 71. 4.3 Operations on Processes <ul><li>In terms of the address space of the new process </li></ul><ul><li>1. The child process is a duplicate of the parent process </li></ul><ul><li>2. The child process has a program loaded into it </li></ul><ul><li>In UNIX, each process has a process identifier. “fork” system call to create a new process (it consists of a copy of the address space of the original process) Advantage? Easy communication between the parent and children processes. </li></ul>
  72. 72. 4.3 Operations on Processes <ul><li>“ execlp” system call (after “fork”): replaces the process’ memory space with a new program </li></ul>pid_t pid = fork(); if (pid < 0) { /* fork failed */ exit(1); } else if (pid == 0) { execlp("/bin/ls", "ls", NULL); /* child: overlay with UNIX "ls" */ } else { wait(NULL); /* parent: wait for the child to complete */ printf("Child Complete"); exit(0); }
  73. 73. 4.3 Operations on Processes <ul><li>Process termination </li></ul><ul><li>“ exit”: system call to terminate a process </li></ul><ul><li>Cascading termination: when a process terminates, all its children must also be terminated </li></ul>
  74. 74. 4.4 Cooperating Processes <ul><li>Independent and cooperating processes </li></ul><ul><li>Any process that shares data with other processes is a cooperating process </li></ul><ul><li>WHY is process cooperation needed? </li></ul><ul><li>Information sharing </li></ul><ul><li>Computation speedup (e.g., parallel execution of CPU and I/O) </li></ul><ul><li>Modularity: dividing the system functions into separate processes </li></ul>
  75. 75. 4.4 Cooperating Processes <ul><li>Convenience: for a single-user, many tasks can be executed at the same time </li></ul><ul><li>Producer-consumer </li></ul><ul><li>Unbounded/bounded-buffer </li></ul><ul><li>The shared buffer: implemented as a circular array </li></ul>
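The circular-array shared buffer mentioned above can be sketched as follows. BUFFER_SIZE, produce, and consume are illustrative names; following the usual bounded-buffer convention, one slot is left empty to distinguish "full" from "empty", and synchronization between producer and consumer is deliberately omitted (that is the subject of Ch. 7).

```c
#include <stdbool.h>

#define BUFFER_SIZE 8

/* Shared circular buffer for the bounded-buffer producer-consumer
   problem. in == out means empty; (in + 1) % BUFFER_SIZE == out means
   full, so at most BUFFER_SIZE - 1 items are ever stored. */
static int buffer[BUFFER_SIZE];
static int in = 0;   /* next free slot (written by the producer) */
static int out = 0;  /* next full slot (read by the consumer) */

/* Producer: returns false if the buffer is full. */
bool produce(int item) {
    if ((in + 1) % BUFFER_SIZE == out) return false;  /* buffer full */
    buffer[in] = item;
    in = (in + 1) % BUFFER_SIZE;
    return true;
}

/* Consumer: returns false if the buffer is empty. */
bool consume(int *item) {
    if (in == out) return false;                      /* buffer empty */
    *item = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    return true;
}
```

With two concurrent processes (or threads) calling produce and consume, the unprotected updates to in and out would race; fixing that requires the synchronization mechanisms of Ch. 7.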
  76. 76. 4.5 Interprocess Communication (IPC) <ul><li>Message-passing system </li></ul><ul><li>“ send” and “receive” </li></ul><ul><li>Fixed or variable-sized messages </li></ul><ul><li>Communication link </li></ul><ul><li>Direct/indirect communication </li></ul><ul><li>Symmetric/asymmetric communication </li></ul><ul><li>Automatic or explicit buffering </li></ul><ul><li>Send by copy or by reference </li></ul>
  77. 77. 4.5 Interprocess Communication (IPC) <ul><li>Naming </li></ul><ul><li>Direct communication (a link between two processes) </li></ul><ul><li>symmetric addressing: send(p, message), receive(q, message): the sender and recipient name each other explicitly </li></ul><ul><li>asymmetric addressing: send(p, message), receive(id, message): the variable id is set to the name of the sender </li></ul><ul><li>Disadvantage: limited modularity of process definitions (all occurrences of an old name must be found before it can be changed; not suitable for separate compilation) </li></ul>
  78. 78. 4.5 Interprocess Communication (IPC) <ul><li>Indirect communication </li></ul><ul><li>using mailboxes or ports </li></ul><ul><li>Supports links among multiple processes </li></ul><ul><li>A mailbox may be owned by a process (when the process terminates, the mailbox disappears), or </li></ul><ul><li>if the mailbox is owned by the OS, the OS must allow a process to: create a new mailbox, send/receive messages via the mailbox, and delete the mailbox </li></ul>
  79. 79. 4.5 Interprocess Communication (IPC) <ul><li>Synchronization </li></ul><ul><li>Blocking/nonblocking send and receive </li></ul><ul><li>Blocking (synchronous), nonblocking (asynchronous) </li></ul><ul><li>A rendezvous between the sender and receiver occurs when both send and receive are blocking </li></ul><ul><li>Buffering </li></ul><ul><li>Zero/bounded/unbounded capacity </li></ul>
  80. 80. Mach <ul><li>Message based: using ports </li></ul><ul><li>When a task is created, two mailboxes are created: the Kernel port (kernel communication) and the Notify port (notification of event occurrences) </li></ul><ul><li>Three system calls are needed for message transfer: msg_send, msg_receive, and msg_rpc (Remote Procedure Call) </li></ul><ul><li>Mailbox: initially an empty queue; FIFO order </li></ul><ul><li>Message: fixed-length header, variable-length data </li></ul>
  81. 81. Mach <ul><li>If the mailbox is full, the sender has 4 options: </li></ul><ul><li>1. Wait indefinitely until there is room </li></ul><ul><li>2. Wait at most N ms </li></ul><ul><li>3. Do not wait; return immediately </li></ul><ul><li>4. Temporarily cache the message </li></ul><ul><li>The receiver must specify the mailbox or a mailbox set </li></ul><ul><li>Mach was designed for distributed systems </li></ul>
  82. 82. Windows NT <ul><li>Employs modularity to increase functionality and decrease the implementation time for adding new features </li></ul><ul><li>NT supports multiple OS subsystems via message passing (called the local procedure-call facility (LPC)) </li></ul><ul><li>Uses ports for communication: connection port (by the client) and communication port (by the server) </li></ul><ul><li>3 types of message-passing techniques: </li></ul><ul><li>1. 256-byte queue </li></ul><ul><li>2. Large messages via shared memory </li></ul><ul><li>3. Quick LPC (64K) </li></ul>
  83. 83. 4.6 Communication in Client-Server Systems <ul><li>Socket: made up of an IP address concatenated with a port number </li></ul><ul><li>Remote procedure calls (RPC) </li></ul>
84. 84. Ch. 5 Threads 5.1 Overview <ul><li>A lightweight process: a basic unit of CPU utilization </li></ul><ul><li>A heavyweight process: a single thread of control </li></ul><ul><li>Multithreading is common practice: e.g., a Web browser has one thread displaying text/images and another retrieving data from the network </li></ul><ul><li>When a single application needs to perform several similar tasks (e.g., a web server accepting many clients’ requests), using threads is more efficient than using processes. </li></ul>FIG 5.1
85. 85. Benefits <ul><li>4 main benefits: </li></ul><ul><li>Responsiveness: allows a program to continue running even if part of it is blocked or performing a lengthy operation </li></ul><ul><li>Resource sharing: memory and code </li></ul><ul><li>Economy: allocating memory and resources for a process is more expensive (in Solaris, creating a process is about 30 times slower than creating a thread, and context switching is about 5 times slower) </li></ul><ul><li>Utilization of multiprocessor architectures (on a single processor, threads run one at a time) </li></ul>
86. 86. User and Kernel Threads <ul><li>User threads </li></ul><ul><li>Implemented by a thread library at the user level that supports thread creation, scheduling, and management without kernel support </li></ul><ul><li>Advantage: fast </li></ul><ul><li>Disadvantage: if the kernel is single-threaded, a blocking system call made by any user-level thread blocks the entire process </li></ul><ul><li>POSIX Pthreads , Mach C-threads , Solaris threads </li></ul>
87. 87. User and Kernel Threads <ul><li>Kernel threads </li></ul><ul><li>Supported by the OS </li></ul><ul><li>Slower to create and manage than user threads </li></ul><ul><li>If a thread performs a blocking system call, the kernel can schedule another thread in the application for execution </li></ul><ul><li>Windows NT, Solaris, Digital UNIX </li></ul>
88. 88. 5.2 Multithreading Models <ul><li>Many-to-one model: many user-level threads map to one kernel thread </li></ul><ul><li>only one user thread can access the kernel at a time => threads cannot run in parallel on multiprocessors </li></ul><ul><li>One-to-one model </li></ul><ul><li>More concurrency (allowing parallel execution) </li></ul><ul><li>Overhead: creating one kernel thread for each user thread </li></ul><ul><li>Many-to-many </li></ul><ul><li>The number of kernel threads can be tailored to a particular application or machine </li></ul><ul><li>It avoids the drawbacks of the other two models </li></ul>
89. 89. 5.3 Threading Issues <ul><li>The fork and exec system calls </li></ul><ul><li>Cancellation: asynchronous and deferred </li></ul><ul><li>Signal handling: default and user-defined </li></ul><ul><li>Thread pools </li></ul><ul><li>Thread-specific data </li></ul>
  90. 90. 5.4 Pthreads <ul><li>POSIX standard (IEEE 1003.1c): an API for thread creation and synchronization </li></ul><ul><li>A specification for thread behavior not an implementation </li></ul>
91. 91. 5.5 Solaris Threads <ul><li>Until 1992, Solaris supported only a single thread of control </li></ul><ul><li>Now, it supports kernel/user-level threads, symmetric multiprocessing, and real-time scheduling </li></ul><ul><li>Intermediate level of threads: user-level <=> lightweight processes (LWPs) <=> kernel-level </li></ul><ul><li>Many-to-many model </li></ul><ul><li>User-level threads: bound (permanently attached to an LWP) or unbound (multiplexed onto the pool of available LWPs) </li></ul>FIG 5.6
  92. 92. Solaris Threads <ul><li>Each LWP is connected to one kernel-level thread, whereas each user-level thread is independent of the kernel </li></ul>
93. 93. 5.6-8 Other Threads <ul><li>Windows 2000 </li></ul><ul><li>Linux </li></ul><ul><li>Java </li></ul>
94. 94. Ch. 6 CPU Scheduling 6.1 Basic Concepts <ul><li>The objective of multiprogramming: maximize CPU utilization </li></ul><ul><li>Scheduling: central to OS design </li></ul><ul><li>CPU-IO burst cycle: an IO-bound program has many short CPU bursts; a CPU-bound program has a few very long CPU bursts </li></ul><ul><li>CPU scheduler: the short-term scheduler </li></ul><ul><li>Queue: may be implemented as a FIFO queue, a priority queue, a tree, or a linked list </li></ul><ul><li>Preemptive scheduling </li></ul><ul><li>CPU scheduling decisions depend on: </li></ul>
95. 95. Basic Concepts <ul><li>1. A process switches from running to waiting state </li></ul><ul><li>2. A process switches from running to ready state </li></ul><ul><li>3. A process switches from waiting to ready state </li></ul><ul><li>4. A process terminates </li></ul><ul><li>When 1 or 4 occurs, a new process must be selected for execution, but not necessarily for 2 and 3 </li></ul><ul><li>A scheduling scheme that acts only on 1 and 4 is called nonpreemptive or cooperative (once the CPU is allocated to a process, the process keeps the CPU until it terminates or moves to the waiting state) </li></ul>
96. 96. Basic Concepts <ul><li>A preemptive scheduling scheme must consider how to swap process execution while maintaining correct execution (context switching) </li></ul><ul><li>Dispatcher: gives control of the CPU to the newly selected process </li></ul><ul><li>Switching context </li></ul><ul><li>Switching to user mode </li></ul><ul><li>Jumping to the proper location in the user program and restarting it </li></ul><ul><li>Dispatch latency: the time it takes to stop one process and start another </li></ul>
97. 97. 6.2 Scheduling Criteria <ul><li>CPU utilization </li></ul><ul><li>Throughput: the # of processes completed per unit time </li></ul><ul><li>Turnaround time: from submission of a process to its completion </li></ul><ul><li>Waiting time: the sum of the periods spent waiting in the ready queue </li></ul><ul><li>Response time: interactive systems (minimizing the variance of the response time is more important than minimizing the average response time) </li></ul>
98. 98. 6.3 Scheduling Algorithms <ul><li>Comparing the average waiting time </li></ul><ul><li>FCFS (first-come, first-served) </li></ul><ul><li>Convoy effect: all other processes wait for one big process to get off the CPU </li></ul><ul><li>The FCFS scheduling algorithm is nonpreemptive </li></ul>
99. 99. Scheduling Algorithms <ul><li>SJF (shortest-job-first) scheduling </li></ul><ul><li>Provably optimal </li></ul><ul><li>Difficulty: how do we know the length of the next CPU burst? </li></ul><ul><li>Used frequently in long-term scheduling </li></ul>
  100. 100. Scheduling Algorithms <ul><li>Predict: exponential average </li></ul><ul><li>Preemptive SJF: shortest-remaining-time-first </li></ul>
101. 101. Scheduling Algorithms <ul><li>Priority scheduling </li></ul><ul><li>Priorities can be defined internally (some measure such as time limits or memory size) or externally (specified by the users) </li></ul><ul><li>Either preemptive or nonpreemptive </li></ul><ul><li>Problem: starvation (a low-priority process may never execute) </li></ul><ul><li>Solution: aging (gradually increase the priority of processes that wait a long time) </li></ul>
102. 102. Scheduling Algorithms <ul><li>Round-robin (RR) scheduling </li></ul><ul><li>Suitable for time-sharing systems </li></ul><ul><li>Time quantum: the ready queue is treated as a circular queue of processes </li></ul><ul><li>The average waiting time is often long </li></ul><ul><li>The RR scheduling algorithm is preemptive </li></ul>
103. 103. Scheduling Algorithms <ul><li>Performance depends on the size of the time quantum: extremely large (degenerates to FCFS), extremely small (processor sharing) </li></ul><ul><li>Rule of thumb: 80% of CPU bursts should be shorter than the time quantum </li></ul><ul><li>Performance also depends on the context-switch effect: the time quantum should be large relative to the context-switch time </li></ul><ul><li>Turnaround time also depends on the size of the time quantum </li></ul>
  104. 104. Scheduling Algorithms <ul><li>Multilevel queue scheduling </li></ul><ul><li>Priority: foreground (interactive) processes > background (batch) processes </li></ul><ul><li>Partitions the ready queue into several separate queues </li></ul><ul><li>The processes are permanently assigned to a queue based on some properties of the process (e.g., process type, memory size…) </li></ul><ul><li>Each queue has its own scheduling algorithm </li></ul><ul><li>Scheduling between queues: 1) fixed-priority preemptive scheduling, 2) time slices between queues </li></ul>FIG 6.6
  105. 105. Scheduling Algorithms <ul><li>Multilevel feedback-queue scheduling </li></ul><ul><li>Allow a process to move between queues </li></ul><ul><li>The idea is to separate processes with different CPU-burst characteristics (e.g., move the process using too much CPU to a lower-priority) </li></ul><ul><li>What are considerations for such decisions? </li></ul>
106. 106. 6.4 Multiple-Processor Scheduling <ul><li>Homogeneous: all processors are identical </li></ul><ul><li>Load sharing among processors </li></ul><ul><li>Symmetric multiprocessing (SMP): each processor is self-scheduling; it examines a common ready queue and selects a process to execute (what are the main concerns?) </li></ul><ul><li>Asymmetric multiprocessing: a master server handles all scheduling decisions </li></ul>
107. 107. 6.5 Real-Time Scheduling <ul><li>Hard real-time: resource reservation (impossible when using secondary storage or virtual memory) </li></ul><ul><li>It requires special-purpose software running on hardware dedicated to the critical process to satisfy the hard real-time constraints </li></ul><ul><li>Soft real-time: guarantees that critical processes have higher priority </li></ul><ul><li>The system must have priority scheduling; real-time processes must have the highest priority, and their priority must not degrade over time </li></ul><ul><li>The dispatch latency must be short. HOW? </li></ul>
108. 108. Real-Time Scheduling <ul><li>Preemption points in long-duration system calls </li></ul><ul><li>Making the entire kernel preemptible </li></ul><ul><li>What if a high-priority process needs to read/modify kernel data that is currently in use by a low-priority process? (Priority inversion) </li></ul><ul><li>Priority-inheritance protocol: the processes accessing resources that the high-priority process needs inherit the high priority and continue running until they are finished with those resources </li></ul>
109. 109. 6.6 Algorithm Evaluation <ul><li>Deterministic modeling: analytic evaluation (given predetermined workloads, define the performance of each algorithm on them) </li></ul><ul><li>Queueing models: mathematical analysis of limited applicability </li></ul><ul><li>Simulations: driven by a random-number generator; may be inaccurate due to the assumed distribution (defined empirically or mathematically). Solution: trace tapes (monitoring the real system) </li></ul><ul><li>Implementation: most accurate but with high cost. </li></ul>
110. 110. Ch. 7 Process Synchronization 7.1 Background <ul><li>Why? </li></ul><ul><li>Threads: share a logical address space </li></ul><ul><li>Processes: share data and code </li></ul><ul><li>They must wait in line until their turns come </li></ul><ul><li>Race condition </li></ul>
  111. 111. 7.2 Critical-Section Problem <ul><li>Critical section: a thread has a segment of code in which the thread may change the common data </li></ul><ul><li>A solution to the critical-section problem must satisfy: </li></ul><ul><li>Mutual exclusion </li></ul><ul><li>Progress </li></ul><ul><li>Bounded waiting </li></ul>
112. 112. Two-Tasks Solutions Alg 1: using a “turn” What’s the problem? What if turn=0 while T0 is in its noncritical section and T1 needs to enter the critical section? The progress requirement is violated. HOW? (Flowchart: T0 spins while turn != 0, enters the CS, then sets turn=1; T1 is symmetric, spinning while turn != 1 and setting turn=0 on exit.)
113. 113. Two-Tasks Solutions Alg 1: using a “turn” and “yield()” What’s the problem? It does not retain sufficient information about the state of each thread (it records only which thread is allowed to enter the CS). How can this be solved? (Flowchart: each thread spins on turn; a thread that does not need the CS yields the turn to the other.)
114. 114. Two-Tasks Solutions Alg 2: using an array to replace “turn”; a1=1 indicates that T1 is ready to enter the CS. Is mutual exclusion satisfied? Yes. Is progress satisfied? No. What if both T0 and T1 set their flags a0 and a1 to 1 at the same time? Both loop forever! (Flowchart: Ti sets ai=1, spins while the other flag is 1, enters the CS, then sets ai=0.)
115. 115. Two-Tasks Solutions Alg 3 (Peterson’s algorithm): satisfies all three requirements. (Flowchart: T0 sets a0=1 and turn=1, spins while a1=1 && turn=1, enters the CS, then sets a0=0; T1 sets a1=1 and turn=0, spins while a0=1 && turn=0, enters the CS, then sets a1=0.)
116. 116. 7.3 Synchronization Hardware <ul><li>Test-and-set: an indivisible (atomic) instruction. If two test-and-set instructions are executed simultaneously, they are executed sequentially in some arbitrary order (flag and turn) </li></ul><ul><li>Swap instruction (yield()) </li></ul>
117. 117. 7.4 Semaphores <ul><li>A general method for binary or multi-party synchronization </li></ul><ul><li>Two operations, P (test) and V (increment), which must be executed indivisibly </li></ul><ul><li>P(S) { while (S <= 0) ; S--; } </li></ul><ul><li>V(S) { S++; } </li></ul><ul><li>Binary semaphore: 0 and 1 </li></ul><ul><li>Counting semaphore: resource allocation </li></ul>
118. 118. Semaphores <ul><li>Busy waiting: wastes CPU cycles </li></ul><ul><li>Spinlock (spinning semaphore): no context switch is required while a process waits on the lock </li></ul><ul><li>One solution: when a process executes P and the semaphore value becomes negative, it blocks itself rather than busy-waiting </li></ul><ul><li>Wakeup operation: wait state => ready state </li></ul><ul><li>P(S) { value--; if (value < 0) { add this process to the list; block(); } } </li></ul><ul><li>V(S) { value++; if (value <= 0) { remove a process P from the list; wakeup(P); } } </li></ul>
  119. 119. Semaphores <ul><li>If the semaphore value is negative, the value indicates the # of processes waiting on the semaphore </li></ul><ul><li>The waiting can be implemented by: linked list, a FIFO queue (ensure bounded waiting), or??? </li></ul><ul><li>The semaphore should be treated as a critical section: </li></ul><ul><li>1. Uniprocessor: inhibited interrupt </li></ul><ul><li>2. Multiprocessor: alg 3 (SW) or hardware instructions </li></ul>
  120. 120. Semaphores <ul><li>Deadlock </li></ul><ul><li>Indefinite blocking or starvation </li></ul>P0 p(s) p(q) . . v(s) v(q) P1 p(q) p(s) . . v(q) v(s) Wait for v(s) from P0 Wait for v(q) from P1 Deadlock
  121. 121. 7.5 Classical Synchronization Problems <ul><li>The bounded-buffer problem </li></ul><ul><li>The readers-writers problem: read-write conflict in database </li></ul><ul><li>The dining-philosophers problem </li></ul>Homework exercises!!!
122. 122. 7.6 Critical Regions <ul><li>What if a programmer writes signal(mutex); .. CS .. wait(mutex)? </li></ul><ul><li>Or wait(mutex); .. CS .. wait(mutex)? </li></ul><ul><li>v: shared T </li></ul><ul><li>region v when B do S; => while statement S is being executed, no other process can access the variable v </li></ul>
123. 123. 7.7 Monitors <ul><li>Programming mistakes can defeat semaphores </li></ul><ul><li>mutex.V(); </li></ul><ul><li>criticalsection(); ==> several processes may be executing in their CSs </li></ul><ul><li>mutex.P(); simultaneously! </li></ul><ul><li>mutex.P(); </li></ul><ul><li>CS(); ==> a deadlock will occur </li></ul><ul><li>mutex.P(); </li></ul><ul><li>If a process omits P(), V(), or both, mutual exclusion is violated or a deadlock occurs </li></ul>
124. 124. Monitors <ul><li>A monitor: a set of programmer-defined operations that are provided with mutual exclusion within the monitor (the monitor construct prohibits concurrent access to all procedures defined within the monitor) </li></ul><ul><li>Condition variables: x.wait and x.signal </li></ul><ul><li>Signal-and-wait: P waits until Q leaves the monitor or waits on another condition </li></ul><ul><li>Signal-and-continue: Q waits until P leaves the monitor or waits on another condition </li></ul>P x.signal Q(suspended) associated with condition x resume
  125. 125. Ch. 8 Deadlocks 8.1 System Model <ul><li>Resources: types (e.g., printers, memory), instances (e.g., 5 printers) </li></ul><ul><li>A process: must request a resource before using it and must release it after using it (i.e., request => use => release) </li></ul><ul><li>request/release device, open/close file, allocate/free memory </li></ul><ul><li>What cause deadlock? </li></ul>
  126. 126. 8.2 Deadlock Characterization <ul><li>Necessary conditions: </li></ul><ul><li>1. Mutual exclusion </li></ul><ul><li>2. Hold-and-wait </li></ul><ul><li>3. No preemption </li></ul><ul><li>4. Circular wait </li></ul><ul><li>Resource-allocation graph </li></ul><ul><li>Request edge: P->R </li></ul><ul><li>Assignment edge: R->P </li></ul>R2 R3 R1 P1 P3 P2
  127. 127. Deadlock Characterization <ul><li>If each resource has only one instance, then a cycle implies that a deadlock has occurred </li></ul><ul><li>If each resource has several instances, a cycle may not imply a deadlock (a cycle is a necessary but not a sufficient condition) </li></ul>P1->R1->P2->R3->P3->R2->P1 P1, P2, P3 deadlock P1->R1->P3->R2->P1 No deadlock, why? R2 R1 P1 P3 P2 P4 R2 R3 R1 P1 P3 P2
128. 128. 8.3 Methods for Handling Deadlocks <ul><li>Deadlock prevention </li></ul><ul><li>Deadlock avoidance (deadlock detection) </li></ul><ul><li>Deadlock recovery </li></ul><ul><li>Do nothing: UNIX, JVM (left to the programmer) </li></ul><ul><li>Deadlocks occur very infrequently (once a year?). It’s cheaper to do nothing than to implement deadlock prevention, avoidance, or recovery </li></ul>
129. 129. 8.4 Deadlock Prevention <ul><li>Make sure the four conditions cannot hold simultaneously </li></ul><ul><li>Mutual exclusion: must hold for nonsharable resources </li></ul><ul><li>Hold-and-wait: guarantee that when a process requests a resource, it holds no other resources (low resource utilization; starvation is possible) </li></ul><ul><li>No preemption: preempt the resources of a process that requests a resource it cannot immediately obtain </li></ul><ul><li>Circular wait: impose a total ordering on all resource types, and require processes to request resources in increasing order. WHY??? </li></ul>
130. 130. 8.5 Deadlock Avoidance <ul><li>Claim edge: each process declares the number of resources it may need before requesting them </li></ul><ul><li>The OS grants resources to a requesting process only IF doing so leaves the system in a safe state (no potential deadlock) </li></ul>Unsafe if assign R2->P2: a cycle R1 R2 P1 P2 Claim edge R1 R2 P1 P2
  131. 131. 8.6 Deadlock Detection <ul><li>Wait-for-graph </li></ul><ul><li>Detect a cycle: O(n^2) => expensive </li></ul>R2 R3 R1 P1 P3 P2 P1 P3 P2
  132. 132. 8.7 Recovery from Deadlock <ul><li>Process termination: </li></ul><ul><li>Abort all deadlocked processes (a great expense) </li></ul><ul><li>Abort one process at a time until the deadlock cycle is eliminated </li></ul><ul><li>Resource preemption </li></ul><ul><li>Selection of a victim </li></ul><ul><li>Rollback </li></ul><ul><li>Starvation </li></ul>
  133. 133. Ch. 9 Memory Management 9.1 Background <ul><li>Address binding: map logical address to physical address </li></ul><ul><li>Compile time </li></ul><ul><li>Load time </li></ul><ul><li>Execution time </li></ul>FIG 9.1
  134. 134. Background <ul><li>Virtual address: logical address space </li></ul><ul><li>Memory-management unit (MMU): a hardware unit to perform run-time mapping from virtual to physical addresses </li></ul><ul><li>Relocation register -- FIG 9.2 </li></ul><ul><li>Dynamic loading: a routine is not loaded until it is called (efficient memory usage) </li></ul><ul><li>Static linking and dynamic linking (shared libraries) </li></ul>
  135. 135. Background <ul><li>Overlays: keep in memory only the instructions and data that are needed at any given time </li></ul><ul><li>Assume 1) only 150k memory 2) pass1 and pass2 don’t need to be in the memory at the same time </li></ul><ul><li>1. Pass1: 70k </li></ul><ul><li>2. Pass2: 80k </li></ul><ul><li>3. Symbol table: 20k </li></ul><ul><li>4. Common routines: 30k </li></ul><ul><li>5. Overlay driver: 10k </li></ul><ul><li>1+2+3+4+5=210k > 150k </li></ul><ul><li>Overlay1: 1+3+4+5=130k; overlay2: 2+3+4+5=140k < 150k </li></ul><ul><li>( FIG9.3 ) </li></ul>
136. 136. 9.2 Swapping <ul><li>Swapping: memory <=> backing store (fast disks) ( FIG9.4 ) </li></ul><ul><li>The main part of swap time is transfer time, which is proportional to the amount of memory swapped (1M ~ 200ms) </li></ul><ul><li>Constraint on swapping: the process must be completely idle, with no pending IO in particular </li></ul><ul><li>Because swap times are so long, standard swapping is used in few systems </li></ul>
137. 137. 9.3 Contiguous Memory Allocation <ul><li>Memory: 2 partitions: system (OS) and users’ processes </li></ul><ul><li>Memory protection: protecting the OS from user processes, and user processes from one another ( FIG9.5 ) </li></ul><ul><li>Simplest method: divide the memory into a number of fixed-sized partitions. The OS keeps a table indicating which parts of memory are available and which are occupied </li></ul><ul><li>Dynamic storage allocation: first fit (generally fast), best fit, and worst fit </li></ul>
  138. 138. Contiguous Memory Allocation <ul><li>External fragmentation: statistical analysis on first fit shows that given N blocks, 0.5N blocks will be lost due to fragmentation (50-percent rule) </li></ul><ul><li>Internal fragmentation: unused space within the partition </li></ul><ul><li>Compaction: one way to solve external fragmentation but only possible if relocation is dynamic (WHY?) </li></ul><ul><li>Other methods: paging and segmentation </li></ul>
139. 139. 9.4 Paging <ul><li>Paging: permits the logical address space of a process to be noncontiguous </li></ul><ul><li>Frames: divide the physical memory into fixed-sized blocks </li></ul><ul><li>Pages: divide the logical memory into fixed-sized blocks </li></ul><ul><li>Address = page number + page offset: the page number is an index into a page table </li></ul><ul><li>The page and frame sizes are determined by hardware. </li></ul><ul><li>FIG9.6, FIG9.7, FIG9.8 </li></ul>
  140. 140. Paging <ul><li>No external fragmentation but internal fragmentation still exists </li></ul><ul><li>To reduce internal fragmentation: small-sized page but increase the overhead of page table entry </li></ul><ul><li>What about on-the-fly page-size support? </li></ul><ul><li>With page: user <=> address-translation hardware <=> actual physical memory </li></ul><ul><li>Frame table: OS needs to know the allocation details of the physical memory ( FIG9.9 ) </li></ul>
141. 141. Paging <ul><li>Structure of the page table </li></ul><ul><li>Registers: fast but expensive: suitable for small tables (256 entries) </li></ul><ul><li>Page-table base register (PTBR): points to the page table (which resides in main memory): suitable for large tables (1M entries) but needs two memory accesses to reach a byte </li></ul><ul><li>Using associative registers or translation look-aside buffers (TLBs) to speed up translation </li></ul><ul><li>Hit ratio: effective memory-access time ( FIG9.10 ) </li></ul>
  142. 142. Paging <ul><li>Protection </li></ul><ul><li>Protection bits: one bit to indicate a page to be read and write or read only </li></ul><ul><li>Valid-invalid bit: indicates whether the page is in the process’s logical address space FIG9.11 </li></ul><ul><li>Page-table length register (PTLR) to indicate the size of the page table: a process usually only uses a small fraction of the address space available to it </li></ul>
143. 143. Paging <ul><li>Multilevel paging </li></ul><ul><li>Supporting large logical address spaces </li></ul><ul><li>The page table may be extremely large (32-bit addresses with a 4k (2^12) page size give a 1M-entry (2^20) page table; at 4 bytes per entry => 4 Mbytes) </li></ul><ul><li>FIG9.12, FIG9.13 </li></ul><ul><li>How does multilevel paging affect system performance? 4-level paging = 4 memory accesses per translation </li></ul>
144. 144. Paging <ul><li>Hashed page tables </li></ul><ul><li>a hash table handles the page-table lookup Fig. 9.14 </li></ul><ul><li>Clustered page tables: useful for sparse address spaces </li></ul>
145. 145. Paging <ul><li>Inverted page table </li></ul><ul><li>A page-table entry per virtual page => millions of entries => consumes a large amount of physical memory </li></ul><ul><li>Inverted page table: one entry per physical frame, so the entry’s index is fixed to a frame of physical memory </li></ul><ul><li>May need to search the whole page table sequentially </li></ul><ul><li>Using a hash table to speed up this search </li></ul><ul><li>FIG9.15 </li></ul>
146. 146. Paging <ul><li>Shared pages </li></ul><ul><li>Reentrant code: non-self-modifying code; it never changes during execution </li></ul><ul><li>If the code is reentrant, it can be shared </li></ul><ul><li>FIG9.16 </li></ul><ul><li>Inverted page tables have difficulty implementing shared memory. WHY? </li></ul><ul><li>Sharing requires two virtual addresses mapped to one physical address, but an inverted table has only one entry per physical page </li></ul>
147. 147. 9.5 Segmentation <ul><li>Segment: variable-sized (page: fixed-sized) </li></ul><ul><li>Each segment has a name and length </li></ul><ul><li>Segment table: base (starting physical address) and limit (length of the segment) </li></ul><ul><li>FIG9.18, 9.19 </li></ul><ul><li>Advantages: </li></ul><ul><li>1. association with protection (HOW?): the memory-mapping hardware can check the protection bits associated with each segment-table entry </li></ul><ul><li>2. Permits the sharing of code or data ( FIG9.19 ). Need to search for the shared segment’s number </li></ul>
148. 148. Segmentation <ul><li>Fragmentation </li></ul><ul><li>May cause external fragmentation: when all blocks of free memory are too small to accommodate a segment </li></ul><ul><li>What’s the suitable segment size? </li></ul><ul><li>One segment per process <=> one segment per byte </li></ul>
  149. 149. 9.6 Segmentation with Paging <ul><li>Local descriptor table (LDT): private to the process </li></ul><ul><li>Global descriptor table (GDT): shared among all processes </li></ul><ul><li>Linear address </li></ul><ul><li>FIG9.21 </li></ul>
150. 150. Ch. 10 Virtual Memory 10.1 Background <ul><li>Virtual memory: execution of processes that may not be completely in memory </li></ul><ul><li>Program size may exceed physical memory size </li></ul><ul><li>Virtual space: programmers can assume they have unlimited memory for their programs </li></ul><ul><li>Increases memory utilization and throughput: many programs can reside in memory and run at the same time </li></ul><ul><li>Less IO is needed to swap users’ programs into memory => they run faster </li></ul><ul><li>Demand paging and demand segmentation (more complex due to varied segment sizes) </li></ul>FIG10.1
151. 151. 10.2 Demand Paging <ul><li>Lazy swapper: never swaps a page into memory unless it is needed </li></ul><ul><li>Valid/invalid bit: indicates whether the page is in memory or not ( FIG10.3 ) </li></ul><ul><li>Handling a page fault (FIG10.4). </li></ul><ul><li>Pure demand paging: never bring a page into memory until it is required (execution proceeds one page at a time) </li></ul><ul><li>One instruction may cause multiple page faults (1 page for the instruction and several for data): not so bad in practice, because of locality of reference! </li></ul>
152. 152. Demand Paging <ul><li>EX: three-address instruction C=A+B: 1) fetch instruction, 2) fetch A, 3) fetch B, 4) add A and B, and 5) store to C. The worst case: 4 page faults </li></ul><ul><li>The hardware for supporting demand paging: page table and secondary memory (disks) </li></ul><ul><li>Page-fault service: 1) interrupt, 2) read the page, and 3) restart the process </li></ul><ul><li>Effective access time (EAT): </li></ul><ul><li>ma: memory access time (10-200ns) </li></ul><ul><li>p: the probability of a page fault (0<=p<=1) </li></ul><ul><li>EAT= (1-p)*ma + p*page fault time </li></ul><ul><li>= 100 + 24,999,900*p (ma=100ns, page fault time = 25ms) </li></ul><ul><li>=> p<0.0000004 for <10% degradation => fewer than one page fault per 2,500,000 memory accesses </li></ul>Disk 8ms: ave latency 15ms: seek 1ms: transfer
153. 153. 10.3 Page Replacement <ul><li>Over-allocating: increases the degree of multiprogramming </li></ul><ul><li>Page replacement: 1) find the desired page on disk, 2) find a free frame - if there is one, use it; otherwise, select a victim via a page-replacement algorithm, write the victim page to disk, and update the page/frame tables, 3) load the desired page into the free frame, 4) restart the process </li></ul><ul><li>Modify (dirty) bit: reduces the overhead: only if the page is dirty (has been changed) do we have to write it back to the disk. </li></ul>
154. 154. Page Replacement <ul><li>Need a frame-allocation and a page-replacement algorithm: aim for the lowest page-fault rate </li></ul><ul><li>Reference string: analyze page faults vs. # of frames </li></ul><ul><li>FIFO page replacement: </li></ul><ul><li>Simple but not always good ( FIG10.8 ) </li></ul><ul><li>Belady’s anomaly: the number of page faults may increase as the number of frames increases! ( FIG10.9 ) </li></ul>
  155. 155. Page Replacement <ul><li>Optimal page replacement: </li></ul><ul><li>Replace the page that will not be used for the longest period of time ( FIG10.10 ) </li></ul><ul><li>Has the lowest page-fault rate for a fixed number of frames (the optimum solution) </li></ul><ul><li>Difficult to implement: WHY? => need to predict the future usage of the pages! </li></ul><ul><li>Can be used as a reference point! </li></ul>
156. 156. Page Replacement <ul><li>LRU page replacement: </li></ul><ul><li>Replace the page that has not been used for the longest period of time ( FIG10.11 ) </li></ul><ul><li>The results are usually good </li></ul><ul><li>How to implement it? 1) counter and 2) stack ( FIG10.12 ) </li></ul><ul><li>Stack algorithms (such as LRU) do not suffer from Belady’s anomaly </li></ul>
157. 157. Page Replacement <ul><li>LRU approximation page replacement </li></ul><ul><li>Reference bit: set by hardware: indicates whether the page has been referenced </li></ul><ul><li>Additional-reference-bits algorithm: at regular intervals, the OS shifts the reference bit into the MSB of an 8-bit history byte (11000000 has been used more recently than 01011111) </li></ul><ul><li>Second-chance algorithm: if ref-bit=1, give the page a second chance and clear the ref-bit: implemented with a circular queue ( FIG10.13 ) </li></ul>
158. 158. Page Replacement <ul><li>Enhanced second-chance algorithm </li></ul><ul><li>(0,0): neither recently used nor modified - best one to replace </li></ul><ul><li>(0,1): not recently used but modified - needs a write-back </li></ul><ul><li>(1,0): recently used but clean - probably will be used again </li></ul><ul><li>(1,1): recently used and modified </li></ul><ul><li>We may have to scan the circular queue several times before we find a page to replace </li></ul>
  159. 159. Page Replacement <ul><li>Counting-based page replacement </li></ul><ul><li>the least frequently used (LFU) page-replacement algorithm </li></ul><ul><li>the most frequently used (MFU) page-replacement algorithm </li></ul><ul><li>Page-buffering algorithm </li></ul><ul><li>Keep a pool of free frame: we can write the page into a free frame before we need to write a page out of the frame </li></ul>
160. 160. 10.4 Allocation of Frames <ul><li>How many free frames should each process get? </li></ul><ul><li>Minimum number of frames </li></ul><ul><li>It depends on the instruction-set architecture: we must have enough frames to hold all the pages that any single instruction can reference </li></ul><ul><li>It also depends on the computer architecture: e.g., on the PDP-11, some instructions are longer than one word (an instruction may straddle 2 pages), and its 2 operands may be indirect references (4 pages) => needs 6 frames </li></ul><ul><li>Indirect addressing may cause problems (we can limit the levels of indirection, e.g., to 16) </li></ul>
161. 161. Allocation of Frames <ul><li>Allocation Algorithms </li></ul><ul><li>Equal allocation </li></ul><ul><li>Proportional allocation: allocating memory to each process according to its size </li></ul><ul><li>Global allocation: allows high-priority processes to select frames from low-priority processes (problem? A process cannot control its own page-fault rate) </li></ul><ul><li>Local allocation: each process selects from its own set of frames </li></ul><ul><li>Which one is better? Global allocation: higher throughput </li></ul>
  162. 162. 10.5 Thrashing <ul><li>Thrashing: high paging activity (a severe performance problem) </li></ul><ul><li>A process is thrashing if it is spending more time paging than executing </li></ul><ul><li>The CPU scheduler sees decreasing CPU utilization => increases the degree of multiprogramming => more page faults => things get worse and worse ( FIG10.14 ) </li></ul><ul><li>Preventing thrashing: we must provide a process with as many frames as it needs </li></ul><ul><li>Locality: a process executes from locality to locality </li></ul>
  163. 163. Thrashing <ul><li>Suppose we allocate enough frames to a process to accommodate its current locality. It will not fault until it changes localities </li></ul><ul><li>Working-set model => locality </li></ul><ul><li>Working set: the most actively used pages within the working-set window (period) ( FIG10.16 ) </li></ul><ul><li>The accuracy of the working set depends on the selection of the working-set window (too small: will not encompass the entire locality; too large: will overlap several localities) </li></ul>
  164. 164. Thrashing <ul><li>WSS_i: the working-set size of process i </li></ul><ul><li>D: the total demand for frames </li></ul><ul><li>D = sum WSS_i </li></ul><ul><li>If the total demand is greater than the total number of available frames (D>m), thrashing will occur </li></ul><ul><li>If D<=m, allocate frames to the processes; otherwise the OS suspends some processes </li></ul><ul><li>Difficulty: how to keep track of the working set (it’s a moving window) </li></ul>
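The working-set bookkeeping above can be sketched as follows (illustrative; `delta` is the working-set window, and the reference strings are hypothetical):

```python
def working_set(refs, t, delta):
    """Working set at time t: the distinct pages among the last
    `delta` references of one process's reference string."""
    return set(refs[max(0, t - delta + 1): t + 1])

def total_demand(ref_strings, t, delta):
    """D = sum of WSS_i over all processes; if D > m (available
    frames), thrashing will occur and some process should be suspended."""
    return sum(len(working_set(r, t, delta)) for r in ref_strings)
```

A real kernel cannot afford this exact computation on every reference; the textbook's approximation samples the reference bits on timer interrupts instead.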
  165. 165. Thrashing <ul><li>Page-fault frequency (PFF): establish lower and upper bounds on the desired page-fault rate ( FIG10.17 ) </li></ul><ul><li>Below the lower bound: the process may have too many frames (remove frames from it) </li></ul><ul><li>Above the upper bound: the process may not have enough frames (add frames to it) </li></ul>
  166. 166. 10.6 OS Examples <ul><li>NT: demand pages with clustering, working-set minimum/maximum, automatic working-set trimming </li></ul><ul><li>Solaris 2: minfree/lotsfree, pageout starts when reaches minfree, two-handed-clock algorithm </li></ul>
  167. 167. 10.7 Other Considerations <ul><li>Prepaging </li></ul><ul><li>bring into memory at one time all the pages that will be needed </li></ul><ul><li>prepage s pages; alpha: the fraction of s actually used (0<=alpha<=1) </li></ul><ul><li>if alpha->0, prepaging loses; if alpha->1, prepaging wins </li></ul><ul><li>Page size </li></ul><ul><li>How to determine page size? </li></ul><ul><li>Size of the page table: large page size => small page table </li></ul>
  168. 168. Other Considerations <ul><li>Smaller page size => less internal fragmentation => better memory utilization </li></ul><ul><li>Page read/write time: a large page size minimizes IO time per byte </li></ul><ul><li>Smaller page size => better locality (finer resolution) => less total IO </li></ul><ul><li>The historical trend is toward larger page sizes: WHY? CPU and memory speeds have grown much faster than disk speeds, at the cost of more internal fragmentation </li></ul>
  169. 169. Other Considerations <ul><li>Inverted page table: reduces the memory needed to store the virtual-to-physical translation table </li></ul><ul><li>Program structure: increase locality => lower page-fault rate (e.g., stack access is good, hashing is not) </li></ul><ul><li>Compiler and loader also affect paging: separate code (it is never modified) from data </li></ul><ul><li>Frequent use of pointers (C, C++) tends to randomize memory access: not good; Java is better: no pointers </li></ul>
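The effect of program structure on locality can be demonstrated with a toy simulation (illustrative, not from the slides: a hypothetical 128x128 array stored one row per page, traversed in two orders, with a FIFO-managed set of frames):

```python
from collections import deque

def page_faults(accesses, page_size, frames):
    """Count page faults for a sequence of addresses under FIFO replacement."""
    resident, fifo, faults = set(), deque(), 0
    for addr in accesses:
        page = addr // page_size
        if page not in resident:
            faults += 1
            if len(resident) == frames:       # evict the oldest page
                resident.discard(fifo.popleft())
            resident.add(page)
            fifo.append(page)
    return faults

N = 128  # hypothetical N x N array, one row per page
row_major = [i * N + j for i in range(N) for j in range(N)]  # good locality
col_major = [i * N + j for j in range(N) for i in range(N)]  # poor locality
```

With only 4 frames, the row-major walk faults once per row (128 faults) while the column-major walk faults on every access (128 * 128 faults) — the same data, restructured, pages two orders of magnitude better.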
  170. 170. Other Considerations <ul><li>IO interlock </li></ul><ul><li>Allow some of the pages to be locked in memory (for IO operations) </li></ul><ul><li>A lock bit may be dangerous: it may get turned on but never turned off </li></ul>
  171. 171. Ch. 11 File Systems 11.1 File Concept <ul><li>File: a named collection of related information that is recorded on secondary storage </li></ul><ul><li>Types: text (source), object (executable) files </li></ul><ul><li>Attributes: name, type, location, size, protection, time/date/user id </li></ul><ul><li>Operations: creating, writing, reading, repositioning, deleting, truncating (delete the content only), appending, renaming, copying </li></ul><ul><li>Info associated with an open file: file pointer, file open count, disk location of the file </li></ul>
  172. 172. File Concept <ul><li>Memory mapping: multiple processes may be allowed to map the same file into their virtual memory, to allow sharing of data </li></ul><ul><li>File types: name => a name + an extension </li></ul><ul><li>File structure: the more structures there are, the more the OS must support => support a minimum number of file structures (UNIX, MS-DOS: a file is a sequence of 8-bit bytes, no interpretation): each application program must provide its own code to interpret an input file into the appropriate structure </li></ul>
  173. 173. File Concept <ul><li>Internal file structure: packing a number of logical records into physical blocks (all file systems suffer from internal fragmentation) </li></ul><ul><li>Consistency semantics: specify when modifications of data by one user become observable by other users </li></ul><ul><li>UNIX: writes to an open file by one user are visible immediately to other users; sharing the pointer to the current location in the file </li></ul>
  174. 174. 11.2 Access Methods <ul><li>Sequential access (in order) </li></ul><ul><li>Direct access (no order, random): relative block number - an index relative to the beginning of the file </li></ul><ul><li>Index file: contains pointers to the various blocks </li></ul><ul><li>Index of the index file (if the index is too big) </li></ul>
  175. 175. 11.3 Directory Structure <ul><li>The file system is divided into partitions (IBM: minidisks, PC/Macintosh: volumes) </li></ul><ul><li>Partitions can be thought of as virtual disks </li></ul><ul><li>Device directory or volume table of contents </li></ul><ul><li>Operations on a directory: search for a file, create a file, delete a file, list a directory, rename a file, traverse the file system </li></ul>
  176. 176. Directory Structure <ul><li>Single-level directory </li></ul><ul><li>Simple but not good when there are too many files or too many users </li></ul><ul><li>All files are in the same directory and need unique names </li></ul><ul><li>Two-level directory </li></ul><ul><li>Each user has his/her own user file directory (UFD) under the system’s master file directory (MFD) </li></ul><ul><li>Solves the name-collision problem but isolates users from each other (not good for cooperation) </li></ul><ul><li>Path name, search path </li></ul>
  177. 177. Directory Structure <ul><li>Tree-structured directory </li></ul><ul><li>A directory contains a set of files or subdirectories </li></ul><ul><li>Each user has a current directory </li></ul><ul><li>Path names: absolute - begins at the root (root/spell/mail); relative - from the current directory (prt/first => root/spell/mail/prt/first) </li></ul><ul><li>How to delete a directory? The directory must be empty: what if the directory contains several subdirectories? </li></ul><ul><li>UNIX “rm -r”: removes everything, but dangerous </li></ul><ul><li>Users can access their own and all others’ files </li></ul>
  178. 178. Directory Structure <ul><li>Acyclic-graph directories </li></ul><ul><li>Can a tree structure share files and directories? NO </li></ul><ul><li>A graph with no cycles, allows directories to have shared directories and files ( FIG11.9 ) </li></ul><ul><li>UNIX: link: a pointer to another file or subdirectory </li></ul><ul><li>Symbolic link: a link is implemented as an absolute or relative path name </li></ul><ul><li>Duplicate all info in both sharing directories: consistency issue </li></ul>
  179. 179. Directory Structure <ul><li>Concerns for implementing an acyclic-graph directory </li></ul><ul><li>1. A file may have multiple absolute path names </li></ul><ul><li>2. When can the space allocated to a shared file be deleted and reused? </li></ul><ul><li>These are easy to handle with symbolic links </li></ul><ul><li>Need a mechanism to indicate when all references to a file are deleted </li></ul><ul><li>1. File-reference list: potentially large </li></ul><ul><li>2. Counter: 0 = no more references: cheaper (UNIX: hard link) </li></ul>
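The counter option can be sketched as follows (an illustrative model of UNIX-style hard-link counting, not real kernel code; `Inode`, `link`, and `unlink` here are hypothetical names):

```python
class Inode:
    """The file's blocks are reclaimed only when the last
    directory entry referring to the inode is removed."""
    def __init__(self):
        self.links = 0      # the reference counter
        self.freed = False

def link(directory, name, inode):
    directory[name] = inode  # a new directory entry = one more reference
    inode.links += 1

def unlink(directory, name):
    inode = directory.pop(name)
    inode.links -= 1
    if inode.links == 0:     # no references left: reclaim the space
        inode.freed = True
```

Deleting one of two links leaves the file intact; only removing the last link frees the space, which answers concern 2 above without a costly file-reference list.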
  180. 180. Directory Structure <ul><li>An acyclic graph is complicated; some systems do not allow shared directories or links (MS-DOS: tree) </li></ul><ul><li>General graph directory </li></ul><ul><li>The problem with the acyclic-graph approach: it is difficult to ensure that no cycles appear. WHY? </li></ul><ul><li>Adding links to a tree-structured directory yields a general graph directory ( FIG11.10 ) </li></ul><ul><li>Cycle problem: one solution is to limit the depth (number of directories) of a search </li></ul><ul><li>Garbage collection: 1st pass identifies files/directories that can be freed, 2nd pass frees the space: time consuming </li></ul>
  181. 181. 11.4-5 File-System Mounting & Sharing <ul><li>A file must be opened before it can be used; a file system must be mounted before it is available to processes </li></ul><ul><li>Multiple users sharing files: security and consistency issues </li></ul><ul><li>Remote file systems </li></ul>
  182. 182. 11.6 Protection <ul><li>Reliability: guarding against physical damage (duplicate copies of files) </li></ul><ul><li>Protection: guarding against improper access </li></ul><ul><li>Controlled access: read, write, execute, append, delete, list </li></ul><ul><li>Access list and group: owner, group, universe (UNIX: rwx) </li></ul><ul><li>Other protection approaches: one password for every file (who can remember so many passwords?), one password for a set of files </li></ul>
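The UNIX rwx scheme for owner/group/universe can be decoded with a small sketch (illustrative; `mode_string` is a hypothetical helper, not a real API):

```python
def mode_string(mode):
    """Render a 9-bit UNIX permission mode as three rwx triples:
    owner, group, universe (e.g., 0o754 -> 'rwxr-xr--')."""
    bits = [(0o400, 'r'), (0o200, 'w'), (0o100, 'x'),   # owner
            (0o040, 'r'), (0o020, 'w'), (0o010, 'x'),   # group
            (0o004, 'r'), (0o002, 'w'), (0o001, 'x')]   # universe
    return ''.join(ch if mode & bit else '-' for bit, ch in bits)
```

So mode 0o754 gives the owner full access, the group read and execute, and everyone else read only — a compact access list with three classes of users.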
  183. 183. Ch. 12 File-System Implementation 12.1 File-System Structure <ul><li>To improve IO efficiency, transfers between disk and memory are done in blocks </li></ul><ul><li>File systems: allow data to be stored, located, and retrieved easily </li></ul><ul><li>Two issues: 1) how the file system should look to the user and 2) the algorithms and data structures needed to implement it </li></ul><ul><li>A layered design: application programs => logical file system (directories) => file-organization module => basic file system => IO control => devices </li></ul>
  184. 184. 12.2 File-System Implementation <ul><li>Open-file table: files must be opened before they can be used for IO procedures </li></ul><ul><li>File descriptor, file handle (NT), file control block </li></ul><ul><li>Mounted: the file system must be mounted before it can be available to processes on the system (/home/jane = /user/jane) </li></ul>
  185. 185. 12.3 Directory Implementation <ul><li>Linear list </li></ul><ul><li>Hash table </li></ul>
  186. 186. 12.4 Allocation Methods <ul><li>Contiguous, linked, and indexed </li></ul><ul><li>Contiguous allocation </li></ul><ul><li>Each file occupies a set of contiguous blocks on the disk ( FIG12.5 ) </li></ul><ul><li>The number of disk seeks is minimal </li></ul><ul><li>It is easy to access a file that is allocated a set of contiguous blocks </li></ul><ul><li>Supports both sequential and direct access </li></ul><ul><li>Difficult to find space for a new file (dynamic storage-allocation algorithms: best-fit, first-fit, worst-fit) </li></ul>
  187. 187. Allocation Methods <ul><li>Causes external fragmentation </li></ul><ul><li>Run a repacking routine: disk->memory->disk: effective but time consuming (needs down time) </li></ul><ul><li>How much space is needed for a file? </li></ul><ul><li>1. Too little: the file cannot be extended </li></ul><ul><li>2. Too much: waste </li></ul><ul><li>Re-allocation: find a larger hole, copy the contents, and repeat the process: slow!!! </li></ul><ul><li>Preallocate, then add an extent (with a link) when more space is needed </li></ul>
  188. 188. Allocation Methods <ul><li>Linked allocation </li></ul><ul><li>Each block contains a pointer to the next block ( FIG12.6 ) </li></ul><ul><li>No external fragmentation: no need to compact space </li></ul><ul><li>Disadvantages: </li></ul><ul><li>1. Can be used effectively only for sequential-access files; inefficient for direct-access files </li></ul><ul><li>2. Overhead for the pointers </li></ul>
  189. 189. Allocation Methods <ul><li>One solution to these problems: collect blocks into clusters </li></ul><ul><li>Reliability is another problem: what if a link is lost? </li></ul><ul><li>File allocation table (FAT): MS-DOS and OS/2: FIG12.7 </li></ul><ul><li>The FAT allocation scheme may require a significant number of disk head seeks, unless the FAT is cached </li></ul>
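Following a FAT chain can be sketched as follows (illustrative; the block numbers and the -1 end-of-file sentinel are assumptions, not the real on-disk encoding):

```python
EOF = -1  # hypothetical end-of-file marker in the table

def file_blocks(fat, start):
    """Return the list of disk blocks of the file whose directory
    entry names `start` as its first block; each FAT entry holds
    the number of the next block in the file."""
    blocks = []
    b = start
    while b != EOF:
        blocks.append(b)
        b = fat[b]       # one table lookup per block — why caching the FAT matters
    return blocks
```

Because every hop is a table lookup rather than a disk read of the data block itself, keeping the whole FAT in memory turns direct access into simple pointer chasing.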
  190. 190. Allocation Methods <ul><li>Indexed allocation </li></ul><ul><li>Index block: brings all the pointers into one place </li></ul><ul><li>Each file has its own index block that contains an array of disk-block addresses ( FIG12.8 ) </li></ul><ul><li>Supports direct access, no external fragmentation </li></ul><ul><li>Overhead of the index block > pointers of linked allocation </li></ul><ul><li>How large should the index block be? </li></ul><ul><li>1. Linked scheme </li></ul><ul><li>2. Multilevel index </li></ul><ul><li>3. Combined scheme ( FIG12.9 ) </li></ul>
  191. 191. Allocation Methods <ul><li>Performance </li></ul><ul><li>Contiguous allocation: needs one access to get a disk block </li></ul><ul><li>Linked allocation: needs i accesses to get the i-th block: no good for direct-access applications </li></ul><ul><li>Direct-access files use contiguous allocation, sequential-access files use linked allocation </li></ul><ul><li>Indexed allocation: depends on the index structure, file sizes, etc. (what if the index block is too large to stay in memory at all times? Swap the index block in and out?) </li></ul>
  192. 192. 12.5 Free-Space Management <ul><li>Free-space list </li></ul><ul><li>1. Bit vector: each bit represents one block: simple, but only effective if the entire bit vector stays in memory (a 1.3GB disk with 512-byte blocks needs a 332KB bit vector) </li></ul><ul><li>2. Linked list: traversal needs substantial IO time, but luckily it is not a frequent action </li></ul><ul><li>3. Grouping: store the addresses of n free blocks in the first block (n-1 actual free blocks): a large number of free blocks can be found quickly </li></ul><ul><li>4. Counting: instead of keeping all the addresses, store the address of the 1st free block and a count n of contiguous free blocks </li></ul>
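The bit-vector search can be sketched as follows (illustrative; the 32-bit word size and the convention that 1 means free are assumptions):

```python
def first_free(bitmap):
    """Find the first free block in a bit vector stored as 32-bit words
    (1 = free, 0 = allocated). Whole words of zeros are skipped cheaply,
    which is what makes the scheme simple and fast in practice."""
    for word_idx, word in enumerate(bitmap):
        if word != 0:                                  # some free block here
            offset = (word & -word).bit_length() - 1   # index of lowest set bit
            return word_idx * 32 + offset
    return None  # disk full
```

Real implementations use a machine instruction to find the first set bit in a word, so the scan costs one comparison per 32 (or 64) blocks.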
  193. 193. 12.6 Recovery <ul><li>Consistency checking: comparing the data in the directory structure with the data blocks on disk </li></ul><ul><li>The loss of a directory entry on an indexed allocation system could be disastrous </li></ul><ul><li>Backup and restore </li></ul>
  194. 194. Ch. 13 IO Systems 13.1 Overview <ul><li>IO devices vary widely in their function and speed. Hence we need a variety of methods to control them </li></ul><ul><li>Two conflicting trends: 1) increasing standardization of hardware and software interface and 2) increasing broad variety of IO devices </li></ul><ul><li>Device-driver modules: kernel, encapsulation </li></ul>
  195. 195. 13.2 IO Hardware <ul><li>Many types of devices: storage devices (disks, tapes), transmission devices (modems, network cards), human-interface devices (mouse, screen, keyboard) </li></ul><ul><li>Port, bus, daisy chain (serially connected devices) </li></ul><ul><li>Controller: e.g., a serial-port controller </li></ul><ul><li>Host adapter: contains a processor, microcode, and memory for complex protocols (SCSI) </li></ul><ul><li>PC bus structure: FIG13.1 </li></ul>
  196. 196. IO Hardware <ul><li>How can the processor give commands and data to a controller to accomplish an IO transfer? </li></ul><ul><li>Special IO instructions </li></ul><ul><li>Memory-mapped IO: faster if need to transfer a large amount of data (screen display). Disadvantage? Software fault! </li></ul><ul><li>IO port (4 registers): status, control, data-in and data-out </li></ul><ul><li>Some controllers use FIFO </li></ul>
  197. 197. IO Hardware <ul><li>Polling </li></ul><ul><li>Handshaking: in the first step, the host repeatedly monitors the busy bit: busy-waiting or polling (like going to the door every minute to check whether someone is there) </li></ul><ul><li>Interrupts </li></ul><ul><li>Interrupt-request line: like a doorbell ringing to announce that someone is at the door </li></ul><ul><li>Interrupt-driven IO cycle: FIG13.3 </li></ul>
  198. 198. IO Hardware <ul><li>Interrupt-request lines: nonmaskable interrupt - reserved for events such as nonrecoverable errors; maskable interrupt - can be turned off by the CPU </li></ul><ul><li>Interrupt vector: the memory addresses of specialized interrupt handlers </li></ul><ul><li>Interrupt priority levels </li></ul><ul><li>Interrupts in an OS: at boot time, during IO, on exceptions </li></ul><ul><li>Other uses of interrupts: page faults (virtual memory), system calls (software interrupt or trap), managing the control flow (yield from a low-priority job to a high-priority one) </li></ul>
  199. 199. IO Hardware <ul><li>A threaded kernel architecture is well suited to implementing multiple interrupt priorities and enforcing the precedence of interrupt handling over background processing in kernel and application routines </li></ul><ul><li>Direct memory access ( FIG13.5 ) </li></ul><ul><li>Programmed IO (PIO): 1 byte transferred at a time </li></ul><ul><li>A DMA controller operates on the memory bus directly </li></ul><ul><li>DMA seizes the memory bus => the CPU cannot access main memory => it can still access data in its primary and secondary caches => CYCLE STEALING </li></ul>
  200. 200. 13.3 Application IO Interface <ul><li>A kernel IO structure ( FIG13.6 ): encapsulation </li></ul><ul><li>Devices’ characteristics ( FIG13.7 ) </li></ul><ul><li>1. Character stream or block </li></ul><ul><li>2. Sequential or random-access </li></ul><ul><li>3. Synchronous or asynchronous </li></ul><ul><li>4. Sharable (can be used by concurrent threads) or dedicated </li></ul><ul><li>5. Speed of operation </li></ul><ul><li>6. Read/write, read only, or write only </li></ul><ul><li>Escape or back-door system call (UNIX ioctl) </li></ul>
  201. 201. Application IO Interface <ul><li>Block and character devices </li></ul><ul><li>Block device: read, write, seek (memory-mapped file access can be layered on top of block-device drivers) </li></ul><ul><li>Character stream: keyboard, mouse, modems </li></ul><ul><li>Network devices </li></ul><ul><li>Network socket interface (UNIX, NT) </li></ul><ul><li>Clocks and timers </li></ul><ul><li>Give the current time or the elapsed time, or set a timer to trigger operation X at time T (programmable interval timer) </li></ul>
  202. 202. Application IO Interface <ul><li>Blocking and nonblocking IO </li></ul><ul><li>Blocking IO system call: the execution of the application is suspended (run -> wait queue) </li></ul><ul><li>Nonblocking IO: e.g., receive mouse and keyboard input while processing and displaying data on the screen: </li></ul><ul><li>1. One solution: overlap execution with IO using a multithreaded application </li></ul><ul><li>2. Asynchronous system call: returns immediately, without waiting for the IO to complete </li></ul>
  203. 203. 13.4 Kernel IO Subsystem <ul><li>IO scheduling </li></ul><ul><li>To improve the overall performance </li></ul><ul><li>Buffering </li></ul><ul><li>Copes with speed mismatches </li></ul><ul><li>Copes with different data-transfer sizes </li></ul><ul><li>Supports copy semantics for application IO </li></ul><ul><li>Double buffering: write to one buffer while reading from the other </li></ul>
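The double-buffering idea can be shown with a toy simulation (illustrative only; a real kernel does the fill and drain concurrently, while this sketch alternates them in one loop):

```python
def double_buffer_copy(chunks):
    """Pass `chunks` through two alternating buffers: the producer
    (device) fills one buffer while the consumer (application)
    drains the other, then the roles swap."""
    buffers = [None, None]
    out, fill = [], 0
    for data in chunks:
        buffers[fill] = data            # device writes into the "fill" buffer
        drain = 1 - fill
        if buffers[drain] is not None:  # application empties the other buffer
            out.append(buffers[drain])
            buffers[drain] = None
        fill = 1 - fill                 # swap roles
    for b in buffers:                   # flush whatever remains at the end
        if b is not None:
            out.append(b)
    return out
```

The payoff is that neither side ever waits for the other's buffer, decoupling a fast producer from a slower consumer (or vice versa).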
  204. 204. Kernel IO Subsystem <ul><li>Caching </li></ul><ul><li>Using a fast memory to hold copies of data </li></ul><ul><li>Spooling and device reservation </li></ul><ul><li>A spool is a buffer that holds output for a device, such as a printer, that can serve only one job at a time </li></ul><ul><li>Each application’s output is spooled to a separate disk file. </li></ul><ul><li>Error handling </li></ul><ul><li>An OS uses protected memory </li></ul><ul><li>Sense key (used by the SCSI protocol to identify failures) </li></ul>
  205. 205. Kernel IO Subsystem <ul><li>Kernel data structures </li></ul><ul><li>UNIX IO kernel structure (FIG13.9) </li></ul><ul><li>The IO subsystem supervises: </li></ul><ul><li>1. Management of the name space for files/devices </li></ul><ul><li>2. Access control to files/devices </li></ul><ul><li>3. Operation control </li></ul><ul><li>4. File system space allocation </li></ul><ul><li>5. Device allocation </li></ul><ul><li>6. Buffering, caching, and spooling </li></ul><ul><li>7. IO scheduling </li></ul><ul><li>8. Device status monitoring, error handling and failure recovery </li></ul><ul><li>9. Device driver configuration and initialization </li></ul>
  206. 206. 13.5 IO Requests Handling <ul><li>How’s the connection made from the file name to the disk controller? </li></ul><ul><li>Device table (MS-DOS): “c:” </li></ul><ul><li>Mount table (UNIX): </li></ul><ul><li>Stream (UNIX V): a full-duplex connection between a device driver and a user-level process </li></ul><ul><li>The life cycle of an IO request ( FIG13.10 ) </li></ul>
  207. 207. 13.6-7 STREAMS & Performance <ul><li>IO is a major factor in system performance </li></ul><ul><li>Context switching is a main factor </li></ul><ul><li>Interrupts are relatively expensive: state change => execute the interrupt handler => restore the state </li></ul><ul><li>Network traffic can also cause a high context-switch rate ( FIG13.11 ) </li></ul><ul><li>Telnet daemon (Solaris): uses in-kernel threads to eliminate the context switches </li></ul><ul><li>Front-end processor, terminal concentrator (multiplexing many remote terminals to one port), IO channel </li></ul>
  208. 208. Performance <ul><li>Several principles to improve efficiency of IO: </li></ul><ul><li>Reduce # of context switches </li></ul><ul><li>Reduce # of times that data needs to be copied to memory while they are passing between device and application </li></ul><ul><li>Reduce the frequency of interrupts </li></ul><ul><li>Increasing usage of DMA </li></ul><ul><li>Move processing primitives into hardware (allowing concurrent CPU and bus operations) </li></ul><ul><li>Balance load between CPU, memory subsystem, and IO </li></ul>
  209. 209. Performance <ul><li>Device-functionality progression ( FIG13.12 ) </li></ul><ul><li>Where should the IO functionality be implemented??? </li></ul><ul><li>Application-level: easy, flexible, inefficient (high context switches) </li></ul><ul><li>Kernel level: difficulty but efficient </li></ul><ul><li>Hardware level: inflexible, expensive </li></ul>
  210. 210. Ch. 14 Mass-Storage Structure 14.1 Disk Structure <ul><li>Magnetic tape (slower than disk): backup use </li></ul><ul><li>Converting a logical block number to a disk address: two problems: 1) most disks have some defective sectors and 2) the # of sectors per track is not a constant </li></ul><ul><li>Cylinder, track, sector </li></ul>
  211. 211. 14.2 Disk Scheduling <ul><li>Seek time: the time for the disk arm to move the heads to the cylinder containing the desired sector </li></ul><ul><li>Rotational latency: the time waiting for the disk to rotate the desired sector to the head </li></ul><ul><li>Bandwidth: the total # of bytes transferred, divided by the total time from request to completion </li></ul><ul><li>FCFS scheduling (first-come, first-served) </li></ul><ul><li>Simple, but does not generally provide the fastest service ( FIG14.1 ) </li></ul>
  212. 212. Disk Scheduling <ul><li>SSTF scheduling (shortest-seek-time-first) </li></ul><ul><li>Selects the request with the minimum seek time from the current head position ( FIG14.2 ) </li></ul><ul><li>It may cause starvation of some requests </li></ul><ul><li>It is much better than FCFS but not optimal </li></ul><ul><li>SCAN scheduling </li></ul><ul><li>The head continuously scans back and forth across the disk ( FIG14.3 ): also called elevator algorithm </li></ul><ul><li>C-SCAN scheduling </li></ul><ul><li>Reach one end immediately return to the other end ( FIG14.4 ) </li></ul>
  213. 213. Disk Scheduling <ul><li>LOOK scheduling </li></ul><ul><li>Go as far as the final request in each direction, then reverse (without going to the end of the disk) ( FIG14.5 ) </li></ul><ul><li>Selection of a disk-scheduling algorithm </li></ul><ul><li>Heavy load: SCAN and C-SCAN perform better, due to no starvation problem </li></ul><ul><li>Finding an optimal schedule? The computational expense may not justify the savings over SSTF or SCAN </li></ul><ul><li>Request performance may be affected by the file-allocation method (contiguous vs. linked/indexed files) </li></ul>
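The gap between FCFS and SSTF can be measured with a toy simulation (illustrative; the request queue and starting head position are assumed to be the classic example used in the FIG14.x figures: head at cylinder 53, queue 98, 183, 37, 122, 14, 124, 65, 67):

```python
def fcfs(head, queue):
    """Total head movement when requests are served in arrival order."""
    total, pos = 0, head
    for cyl in queue:
        total += abs(cyl - pos)
        pos = cyl
    return total

def sstf(head, queue):
    """Total head movement when the closest pending request is always
    served next (may starve far-away requests under heavy load)."""
    pending, total, pos = list(queue), 0, head
    while pending:
        nxt = min(pending, key=lambda c: abs(c - pos))
        total += abs(nxt - pos)
        pos = nxt
        pending.remove(nxt)
    return total
```

On this queue FCFS moves the head 640 cylinders while SSTF needs only 236 — SCAN and LOOK would be simulated the same way, sweeping in one direction before reversing.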
  214. 214. Disk Scheduling <ul><li>Request performance is also affected by the location of directories and index blocks (e.g., if the index block is on the 1st cylinder and the data is on the last one) </li></ul><ul><li>Caching the directories and index blocks in main memory will help </li></ul><ul><li>The OS should have a module that includes a set of scheduling algorithms for different applications </li></ul><ul><li>What if the rotational latency is nearly as large as the average seek time? </li></ul>
  215. 215. 14.3 Disk Management <ul><li>Disk formatting: low-level (physical) formatting: divide the disk into sectors that the disk controller can read and write </li></ul><ul><li>Error-correcting code (ECC): on write, the controller computes and stores the ECC => on read, the ECC is recomputed => if it matches, ok => if not, there was an error, and it can be corrected </li></ul><ul><li>Using a disk to hold files: the OS needs to record its own data structures on the disk </li></ul><ul><li>1. Partition the disk into one or more groups of cylinders </li></ul><ul><li>2. Logical formatting (making a file system) </li></ul>
  216. 216. Disk Management <ul><li>Boot block (the initial bootstrap is stored in ROM), bootstrap loader, boot disk (system disk) </li></ul><ul><li>Bad blocks: why are disks so prone to defects? (moving parts) </li></ul><ul><li>What if format finds a bad block? Mark the FAT entry as unusable, or: </li></ul><ul><li>1. Sector sparing (forwarding): the OS preserves some spare sectors to replace bad blocks </li></ul><ul><li>2. Sector slipping: 17 is bad, so sectors 17-100 move to 18-101 </li></ul><ul><li>Can the replacement be fully automatic? No, users may need to fix the lost data manually </li></ul>
  217. 217. 14.4 Swap-Space Management <ul><li>Main goal: provide the best throughput for the virtual-memory system </li></ul><ul><li>Swap space: ranges from a few megabytes to hundreds of megabytes: it is safer to overestimate than to underestimate the swap space. WHY? (running out of swap space may force processes to be aborted) </li></ul><ul><li>Swap-space location: in the file system (easy but inefficient) or in a separate disk partition (efficient, but internal fragmentation may increase) </li></ul><ul><li>In 4.3BSD, swap space is allocated to a process when it is started: text segment ( FIG14.7 ) and data segment ( FIG14.8 ) = swap map </li></ul>
  218. 218. 14.5 RAID <ul><li>Disks used to be the least reliable component of a system (disk crash) </li></ul><ul><li>Disk striping (interleaving): uses a group of disks as one storage unit: improves performance and reliability </li></ul><ul><li>Redundant array of independent disks (RAID) </li></ul><ul><li>Mirroring or shadowing: keeps a duplicate copy of each disk </li></ul><ul><li>Block-interleaved parity: a small fraction of the disk space is used to hold parity blocks </li></ul><ul><li>What are the overheads? </li></ul>
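Block-interleaved parity rests on a simple XOR identity, sketched below (illustrative; real RAID operates on full disk blocks and updates parity incrementally on writes):

```python
def parity_block(blocks):
    """XOR a stripe's data blocks together into one parity block."""
    out = bytes(len(blocks[0]))
    for b in blocks:
        out = bytes(x ^ y for x, y in zip(out, b))
    return out

def reconstruct(surviving, parity):
    """Rebuild the block of a failed disk: XOR of the surviving data
    blocks and the parity block recovers the missing one."""
    return parity_block(surviving + [parity])
```

This is where the overheads come from: one extra disk's worth of space per parity group, and every small write must also read and rewrite the parity block.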
  219. 219. 14.7 Stable-Storage Implementation <ul><li>Information residing in stable storage is never lost </li></ul><ul><li>Whenever a failure occurs during the writing of a block, the system recovers and restores the block </li></ul><ul><li>1. Write the info to the first physical block </li></ul><ul><li>2. If it completes successfully, write the same info to the second physical block </li></ul><ul><li>3. Declare the operation complete only after the second write completes successfully </li></ul><ul><li>Data will be safe unless all copies are destroyed </li></ul>
  220. 220. 14.8 Tertiary-Storage Structure <ul><li>Removable media </li></ul><ul><li>Removable disks </li></ul><ul><li>Tapes </li></ul><ul><li>What are the considerations of the OS? </li></ul><ul><li>Application interface </li></ul><ul><li>File naming </li></ul>