2. 1: Introduction
Overview
The UNIX Process
What is a Process
Representing a process
States of a process
Creating and managing
processes
fork()
wait()
getpid()
exit()
etc.
Files in UNIX
Directories and Paths
User vs. System Mode
UNIX I/O primitives
A simple example
The basic I/O system
calls
Buffered vs. un-
buffered I/O
3. 1: Introduction
UNIX Processes
A program that has started is manifested
in the context of a process.
A process in the system is represented:
Process Identification Elements
Process State Information
Process Control Information
User Stack
Private User Address Space, Programs and Data
Shared Address Space
4. 1: Introduction
Process Control Block
Process Information, Process State Information,
and Process Control Information constitute the
PCB.
All Process State Information is stored in the
Process Status Word (PSW).
All information needed by the OS to manage the
process is contained in the PCB.
A UNIX process can be in a variety of states:
5. 1: Introduction
States of a UNIX Process
• User running: Process executes in user mode
• Kernel running: Process executes in kernel mode
• Ready to run in memory: process is waiting to be scheduled
• Asleep in memory: waiting for an event
• Ready to run swapped: ready to run but requires swapping in
• Preempted: Process is returning from kernel to user-mode
but the system has scheduled another process instead
• Created: Process is newly created and not ready to run
• Zombie: Process no longer exists, but it leaves a record for
its parent process to collect.
See Process State Diagram!!
7. 1: Introduction
Creating a new process
In UNIX, a new process is created by means of the fork() -
system call. The OS performs the following functions:
• It allocates a slot in the process table for the new process
• It assigns a unique ID to the new process
• It makes a copy of process image of the parent (except
shared memory)
• It assigns the child process to the Ready to Run State
• It returns the ID of the child to the parent process, and 0
to the child.
Note, the fork() call actually is called once but returns twice -
namely in the parent and the child process.
8. 1: Introduction
Fork()
Pid_t fork(void) is the prototype of the fork() call.
Remember that fork() returns twice
in the newly created (child) process with return value 0
in the calling process (parent) with return value = pid of
the new process.
A negative return value (-1) indicates that the call has
failed
Different return values are the key for
distinguishing parent process from child process!
The child process is an exact copy of the parent,
yet, it is a copy i.e. an identical but separate
process image.
9. 1: Introduction
A fork() Example
#include <unistd.h>
main()
{
pid_t pid /* process id */
printf(“just one process before the fork()n”);
pid = fork();
if(pid == 0)
printf(“I am the child processn”);
else if(pid > 0)
printf(“I am the parent processn”);
else
printf(“DANGER Mr. Robinson - the fork() has failedn”)
}
10. 1: Introduction
Basic Process Coordination
The exit() call is used to terminate a process.
Its prototype is: void exit(int status), where status is
used as the return value of the process.
exit(i) can be used to announce success and failure to the
calling process.
The wait() call is used to temporarily suspend the
parent process until one of the child processes
terminates.
The prototype is: pid_t wait(int *status), where status is
a pointer to an integer to which the child’s status
information is being assigned.
wait() will return with a pid when any one of the children
terminates or with -1 when no children exist.
11. 1: Introduction
more coordination
To wait for a particular child process to
terminate, we can use the waitpid() call.
Prototype: pid_t waitpid(pid_t pid, int *status, int opt)
Sometimes we want to get information about the
process or its parent.
getpid() returns the process id
getppid() returns the parent’s process id
getuid() returns the users id
use the manual pages for more id information.
12. 1: Introduction
Orphans and Zombies or MIAs
A child process whose parent has terminated is
referred to as orphan.
When a child exits when its parent is not currently
executing a wait(), a zombie emerges.
A zombie is not really a process as it has terminated but
the system retains an entry in the process table for the
non-existing child process.
A zombie is put to rest when the parent finally executes
a wait().
When a parent terminates, orphans and zombies
are adopted by the init process (prosess-id -1) of
the system.
13. 1: Introduction
Fun with Zombies!
Consider the following program:
#include <unistd.h>
main()
{
pid_t pid;
int num1, num2;
pid = fork();
printf(“pid=%d: enter 2 numbers:”, getpid());
scanf(“%d %d”, &num1 &num2);
printf(“n pid=%d: the sum is:n”, getpid(), num1
+ num2);
}
14. 1: Introduction
What is going on??
Who is getting to write to the standard output -
parent or child ??
Who is going to receive the first input ?
What happens if the child process completes while
the parent is still waiting for input ?
What UNIX tools would you use to get some
indication of what is going on ?
Exercise: Go try it !!
15. 1: Introduction
The exec() family
Why is the basic fork() facility not enough for a
serious system programmer?
Remember, a process consists among other things
of text and data segments that represent the
running program!
What if we could change the program code that
the text-segment represents?
===> The process would of course execute a
different program.
16. 1: Introduction
notable exec properties
an exec call transforms the calling process by
loading a new program in its memory space.
the exec does not create a new sub-process.
unlike the fork there is no return from a
successful exec.
all types of exec calls perform in principle the
same task.
----> see chapter 2.7 in the textbook for a
detailed description of the various exec() formats.
17. 1: Introduction
exec??() calls
execl(): must be given the arguments as a NULL
terminated list. It needs a valid pathname for the
executable.
execlp(): only needs the filename of the
executable but an NULL terminated argument list.
execv():must receive an array of arguments in
addition to the path of the executable
execvp():only needs the filename of the
executable and the corresponding argument list
18. 1: Introduction
exec() example:
#include <unistd.h>
main()
{
printf(“executing lsn”);
execl(“/bin/ls”, “ls”, “-l”, (char *)0);
/* if execl returns, then the call has failed.... */
perror(“execl failed to run ls”);
exit(1)
}
19. 1: Introduction
By Convention
Consider int execl ( const char *path, const char
*arg0 ... )
Here, the parameter arg0 is, by convention, the
name of the program or command without any
path-information.
Example:
execl( “/bin/ls”, “ls”, “-l”, (char *)0 );
In general, the first argument in a command-line
argument list is the name of the command or
program itself!
20. 1: Introduction
An execvp() example
#include <unistd.h>
main()
{
char * const av[] = {“myecho”,”hello”,”world”, (char
*) 0};
execvp(av[0], av);
}
where myecho is name of a program that lists all its
command line arguments.
21. 1: Introduction
ps and kill
Two of the shell commands that you will be using
when working with processes are ps and kill.
ps reports on active processes.
• -A lists all processes
• -e includes the environment
• -u username, lists all processes associated with
username
NOTE: the options may be differ on different
systems!!
What if you get a listing that is too long?
• try ps -A | grep username
22. 1: Introduction
kill and killall
In order to communicate with an executing
process from the shell, you must send a signal to
that process.
The kill command sends a signal to the process.
You can direct the signal to a particular process by
using its pid
The kill command has the following format:
kill [options] pid
-l lists all the signals you can send
-p (on some systems prints process informantion)
-signal is a signal number
23. 1: Introduction
more on kill
To terminate a process you can send the HUB
signal, which is signal number 9.
Example: kill -9 24607 will force the process
with pid 24607 to terminate.
On LINUX you have the option to use killall to kill
all processes that are running and are owned by
you.
USE THE KILL-COMMAND TO TERMINATE ANY
UNWANTED BACKGROUND PROCESSES !!!
24. 1: Introduction
Files
UNIX Input/Output operations are based on the
concept of files.
Files are an abstraction of specific I/O devices.
A very small set of system calls provide the
primitives that give direct access to I/O facilities
of the UNIX kernel.
Most I/O operations rely on the use of these
primitives.
We must remember that the basic I/O primitives
are system calls, executed by the kernel. What
does that mean to us as programmers???
25. 1: Introduction
User and System Space
Program
Code
Library Routine
fread()
read()
user code
read()
kernel code
Kernel Space
User Space
26. 1: Introduction
Different types of files
UNIX deals with two different classes of files:
Special Files
Regular Files
Regular files are just ordinary data files on disk -
something you have used all along when you studied
programming!
Special files are abstractions of devices. UNIX
deals with devices as if they were regular files.
The interface between the file system and the
device is implemented through a device driver - a
program that hides the details of the actual
device.
27. 1: Introduction
special files
UNIX distinguishes two types of special
files:
Block Special Files represent a device with
characteristics similar to a disk. The device
driver transfers chunks or blocks of data
between the operating system and the device.
Character Special Files represent devices with
characteristics similar to a keyboard. The
device is abstracted by a stream of bytes that
can only be accessed in sequential order.
28. 1: Introduction
Access Primitives
UNIX provides access to files and devices
through a (very) small set of basic system
calls (primitives)
create()
open()
close()
read()
write()
ioctl()
29. 1: Introduction
the open() call
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int open(const char *path, int flags, [mode_t mode]);
char *path: is a string that contains the fully
qualified filename of the file to be opened.
int flags: specifies the method of access i.e.
read_only, write_only read_and_write.
mode_t mode: optional parameter used to set the
access permissions upon file creation.
30. 1: Introduction
read() and write()
#include <unistd.h>
ssize_t read(int filedes, void *buffer, size_t n);
ssize_t write(int filedes, const void *buffer, size_t
n);
int filedes: file descriptor that has been obtained
though an open() or create() call.
void *buffer: pointer to an array that will hold the
data that is read or holds the data to be written.
size_t n: the number of bytes that are to be read or
written from/to the file.
31. 1: Introduction
A close() call
Although all open files are closed by the OS upon completion of
the program, it is good programming style to “clean up” after
you are done with any system resource.
Please make it a habit to close all files that you program has
used as soon as you don’t need them anymore!
#include <unistd.h>
int close(int filedes);
Remember, closing resources timely can improve system
performance and prevent deadlocks from happening (more
later)
32. 1: Introduction
A rudimentary example:
#include <fcntl.h> /* controls file attributes */
#include<unistd.h> /* defines symbolic constants */
main()
{
int fd; /* a file descriptor */
ssize_t nread; /* number of bytes read */
char buf[1024]; /* data buffer */
/* open the file “data” for reading */
fd = open(“data”, O_RDONLY);
/* read in the data */
nread = read(fd, buf, 1024);
/* close the file */
close(fd);
}
33. 1: Introduction
Buffered vs unbuffered I/O
The system can execute in user mode or kernel mode!
Memory is divided into user space and kernel space!
What happens when we write to a file?
• the write call forces a context switch to the system.
What??
• the system copies the specified number of bytes from user
space into kernel space. (into mbufs)
• the system wakes up the device driver to write these mbufs
to the physical device (if the file-system is in synchronous
mode).
• the system selects a new process to run.
• finally, control is returned to the process that executed
the write call.
Discuss the effects on the performance of your program!
34. 1: Introduction
Un-buffered I/O
Every read and write is executed by the kernel.
Hence, every read and write will cause a context
switch in order for the system routines to
execute.
Why do we suffer performance loss?
How can we reduce the loss of performance?
==> We could try to move as much data as possible
with each system call.
How can we measure the performance?
35. 1: Introduction
Buffered I/O
explicit versus implicit buffering:
explicit - collect as many bytes as you can before writing
to file and read more than a single byte at a time.
However, use the basic UNIX I/O primitives
• Careful !! Your program my behave differently on different
systems.
• Here, the programmer is explicitly controlling the buffer-
size
implicit - use the Stream facility provided by <stdio.h>
FILE *fd, fopen, fprintf, fflush, fclose, ... etc.
a FILE structure contains a buffer (in user space) that is
usually the size of the disk blocking factor (512 or 1024)
36. 1: Introduction
The fcntl() system call
The fcntl() system call provides some
control over already open files. fcntl() can
be used to execute a function on a file
descriptor.
The prototype is: int fcntl(int fd, int cmd,
.......) where
fd is the corresponding file descriptor
cmd is a “pre-defined” command (integer const)
.... are additional parameters that depend on
what cmd is.
37. 1: Introduction
The fcntl() system call
Two important commands are: F_GETFL and
F_SETFL
F_GETFL is used to instruct fcntl() to return
the current status flags
F_SETFL instructs fcntl() to reset the file
status flags
Example: int i = fcntl(fd, F_GETFL); or
int i = fcntl(fd, F_SETFL,
38. 1: Introduction
Using fcntl() to change to
non-blocking I/O
We can use the fcntl() system call to
change the blocking behavior of the read()
and write():
Example:
#include <fcntl.h>
…..
if ( fcntl(filedes, F_SETFL, O_NONBLOCK) == -
1)
perror(“fcntl”)
39. 1: Introduction
The select() call
Suppose we are dealing with a server process that
is supporting multiple clients concurrently. Each of
the client processes communicates with the server
via a pipe.
Further let us assume that the clients work
completely asynchronously, that is, they issue
requests to the server in any order.
How would you write a server that can handle this
type of scenario? DISCUSS!!
40. 1: Introduction
select() cont’d
What exactly is the problem?
If we are using the standard read() call, it will
block until data is available in the pipe.
if we start polling each of the pipes in
sequence, the server may get stuck on the first
pipe (first client), waiting for data.
other clients may, however, issued a request
that could be processed instead.
The server should be able to examine each
file descriptor associated with each pipe to
determine if data is available.
41. 1: Introduction
Using the select() call
The prototype of select() is:
int select(int nfds, fd_set *readset, fd_set *writeset,
fd_set errorset, timeval *timeout )
ndfs tells select how many file descriptors are of
interest
readset, writeset, and errorset are bit maps
(binary words) in which each bit represents a
particular file descriptor.
timeout tells select() whether to block and wait
and if waiting is required timeout explicitly
specifies how long
42. 1: Introduction
fd_set and associated
functions
Dealing with bit masks in C, C++, and UNIX makes
programs less portable.
In addition, it is difficult to deal with individual
bits.
Hence, the abstraction fd_set is available along
with macros (functions on bit masks).
Available macros are:
void FD_ZERO(fd_set *fdset) resets all the bits in
fdset to 0
void FD_SET(int fd, fd_set *fdset) set the bit
representing fd to 1
int FD_ISSET(int fd, fd_set *fdset) returns 1 if the fd
bit is set
void FD_CLR(int fd, fd_set *fdset) turn of the bit fd in
fdset
43. 1: Introduction
a short example
#include .....
int fd1, fd2;
fd_set readset;
fd1 = open (“file1”, O_READONLY);
fd2 = open (“file2”, O_READONLY);
FD_ZERO(&readset);
FD_SET(fd1, &readset);
FD_SET(fd2, &readset);
select (5, &readset, NULL, NULL, NULL)
..........
44. 1: Introduction
more select
The select() system call can be used on any file
descriptor and is particularly important for
network programming with sockets.
One important note: when select returns it
modifies the bit mask according to the state of
the file descriptors.
You should save a copy of your original bit mask if
you execute the select() multiple times.