Advanced Operating Systems (Distributed Systems)
M.C.A. V Semester
1. Differentiate between tightly coupled and loosely coupled systems
Computer architectures consisting of multiple interconnected processors are basically of
two types:
In tightly coupled systems, there is a single system wide primary memory (address
space) that is shared by all the processors (Fig. 1.1). If any processor writes, for example,
the value 100 to the memory location x, any other processor subsequently reading from
location x will get the value 100. Therefore, in these systems, any communication between
the processors usually takes place through the shared memory.
In loosely coupled systems, the processors do not share memory, and each processor
has its own local memory (Fig. 1.2). If a processor writes the value 100 to the memory
location x, this write operation will only change the contents of its local memory and will not
affect the contents of the memory of any other processor. Hence, if another processor reads
the memory location x, it will get whatever value was there before in that location of its own
local memory. In these systems, all physical communication between the processors is done
by passing messages across the network that interconnects the processors.
Usually, tightly coupled systems are referred to as parallel processing systems, and loosely
coupled systems are referred to as distributed computing systems, or simply distributed
systems. In contrast to the tightly coupled systems, the processors of distributed computing
systems can be located far from each other to cover a wider geographical area.
Furthermore, in tightly coupled systems, the number of processors that can be usefully
deployed is usually small and limited by the bandwidth of the shared memory. This is not
the case with distributed computing systems that are more freely expandable and can have
an almost unlimited number of processors.
Tightly Coupled Multiprocessor Systems
Loosely Coupled Multiprocessor Systems
Hence, a distributed computing system is basically a collection of processors
interconnected by a communication network in which each processor has its own local
memory and other peripherals, and the communication between any two processors of the
system takes place by message passing over the communication network. For a particular
processor, its own resources are local, whereas the other processors and their resources are
remote. Together, a processor and its resources are usually referred to as a node or site or
machine of the distributed computing system.
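The distinction above can be illustrated with a minimal sketch, assuming nodes modeled as plain Python objects with a private dictionary as "local memory" and a queue as the "network" mailbox (all names here are illustrative, not part of any real system):

```python
import queue

# Each "node" has its own local memory (a private dict) and a mailbox.
# Nodes never touch each other's memory; they communicate only by
# sending messages through their mailboxes (the "network").

class Node:
    def __init__(self, name):
        self.name = name
        self.local_memory = {}          # private to this node
        self.mailbox = queue.Queue()    # incoming messages

    def send(self, other, key, value):
        other.mailbox.put((key, value))  # message passing, not shared memory

    def receive(self):
        key, value = self.mailbox.get()
        self.local_memory[key] = value   # update only the local copy

a, b = Node("A"), Node("B")
a.local_memory["x"] = 100       # write affects A's local memory only
print(b.local_memory.get("x"))  # None: B's memory is unaffected

a.send(b, "x", 100)             # explicit message over the "network"
b.receive()
print(b.local_memory["x"])      # 100: the value arrived via message passing
```

In a tightly coupled system the first `print` would already show 100, because both processors would read the same physical location x.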
2. Describe buffering. What are the four types of buffering strategies?
The transmission of messages from one process to another can be done by copying the
body of the message from the sender's address space to the receiver's address space. In
some cases, the receiving process may not be ready to receive the message but may want
the operating system to save the message for later reception. In such cases, the operating
system relies on the receiver's buffer space, in which transmitted messages can be stored
until the receiving process executes specific code to receive them.
The synchronous and asynchronous modes of communication correspond to the two
extremes of buffering: a null buffer (no buffering) and a buffer of unbounded capacity.
Two other commonly used buffering strategies are the single-message and the
finite-bound (multiple-message) buffers. These four types of buffering strategies are
described below:
No buffering: In this case, the message remains in the sender's address space until the
receiver executes the corresponding receive operation.
Single-message buffer: A buffer that holds a single message is used at the receiver's
side. It is suitable for implementing synchronous communication, because in this case
an application can have only one outstanding message at any given time.
Unbounded-capacity buffer: Convenient for supporting asynchronous communication.
However, a truly unbounded buffer is impossible to implement, since memory is finite.
Finite-bound buffer: Used in practice for supporting asynchronous communication;
because the buffer can fill up, the system must handle overflow, for example by
indicating an unsuccessful communication to the sender or by flow control.
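The last three strategies map naturally onto bounded and unbounded queues. A minimal sketch using Python's standard `queue` module (the capacities chosen here are arbitrary examples):

```python
import queue

# Three of the four strategies expressed as queue capacities.
unbounded = queue.Queue()            # unbounded-capacity buffer (asynchronous)
single    = queue.Queue(maxsize=1)   # single-message buffer (synchronous)
finite    = queue.Queue(maxsize=8)   # finite-bound (multiple-message) buffer

for q in (unbounded, single, finite):
    q.put("hello")                   # sender deposits a message

# A full bounded buffer forces a policy decision: block, drop, or report failure.
try:
    single.put("world", block=False)
except queue.Full:
    print("single-message buffer full: sender must wait or the message is dropped")

# "No buffering" has no queue at all: the message stays in the sender's
# address space until the receiver is ready (a rendezvous), so it cannot
# be modeled by any Queue capacity here.
```

The `queue.Full` exception corresponds to the overflow handling that a finite-bound strategy must define.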
3. Define DSM. Discuss any four design and implementation issues of DSM.
DSM (Distributed Shared Memory), also called DSVM (Distributed Shared Virtual
Memory), is a software layer implemented on top of the message-passing system of a
loosely coupled distributed-memory system to provide a shared-memory abstraction to
programmers. The software layer can be implemented in the OS kernel or in runtime
library routines with proper kernel support. It is an abstraction that integrates the local
memories of different machines in a network environment into a single logical entity
shared by cooperating processes executing on multiple sites. The shared memory exists
only virtually.
DSM Systems: A comparison with message passing and tightly coupled
multiprocessor systems
DSM provides a simpler abstraction than the message passing model. It relieves the
programmer of the burden of explicitly using communication primitives in programs.
In message passing systems, passing complex data structures between two different
processes is difficult. Moreover, passing data structures containing pointers is generally
expensive in message passing model.
Distributed Shared Memory takes advantage of the locality of reference exhibited by
programs and improves efficiency.
Distributed Shared Memory systems are cheaper to build than tightly coupled
multiprocessor systems.
The large physical memory available facilitates running programs requiring large
memory efficiently.
DSM can scale well when compared to tightly coupled multiprocessor systems.
A message passing system allows processes to communicate while being protected from
one another by private address spaces, whereas in DSM one process can cause another
to fail by erroneously altering shared data.
When message passing is used between heterogeneous computers, marshaling of data
takes care of differences in data representation; it is unclear how memory can be shared
between computers with different integer representations.
DSM can be made persistent, i.e., processes communicating via DSM need not have
overlapping lifetimes: a process can leave information at an agreed location for another
process that runs later. Processes communicating via message passing, in contrast,
must execute at the same time.
Which is better? Message passing or Distributed Shared Memory? Distributed Shared
Memory appears to be a promising tool if it can be implemented efficiently.
Distributed Shared Memory Architecture
As shown in the above figure, DSM provides a virtual address space shared among
processes on loosely coupled processors. DSM is basically an abstraction that integrates
the local memories of different machines in a network environment into a single logical
entity shared by cooperating processes executing on multiple sites. The shared memory
itself exists only virtually. Application programs can use it in the same way as
traditional virtual memory, except that processes using it can run on different machines
in parallel.
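The abstraction can be sketched as a toy DSM layer (all class and method names here are invented for illustration): one "owner" node holds the authoritative copy of each shared page, and each node keeps a local cache that is filled over simulated message passing on a miss.

```python
class OwnerNode:
    """Holds the authoritative copy of every shared page."""
    def __init__(self):
        self.pages = {}

    def handle(self, msg):
        op, page, value = msg
        if op == "read":
            return self.pages.get(page, 0)
        self.pages[page] = value        # "write" message
        return None

class DSMNode:
    """A node using the shared space; cache misses turn into messages."""
    def __init__(self, owner):
        self.owner = owner
        self.cache = {}

    def read(self, page):
        if page not in self.cache:      # block fault: fetch over the "network"
            self.cache[page] = self.owner.handle(("read", page, None))
        return self.cache[page]

    def write(self, page, value):
        self.cache[page] = value
        self.owner.handle(("write", page, value))   # write-through for simplicity

owner = OwnerNode()
n1, n2 = DSMNode(owner), DSMNode(owner)
n1.write("x", 100)
print(n2.read("x"))   # 100: n2 sees n1's write through the DSM layer
```

Note that once n2 has cached page "x", a later write by n1 would leave n2's copy stale; keeping such copies consistent is exactly the memory coherence problem discussed below.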
DSM – Design and Implementation Issues
The important issues involved in the design and implementation of DSM systems are as
follows:
Granularity: This refers to the block size of the DSM system, i.e., the unit of sharing and
the unit of data transfer across the network when a block fault occurs. Possible units are
a few words, a page, or a few pages.
Structure of Shared Memory Space: The structure refers to the layout of the shared
data in memory. It depends on the type of applications that the DSM system is
intended to support.
Memory coherence and access synchronization: Coherence (consistency) refers to
memory coherence problem that deals with the consistency of shared data that lies in the
main memory of two or more nodes. Synchronization refers to synchronization of
concurrent access to shared data using synchronization primitives such as semaphores.
Data Location and Access: A DSM system must implement mechanisms to locate data
blocks in order to service the network data block faults to meet the requirements of the
memory coherence semantics being used.
Block Replacement Policy: If the local memory of a node is full, a cache miss at that
node implies not only a fetch of the accessed data block from a remote node but also a
replacement, i.e., a data block of the local memory must be replaced by the new data
block. Therefore, a block replacement policy is also necessary in the design of a DSM
system.
Thrashing: In a DSM system, data blocks migrate between nodes on demand. If two nodes
compete for write access to a single data item, the corresponding data block may be
transferred back and forth at such a high rate that no real work can get done. A DSM
system must use a policy to avoid this situation (known as Thrashing).
Heterogeneity: DSM systems built for homogeneous systems need not address
the heterogeneity issue. However, if the underlying system environment is
heterogeneous, the DSM system must be designed to handle heterogeneity so that
it functions properly with machines having different architectures.
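The block replacement issue above can be sketched with a small local memory that evicts the least recently used block on a fault (the LRU policy and the capacity of two blocks are assumptions for illustration; real DSM systems may use other policies):

```python
from collections import OrderedDict

class LocalMemory:
    """A node's local DSM memory with a fixed capacity and LRU replacement."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()     # ordering tracks recency of use

    def access(self, block_id, fetch_remote):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)    # hit: mark most recently used
            return self.blocks[block_id]
        if len(self.blocks) >= self.capacity:
            victim, _ = self.blocks.popitem(last=False)  # evict the LRU block
            print(f"replacing block {victim}")
        data = fetch_remote(block_id)    # service the block fault remotely
        self.blocks[block_id] = data
        return data

mem = LocalMemory(capacity=2)
mem.access("a", lambda b: f"data-{b}")
mem.access("b", lambda b: f"data-{b}")
mem.access("c", lambda b: f"data-{b}")   # memory full: block "a" is replaced
print("a" in mem.blocks)                 # False
```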
4. Discuss any five features of a good global scheduling algorithm
i) No a priori knowledge about the processes: A good process scheduling algorithm
should operate with absolutely no a priori knowledge about the processes.
ii) Dynamic in Nature: It is intended that a good process-scheduling algorithm
should be able to take care of the dynamically changing load at various nodes.
The process assignment decisions should be based on the current load of the
system and not on some fixed static policy.
iii) Quick Decision Making: A good process scheduling algorithm must be
capable of taking quick decisions regarding node assignment for processes.
iv) Scheduling overhead: The general observation is that as overhead is
increased in an attempt to obtain more information regarding the global state
of the system, the usefulness of the information is decreased due to both the
aging of the information gathered and the low scheduling frequency as a result
of the cost of gathering and processing that information. Hence algorithms that
provide near optimal system performance with a minimum of global state
information gathering overhead are desirable.
v) Stability: The algorithm should be stable, i.e., the system should not enter a state in
which nodes spend all their time migrating processes or exchanging control messages
without doing any useful work.
vi) Scalable: The algorithm should be scalable i.e. the system should be able to handle
small and large networked systems. A simple approach to make an algorithm scalable is to
probe only m of N nodes for selecting a host. The value of m can be dynamically adjusted
depending on the value of N.
vii) Fault Tolerance: The algorithm should not be affected by the crash of one or more
nodes in the system. At any instant, it should continue functioning for the nodes that
are up at that time. Algorithms that have decentralized decision-making capability and
consider only available nodes in their decision making have better fault tolerance.
viii) Fairness of service: How fairly a service is allocated is a common concern. For
example, two users simultaneously initiating equivalent processes should receive the same
quality of service. What is desirable is a fair strategy that improves the response time
of heavily loaded nodes without unduly affecting that of lightly loaded nodes. For this,
the concept of load balancing has to be replaced by load sharing, i.e., a node will share
some of its resources as long as its users are not significantly affected.
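The "probe only m of N nodes" idea from the scalability feature can be sketched as follows (the load values and the least-loaded selection rule are illustrative assumptions):

```python
import random

def pick_host(loads, m, rng=random):
    """Probe a random sample of m nodes and pick the least loaded one.

    The decision cost depends on m, not on the total number of nodes N,
    which is what makes the approach scale.
    """
    probed = rng.sample(list(loads), k=min(m, len(loads)))
    return min(probed, key=lambda node: loads[node])

# N = 1000 nodes with made-up load values; only m = 5 are probed.
loads = {f"node{i}": random.randint(0, 100) for i in range(1000)}
host = pick_host(loads, m=5)
print(host, loads[host])
```

The value of m can then be adjusted dynamically depending on N, as the scalability feature suggests.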
5. What is replication? Discuss the three replication approaches in DFS
The main approach to improving the performance and fault tolerance of a DFS is to replicate
its content. A replicating DFS maintains multiple copies of files on different servers. This can
prevent data loss, protect a system against down time of a single server, and distribute the
overall workload.
There are three approaches to replication in a DFS:
1. Explicit replication: The client explicitly writes files to multiple servers. This approach
requires explicit support from the client and does not provide transparency.
2. Lazy file replication: The server automatically copies files to other servers after the
files are written. The remote replicas are brought up to date only when the updated files
are propagated to the other servers; how often this happens is up to the implementation
and affects the consistency of the file state.
3. Group file replication: write requests are simultaneously sent to a group of servers.
This keeps all the replicas up to date, and allows clients to read consistent file state from
any replica.
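Group file replication can be sketched as follows, with in-memory objects standing in for file servers (the class and function names are invented for illustration):

```python
class FileServer:
    """A stand-in for one replica server, holding files in memory."""
    def __init__(self):
        self.files = {}

    def write(self, name, data):
        self.files[name] = data

    def read(self, name):
        return self.files[name]

def group_write(servers, name, data):
    """Send the same write request to every server in the group."""
    for server in servers:
        server.write(name, data)

group = [FileServer() for _ in range(3)]
group_write(group, "report.txt", b"v1")

# Every replica is up to date, so a client may read from any of them.
print(all(s.read("report.txt") == b"v1" for s in group))   # True
```

A real group-replication protocol must also handle servers that miss a write (for example, via multicast with acknowledgements), which this sketch omits.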
6. List and explain the desirable features of a good naming system
A good naming system for a distributed system should have the following features:
i) Location transparency
The name of an object should not reveal any hint about the physical location of the object.
ii) Location independency
The name of an object should not need to change when the object's location changes.
Thus, the same object name can be mapped to a different physical location at different
points in time.
iii) Scalability
The naming system should be able to handle the dynamically changing scale of a
distributed system.
iv) Uniform naming convention
The naming system should use the same naming conventions for all types of objects in
the system.
v) Multiple user-defined names for the same object
Naming system should provide the flexibility to assign multiple user-defined names for the
same object.
vi) Grouping name
Naming system should allow many different objects to be identified by the same name.
vii) Meaningful names
A naming system should support at least two levels of object identifiers, one convenient
for human users and the other convenient for machines.
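Location independency (feature ii) can be sketched with a simple name-to-location mapping (the object names and host names below are made up): the name stays stable while only the mapping behind it changes.

```python
# A toy name service: a stable name maps to the object's current location.
name_service = {"/docs/report": ("hostA", "/disk1/report")}

def resolve(name):
    """Look up the current physical location of a named object."""
    return name_service[name]

print(resolve("/docs/report"))          # ('hostA', '/disk1/report')

# The object migrates to another host; its name is unchanged,
# only the mapping maintained by the name service is updated.
name_service["/docs/report"] = ("hostB", "/disk7/report")
print(resolve("/docs/report"))          # ('hostB', '/disk7/report')
```

Because clients always use the name `/docs/report`, the migration is invisible to them, which is exactly what location independency requires.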