Distributed deadlock


Published in: Technology

Abstract:

A deadlock is a condition in a system where a process cannot proceed because it needs to obtain a resource held by another process while it is itself holding a resource that the other process needs. In a system of processes which communicate only with a single central agent, deadlock can be detected easily because the central agent has complete information about every process. Deadlock detection is more difficult in systems where there is no such central agent and processes may communicate directly with one another. If we could assume that message communication is instantaneous, or if we could place certain restrictions on message delays, deadlock detection would become simpler. However, the only realistic general assumption we can make is that message delays are arbitrary but finite.

Deadlock is a fundamental problem in distributed systems. A process may request resources in any order, which may not be known a priori, and a process can request a resource while holding others. If the sequence of allocations of resources to processes is not controlled, deadlocks can occur. A deadlock is a state in which every process in a set is requesting a resource held by another process in the set.

Types of Deadlock:

Two types of deadlock can be considered:
  Communication Deadlock
  Resource Deadlock

Communication deadlock occurs when process A is trying to send a message to process B, which is trying to send a message to process C, which is trying to send a message to A. Resource deadlock occurs when processes are trying to get exclusive access to devices, files, locks, servers, or other resources. We will not differentiate between these types of deadlock, since we can consider communication channels to be resources without loss of generality.

Conditions for Deadlock:

Four conditions have to be met for a deadlock to occur in a system:

1. Mutual exclusion
A resource can be held by at most one process.

2. Hold and wait
Processes that already hold resources can wait for another resource.
3. Non-preemption
A resource, once granted, cannot be taken away.

4. Circular wait
Two or more processes form a closed chain in which each is waiting for a resource held by the next.

Resource allocation can be represented by directed graphs:
P1 ← R1 means that resource R1 is allocated to process P1.
P1 → R1 means that resource R1 is requested by process P1.

Deadlock is present when the graph has cycles. An example is shown in Figure 1.

Figure 1: Deadlock.

Wait for Graph:

The state of the system can be modeled by a directed graph, called a wait for graph (WFG). In a WFG, nodes are processes and there is a directed edge from node P1 to node P2 if P1 is blocked and is waiting for P2 to release some resource. A system is deadlocked if and only if there exists a directed cycle or knot in the WFG.

Figure 2 shows a WFG in which process P11 of site 1 has an edge to process P21 of site 1, and P32 of site 2 is waiting for a resource which is currently held by process P21. At the same time, process P32 is waiting on process P33 to release a resource. If P21 is in turn waiting on process P11, then processes P11, P32 and P21 are caught in a cycle of waits, and all four processes are involved in a deadlock, depending upon the request model.
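The cycle test on a wait for graph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the graph is a plain dictionary mapping each blocked process to the set of processes it waits on, the function name find_cycle is hypothetical, and only directed cycles are checked (knots, which matter under other request models, are not handled). The example edges are the ones named in the text above.

```python
from typing import Dict, List, Optional, Set


def find_cycle(wfg: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one directed cycle in the wait for graph, or None if there is none.

    wfg maps a blocked process to the processes it is waiting on
    (an assumed representation, chosen for this sketch).
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    nodes = set(wfg) | {q for targets in wfg.values() for q in targets}
    colour = {n: WHITE for n in nodes}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        colour[node] = GREY
        path.append(node)
        for nxt in sorted(wfg.get(node, ())):
            if colour[nxt] == GREY:
                # nxt is already on the current path: the path from nxt onward is a cycle
                return path[path.index(nxt):]
            if colour[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for n in sorted(nodes):
        if colour[n] == WHITE:
            found = visit(n)
            if found:
                return found
    return None


# Edges taken from the description of the WFG above.
wfg = {"P11": {"P21"}, "P32": {"P21", "P33"}}
print(find_cycle(wfg))   # no cycle yet

wfg["P21"] = {"P11"}     # P21 starts waiting on P11
print(find_cycle(wfg))   # a cycle through P11 and P21 is reported
```

A depth-first search is used here because it reports the processes on the cycle, which is what a detector needs in order to pick a victim.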
Figure 2: A Wait-for-Graph.

Handling Deadlocks in Distributed Systems:

Deadlocks in distributed systems are similar to deadlocks in centralized systems. In centralized systems, we have one operating system that can oversee resource allocation and know whether deadlocks are present. With distributed processes and resources it becomes harder to detect, avoid, and prevent deadlocks. Several strategies can be used to handle deadlocks:

Ignore: We can ignore the problem. This is one of the most popular solutions.
Detect: We can allow deadlocks to occur, then detect that we have a deadlock in the system, and then deal with the deadlock.
Prevent: We can place constraints on resource allocation to make deadlocks impossible.
Avoid: We can choose resource allocation carefully and make deadlocks impossible.
Deadlock avoidance is almost never used in practice (either in distributed or centralized systems). The problem with deadlock avoidance is that the algorithm needs to know resource usage requirements in advance so as to schedule them properly. Of the remaining strategies, ignoring the problem is trivially simple. The other two, detection and prevention, are described in detail below.

Deadlock Detection:

General methods for preventing or avoiding deadlocks can be difficult to find. Detecting a deadlock condition is generally easier. When a deadlock is detected, it has to be broken. This is traditionally done by killing one or more processes that contribute to the deadlock. Unfortunately, this can lead to annoyed users. When a deadlock is detected in a system that is based on atomic transactions, it is resolved by aborting one or more transactions. Because transactions are designed to withstand being aborted, the consequences of killing a process in a transactional system are less severe.

Centralized:

Centralized deadlock detection attempts to imitate the nondistributed algorithm through a central coordinator. Each machine is responsible for maintaining a resource graph for its own processes and resources. A central coordinator maintains the resource utilization graph for the entire system. This graph is the union of the individual graphs. If the coordinator detects a cycle, it kills off one process to break the deadlock. In the non-distributed case, all the information on resource usage lives on one system and the graph may be constructed on that system. In the distributed case, the individual subgraphs have to be propagated to a central coordinator. A message can be sent each time an arc is added or deleted. If optimization is needed, a list of added or deleted arcs can be sent periodically to reduce the overall number of messages sent.

Here is an example. Suppose machine A has a process P0, which holds the resource S and wants resource R, which is held by P1. The local graph on A is shown in Figure 3.
Another machine, machine B, has a process P2, which is holding resource T and wants resource S. Its local graph is also shown in Figure 3. Both of these machines send their graphs to the central coordinator, which maintains the union (Figure 3).

All is well. There are no cycles and hence no deadlocks. Now two events occur. Process P1 releases resource R and asks machine B for resource T. Two messages are sent to the coordinator:

Message 1 (from machine A): “releasing R”
Message 2 (from machine B): “waiting for T”

This should cause no problems (no deadlock). However, if message 2 arrives first, the coordinator would then construct the graph in Figure 4 and detect a deadlock that does not actually exist. Such a condition is known as false deadlock. A way to fix this is to use Lamport’s algorithm to impose a global time ordering on all machines. Alternatively, if the coordinator suspects deadlock, it can send a reliable message to every machine asking whether it has any release messages. Each machine will then respond with either a release message or a negative acknowledgement to acknowledge receipt of the message.

Figure 3: Centralized Deadlock Detection.

Figure 4: False Deadlock.
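The false deadlock in this example can be reproduced with a small simulation. The sketch below is illustrative only. It assumes the coordinator stores the union graph as a set of directed arcs, where an arc from a process to a resource means "is waiting for" and an arc from a resource to a process means "is held by". The names Coordinator, has_cycle, and INITIAL_ARCS are hypothetical and do not come from any library.

```python
from typing import Dict, Set, Tuple

Arc = Tuple[str, str]  # (from, to): process -> resource = waits for, resource -> process = held by


def has_cycle(arcs: Set[Arc]) -> bool:
    """True if the directed graph formed by arcs contains a cycle."""
    graph: Dict[str, Set[str]] = {}
    for src, dst in arcs:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    # Repeatedly strip nodes that have no outgoing arc into the remaining graph.
    # Anything that survives lies on, or leads into, a cycle.
    remaining = dict(graph)
    changed = True
    while changed:
        changed = False
        for node in list(remaining):
            if not any(target in remaining for target in remaining[node]):
                del remaining[node]
                changed = True
    return bool(remaining)


class Coordinator:
    """Keeps the union of the local graphs and checks it after every update."""

    def __init__(self, arcs: Set[Arc]) -> None:
        self.arcs = set(arcs)

    def apply(self, op: str, arc: Arc) -> bool:
        """Apply an 'add' or 'remove' message; return True if a cycle is now seen."""
        if op == "add":
            self.arcs.add(arc)
        else:
            self.arcs.discard(arc)
        return has_cycle(self.arcs)


# Union graph before the two events: P0 holds S and wants R, P1 holds R,
# P2 holds T and wants S.
INITIAL_ARCS = {("S", "P0"), ("P0", "R"), ("R", "P1"), ("T", "P2"), ("P2", "S")}
RELEASE_R = ("remove", ("R", "P1"))   # message 1 from machine A
WAIT_FOR_T = ("add", ("P1", "T"))     # message 2 from machine B

in_order = Coordinator(INITIAL_ARCS)
print(in_order.apply(*RELEASE_R), in_order.apply(*WAIT_FOR_T))

reordered = Coordinator(INITIAL_ARCS)
print(reordered.apply(*WAIT_FOR_T), reordered.apply(*RELEASE_R))
```

When the messages arrive in the order they were sent, no cycle is ever observed. When message 2 overtakes message 1, the coordinator briefly sees a cycle that disappears once the release arrives. That is why it should confirm with the machines, or rely on ordered delivery, before killing a process.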
Distributed:

An algorithm for detecting deadlocks in a distributed system was proposed by Chandy, Misra, and Haas in 1983. It allows processes to request multiple resources at once (this speeds up the growing phase of a transaction). Some processes may wait for resources (either local or remote). Cross-machine arcs make looking for cycles (detecting deadlock) hard.

The algorithm works this way: When a process has to wait for a resource, a probe message is sent to the process holding the resource. The probe message contains three components: the process that blocked, the process that is sending the message, and the destination. Initially, the first two components will be the same. When a process receives the probe: if the process itself is waiting on a resource, it updates the sending and destination fields of the message and forwards it to the resource holder. If it is waiting on multiple resources, a message is sent to each process holding the resources. This continues as long as processes are waiting for resources. If the originator gets a message and sees its own process number in the blocked field of the message, it knows that a cycle has been traversed and deadlock exists. In this case, some process (transaction) will have to die. The sender may choose to abort itself, or a ring election algorithm may be used to determine an alternate victim (e.g., youngest process, oldest process).

Deadlock prevention:

An alternative to detecting deadlocks is to design a system so that deadlock is impossible. One way of accomplishing this is to obtain a global timestamp for every transaction (so that no two transactions get the same timestamp). When one process is about to block waiting for a resource that another process is using, check which of the two processes has the younger timestamp and give priority to the older process.

If a younger process is using the resource, then the older process (that wants the resource) waits.
If an older process is holding the resource, the younger process (that wants the resource) kills itself. This forces the resource utilization graph to be directed from older to younger processes, making cycles impossible. This algorithm is known as the wait-die algorithm (Figure 5).

An alternative method by which resource request cycles may be avoided is to have an old process preempt (kill) the younger process that holds a resource. If a younger process wants a resource that an older one is using, then it waits until the old process is done. In this case, the graph flows from young to old and cycles are again impossible. This variant is called the wound-wait algorithm (Figure 6).
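The two timestamp rules can be written down as a pair of small decision functions. This is a minimal sketch under stated assumptions: a smaller timestamp means an older process, timestamps are unique (as the text requires), and the names Action, wait_die, and wound_wait are chosen for illustration only.

```python
from enum import Enum


class Action(Enum):
    WAIT = "requester waits"
    DIE = "requester kills itself"
    WOUND = "holder is preempted"


def wait_die(requester_ts: int, holder_ts: int) -> Action:
    """Wait-die: an older requester waits, a younger requester dies."""
    return Action.WAIT if requester_ts < holder_ts else Action.DIE


def wound_wait(requester_ts: int, holder_ts: int) -> Action:
    """Wound-wait: an older requester preempts the holder, a younger requester waits."""
    return Action.WOUND if requester_ts < holder_ts else Action.WAIT


# Old process (timestamp 10) and young process (timestamp 20).
print(wait_die(10, 20))    # old requests from young
print(wait_die(20, 10))    # young requests from old
print(wound_wait(10, 20))  # old requests from young
print(wound_wait(20, 10))  # young requests from old
```

In both schemes a process only ever waits in one direction of age (old for young in wait-die, young for old in wound-wait), which is what rules out a cycle of waits.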
Figure 5: Wait-die Algorithm.

Figure 6: Wound-wait Algorithm.
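Returning to the probe-based detection scheme described under "Distributed", the message flow can be sketched as a single-program simulation. This is a simplification and not the published algorithm in full. Message passing is replaced by a local queue. The waits_for dictionary (blocked process to the processes holding what it wants) and the function name detects_deadlock are assumptions made for illustration. Each process forwards a given initiator's probe at most once so that the simulation terminates.

```python
from collections import deque
from typing import Deque, Dict, Set, Tuple

Probe = Tuple[str, str, str]  # (process that blocked, sender, destination)


def detects_deadlock(initiator: str, waits_for: Dict[str, Set[str]]) -> bool:
    """Simulate probes started by a blocked initiator.

    Returns True if a probe comes back to the initiator, i.e. the initiator
    lies on a cycle of waiting processes.
    """
    # Initially the blocked process and the sender are the same.
    queue: Deque[Probe] = deque(
        (initiator, initiator, holder)
        for holder in sorted(waits_for.get(initiator, ()))
    )
    forwarded: Set[str] = set()  # processes that already forwarded this probe

    while queue:
        blocked, sender, dest = queue.popleft()
        if dest == blocked:
            return True          # the originator sees its own number: deadlock
        if dest in forwarded:
            continue             # avoid circulating forever in someone else's cycle
        forwarded.add(dest)
        # A waiting receiver updates sender and destination and forwards the
        # probe to every process holding a resource it wants. A receiver that
        # is not waiting has no entry in waits_for and simply drops the probe.
        for holder in sorted(waits_for.get(dest, ())):
            queue.append((blocked, dest, holder))
    return False


print(detects_deadlock("P0", {"P0": {"P1"}, "P1": {"P2"}, "P2": {"P0"}}))
print(detects_deadlock("P0", {"P0": {"P1"}, "P1": {"P2"}}))
```

The first call models three processes waiting on one another in a ring, so the probe returns to P0. In the second call P2 is not waiting on anything, so the probe dies out and no deadlock is reported.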
Conclusion:

Deadlock is the state of permanent blocking of a set of processes, each of which is waiting for an event that only another process in the set can cause. Handling deadlocks in distributed systems is more complex than in centralized systems because the resources, the processes, and other relevant information are scattered across different nodes of the system.