Prisoner’s Dilemma in the Software Industry


game theory, prisoner’s dilemma, non-zero-sum game, software industry, contract compliance, contract behaviour, NCR, BCP, data warehouse, cooperation, strategy, business strategy, last-round strategy

  1. Game Theory: “Prisoner’s Dilemma in the Software Industry” by Peter Louis and Nikolas Kelaiditis, November 2001. © 2001 Peter Louis & Nikolas Kelaiditis. All Rights Reserved.
  2. Abstract: This paper shows that the Prisoner’s Dilemma problem exists in real business situations. The example is drawn from the software industry: National Cash Register (NCR) developing software for BCP Telecomunicações, a BellSouth company, in Sao Paulo. It will be demonstrated that both NCR and BCP have an incentive to defect in order to maximize their benefit in a single-phase game. It will become apparent, however, that this behavior, as in the classical Prisoner’s Dilemma, leads to a Nash Equilibrium that is suboptimal for both parties, and that it is in the best interest of NCR and BCP to cooperate, allowing them to achieve a mutually beneficial outcome. It will then be demonstrated that a multi-phased project is one way of addressing this problem, that is, of reducing the incentive to behave uncooperatively to a minimum. Each project phase corresponds to a round in a Prisoner’s Dilemma; optimizing the outcome over all phases together provides the incentive to continue the iterated game, with both parties using the “Tit For Tat” strategy to reach an optimal solution.
     Game Theory: We live in a society where people do not always act in the right, moral way but sometimes look after themselves and their own interests first. In some situations, people seek to maximize their benefit at the expense of others, so there is little incentive to cooperate to improve the outcome for both parties. Nevertheless, cooperation does occur and is a feature of our society. How, then, does cooperation develop when each individual has an incentive to be selfish and not cooperate?
     In game theory, cooperation is usually analyzed by way of a non-zero-sum game called the “Prisoner’s Dilemma” [1]. A "zero-sum" game is a win-lose game such as tic-tac-toe: for every winner, there is a loser. In contrast, non-zero-sum games allow for cooperation.
     [1] Axelrod 1984
  3. In a Prisoner’s Dilemma, each of the two players chooses between two moves, "cooperate" or "defect". The idea is the following: each player gains when both cooperate, but if only one of them cooperates, the other one (who defects) gains more. If both defect, both lose (or gain very little), but not as much as the "cheated" cooperator whose cooperation is not returned.
     The classical Prisoner’s Dilemma describes a hypothetical situation in which two criminals are arrested on suspicion of having committed a crime together. However, the police do not have sufficient proof to convict them. The two prisoners are isolated from each other, and the police visit each of them with the following offer: the one who gives evidence against the other will be freed. If neither of them accepts the offer, i.e. they cooperate with each other against the police, both will get only a small sentence because the police have insufficient proof. Both would gain. However, if one of them betrays the other by confessing to the police, the defector will gain more, since he will be set free; the one who remains silent, on the other hand, will receive the full sentence, as he did not assist the police and there is now adequate proof. If both betray, both will be sentenced, but less severely than a prisoner who stays silent while the other confesses.
     The dilemma arises because each prisoner has a choice between only two options but cannot make a good decision without knowing what the other intends to do. Such a distribution of gains and losses appears naturally in many situations, since the cooperator whose action is not reciprocated loses resources to the defector, without either one being able to collect the additional gain coming from the synergy of their cooperation. In a non-zero-sum game, the players’ interests are not completely opposed; rather, the possibility of achieving a mutually beneficial outcome exists.
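The payoff structure just described can be checked against the sentences listed in Exhibit 1. The following is only a minimal sketch: the labels temptation, reward, punishment and sucker are conventional names introduced here for illustration and do not appear in the paper, and the function names are hypothetical.

```python
# Sentences per outcome, taken from Exhibit 1 (a higher number is a stricter sentence).
# Keys are (prisoner 1 move, prisoner 2 move); values are (sentence 1, sentence 2).
SENTENCES = {
    ("confess", "confess"): (5, 5),
    ("confess", "silent"): (0, 10),
    ("silent", "confess"): (10, 0),
    ("silent", "silent"): (2, 2),
}


def ordering_holds(sentences):
    """True if the lone defector does best and the lone cooperator does worst."""
    temptation = sentences[("confess", "silent")][0]   # I confess, the other stays silent
    reward = sentences[("silent", "silent")][0]        # both stay silent
    punishment = sentences[("confess", "confess")][0]  # both confess
    sucker = sentences[("silent", "confess")][0]       # I stay silent, the other confesses
    # Sentences are costs, so "better" means "smaller".
    return temptation < reward < punishment < sucker


def is_constant_sum(sentences):
    """True if the combined sentence is identical in every outcome (a zero-sum-like game)."""
    totals = {first + second for first, second in sentences.values()}
    return len(totals) == 1


print(ordering_holds(SENTENCES))   # True
print(is_constant_sum(SENTENCES))  # False
```

The second check illustrates why the game is non-zero-sum: the combined sentence is smaller when both stay silent than in any other outcome, so there is a joint gain to be had from cooperating.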
     The problem with the Prisoner’s Dilemma is that if both decision-makers were entirely rational, they would not cooperate. Rational decision-making here means making the decision that is optimal for you regardless of what the other actor decides. Assume the other one defects: then it is rational to defect as well, because you will not gain anything, but if you do not defect, you will end up with a loss. Suppose the other one cooperates: then you will gain, but not as much as you would if you decided not to cooperate, so here too the rational choice is to defect.
  4. The problem is that if both actors are rational, both will defect and neither will gain anything, ending up in a so-called Nash Equilibrium [2]. Nevertheless, if both "irrationally" decided to cooperate, both would gain. This paradox can be formulated more explicitly through the principle of suboptimization: optimizing the outcome for a subsystem will generally not optimize the outcome for the system as a whole. In the classical Prisoner’s Dilemma, the suboptimization for each prisoner separately is to betray the other one, but this leads to both of them being sentenced rather severely, while they might have got away with a mild sentence if they had stayed silent.
     By playing a Prisoner’s Dilemma repeatedly, each party is given the possibility to adapt its decision to that of its counterpart and thus reach a better solution, as the incentive for defecting is reduced. For instance, if we could play the classical Prisoner’s Dilemma a few times, we would try to figure out our counterpart’s strategy and use it to minimize our own total jail time. We could pursue several different strategies:
     • The Golden Rule - "Do unto others as you would have them do unto you." Always cooperate (don't confess). This rule would theoretically lead to the optimal outcome, but if our counterpart acts rationally, he will always defect, maximizing our jail time.
     • Tit For Tat - "Do unto others as they do unto you." Begin with a defection (confess), for the reasons described above, and after that do whatever our counterpart did last. In many real-life situations (e.g. business), however, starting by not confessing would be preferable.
     • Tit For Tat 3 - Almost the same as the Tit For Tat rule, except that we are a little more forgiving. If the counterpart defects (confesses), we forgive him about once every three times and cooperate the next time anyway. As with the Golden Rule, the counterpart may exploit this strategy to our disadvantage.
     • The Iron Rule - "Do unto others as you wish, before they do it unto you." Always defect. Both parties tend to accumulate a large prison sentence.
     • The Random Rule - Randomly choose "confess" or "don't confess." This strategy is not likely to lead to an optimal outcome, as it does not follow a pattern and thus does not allow a joint strategy.
     [2] If there is a set of strategies with the property that no player can benefit by changing her strategy while the other players keep their strategies unchanged, then that set of strategies and the corresponding payoffs constitute a Nash Equilibrium.
  5. The Prisoner's Dilemma in Business
     Software development is an area where a Prisoner’s Dilemma relationship can exist, particularly when one party is more dominant in the relationship than the other. For example, consider a data warehouse project in which NCR is looking to create the first major data warehouse in Brazil with the major mobile operator in Sao Paulo, BCP. The success of this project is critical to NCR’s fortunes in Brazil, and NCR can use it as an example when campaigning for other projects in the industry or in other sectors.
     NCR intends to develop the data warehouse software for BCP in exchange for the payment agreed in the contract. If, by the end of the contract, NCR delivers substandard work, something that does not work according to specifications, or late deliverables, and BCP pays, then NCR has received its payment whilst BCP has not received the deliverables as agreed. BCP has been short-changed.
     On the other hand, if NCR meets its contract requirements and produces its deliverables on time and BCP refuses to pay, then BCP has received the agreed deliverables as per contract whilst NCR has received either nothing for its efforts or less than agreed. Defection by BCP might also include requests for additional functionality at the same price, or ambiguous interpretations of deliverables that differ from NCR’s, insisting on rework, etc. In essence, the outcomes have been reversed.
  6. However, if both NCR and BCP perform cooperatively, with NCR delivering what it has agreed on time and BCP paying upon receipt of the deliverables, then both parties will benefit from the relationship. The extent of the benefit, however, is less than what either party could obtain by taking advantage of the other. For example, it requires more effort and money for NCR to produce the agreed deliverables than simply to produce poor or non-functional work for the same amount of money from BCP. If both NCR and BCP break their agreement, both lose: NCR does not gain its foothold in Brazil, while BCP merely keeps its money.
     If honour were a guarantee of contract conformity, one could expect developers to produce all that was agreed, and on time. Similarly, companies such as BCP would pay the full contract price upon receipt of the deliverables. Unfortunately, this is not the case, and the solution that NCR and BCP have employed to ensure the conformity of the other party and to minimize their risk is the multi-phase software contract. Here, deliverables are specified for each critical stage of the project, and payments are made upon the completion of phases. This turns a one-turn Prisoner’s Dilemma into a multi-turn game whilst reducing the risk to each party at each phase.
     At each phase, each party can determine whether the other party has fulfilled its part of the contract, that is, whether there has been cooperation. For example, BCP can inspect load scripts and/or the contents of the data warehouse to ensure compliance before deciding whether to pay. Similarly, NCR can ensure the receipt of payment before deciding whether to commence the next phase.
     A well-drafted contract also includes provisions that protect the developer by restricting deliverables to what has been agreed and clearly specified. This reduces the client’s ability to defect by withholding acceptance.
     These provisions are usually required to enforce customer acceptance of the contract on the last round. Without such a provision, a client could hold out for the inclusion of modifications or “extras” not previously thought of, at the developer’s cost. Even with such a provision, where the number of rounds is known, a client might choose to defect on the last round, hoping to keep the balance of the money, if the client believes that it will not have to deal with the developer again.
  7. However, a well-drafted contract will limit this occurrence. As with NCR, it is usually not in the interest of developers to defect, and defection is usually due to negligence, not malicious intent. NCR’s prospects in Brazil are determined in part by the success of the project, and for this reason NCR has ensured that experienced team members were included. BCP, by contrast, has less need to play by the rules and can easily find another developer willing to provide a data warehouse for such an important client.
     The most effective strategy for NCR to employ in an iterated Prisoner’s Dilemma is a simple “Tit For Tat” one: offering cooperation on the first move and then echoing (with cooperation or defection) whatever BCP did on its last move. Here cooperation is rewarded with cooperation whilst defection is immediately punished with defection. However, NCR, like most developers who are looking to establish a position in a market or who are in an unbalanced relationship, does not apply this strategy. This can be due to the need to keep the account happy or the desire to make additional sales after the contract has been completed. This approach only encourages BCP to defect.
     Given this situation, NCR should be prepared to suspend work on a subsequent phase if BCP defects on the current phase. This, however, should only be done once a review has been performed to ensure that NCR was not at fault and that no misunderstanding occurred that could be resolved without NCR’s defection. If NCR complied with the agreement on the current phase, then moving to the next phase in the presence of BCP’s defection is a sucker’s move.
     The IT project runs for a known number of phases (so the associated Prisoner’s Dilemma has a known number of iterations), and preventing BCP’s defection on the last round is a difficult problem.
     The uncertainties at the start of the project are almost gone, as the phases have been completed and the minor issues resolved. BCP has an almost complete working data warehouse but can use the issue of minor bugs to withhold payment. The benefit to BCP of retaining the final payment is greater than the benefit of maintaining a good relationship with NCR and the prospect of future business.
  8. Some strategies that NCR can use on the last round to encourage compliance by BCP are:
     • Use a software activation key to allow full functionality of the product. This must be clearly stated in the contract, its implications should be made clear to BCP, and the correct legal advice should be sought.
     • Add a clause to the contract that transfers ownership of the development only once the final payment has been made.
     • Withhold extras, such as source code or link libraries, until final payment has been received. These should be items that do not inhibit user acceptance testing or product use, only new development.
     • Define acceptance narrowly enough to almost guarantee acceptance, so that non-payment by BCP will almost certainly lead to legal action with the result in NCR’s favour.
     • Include additional services or a maintenance phase that is crucial to the effective operation of the warehouse, so that it is in BCP’s best interest to make the last payment.
     As has been demonstrated, the Prisoner’s Dilemma problem in the IT industry can be reduced to a series of iterated games in order to reduce the risk of defection and achieve a better total outcome. Nonetheless, in the last game there remains an incentive to defect, which can be mitigated by employing one or more of the last-round strategies described above. The solution that has been outlined is likely to be equally applicable to other industries where similar conditions or relationships occur.
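The multi-phase argument, including the remaining last-round incentive, can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a model from the paper: it reuses the Exhibit 2 utilities as the payoff of every phase, treats both parties as moving simultaneously in each phase, and assumes a hypothetical four-phase project. All function names are illustrative.

```python
# Per-phase payoffs from Exhibit 2 as (NCR utility, BCP utility); higher is better.
# "C" = cooperate (good code / pay), "D" = defect (bad code / don't pay).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 1),
    ("D", "D"): (1, 2),
}


def tit_for_tat(own_history, other_history, phase, total_phases):
    """Cooperate on the first move, then echo the other party's previous move."""
    return "C" if not other_history else other_history[-1]


def always_cooperate(own_history, other_history, phase, total_phases):
    """Keep the account happy whatever the other party does."""
    return "C"


def always_defect(own_history, other_history, phase, total_phases):
    """Defect in every phase."""
    return "D"


def defect_on_last_phase(own_history, other_history, phase, total_phases):
    """Hypothetical client that cooperates until the final phase, then defects."""
    return "D" if phase == total_phases - 1 else "C"


def play(ncr_strategy, bcp_strategy, total_phases):
    """Play the phases in order and return (NCR total, BCP total)."""
    ncr_moves, bcp_moves = [], []
    ncr_total = bcp_total = 0
    for phase in range(total_phases):
        # Both moves are chosen before either is recorded (simultaneous play).
        ncr = ncr_strategy(ncr_moves, bcp_moves, phase, total_phases)
        bcp = bcp_strategy(bcp_moves, ncr_moves, phase, total_phases)
        ncr_gain, bcp_gain = PAYOFF[(ncr, bcp)]
        ncr_total += ncr_gain
        bcp_total += bcp_gain
        ncr_moves.append(ncr)
        bcp_moves.append(bcp)
    return ncr_total, bcp_total


print(play(tit_for_tat, tit_for_tat, 4))           # (12, 12)
print(play(always_cooperate, always_defect, 4))    # (0, 20)
print(play(tit_for_tat, always_defect, 4))         # (3, 11)
print(play(tit_for_tat, defect_on_last_phase, 4))  # (9, 14)
```

Under these assumptions, echoing BCP's moves limits what a persistently defecting client can extract (11 instead of 20), but a defection in the final phase cannot be punished afterwards: BCP ends with 14 rather than the 12 it would earn by cooperating throughout. That gap is what the last-round contract provisions are meant to close.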
  9. Exhibit 1: Classical Prisoner’s Dilemma
     (each cell lists Prisoner 1’s sentence, then Prisoner 2’s sentence)

                                     Prisoner 2
                                     Confess     Don’t Confess
     Prisoner 1    Confess           5, 5        0, 10
                   Don’t Confess     10, 0       2, 2

     Confess / Confess constitutes a Nash Equilibrium. (The higher the numbers in this matrix, the stricter the sentence, i.e. the players will prefer a lower number to a higher one.)

     Exhibit 2: NCR vs. BCP
     (each cell lists NCR’s utility, then BCP’s utility)

                                     BCP
                                     Pay         Don’t Pay
     NCR           Good Code         3, 3        0, 5
                   Bad Code          5, 1        1, 2

     Bad Code / Don’t Pay constitutes a Nash Equilibrium. (The higher the numbers in this matrix, the higher the utility, i.e. here the players will prefer a higher number to a lower one. Furthermore, it is assumed that bad code will have some utility for BCP, and that delivering bad code is dominant over delivering good code due to lower development cost.)
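The equilibria stated for the two exhibits can be verified mechanically. The sketch below applies the Nash Equilibrium definition from footnote 2 to every cell of a two-player matrix; the function name and the `higher_is_better` flag are illustrative choices, not part of the paper.

```python
from itertools import product


def pure_nash_equilibria(payoffs, row_moves, col_moves, higher_is_better=True):
    """Return all pure-strategy Nash equilibria of a two-player game.

    payoffs maps (row_move, col_move) to (row_value, col_value).
    """
    # Sentences are costs, so flip the sign to compare them like utilities.
    sign = 1 if higher_is_better else -1

    def utility(row_move, col_move, player):
        return sign * payoffs[(row_move, col_move)][player]

    equilibria = []
    for row_move, col_move in product(row_moves, col_moves):
        # Neither player may gain by changing only their own move.
        row_ok = all(
            utility(row_move, col_move, 0) >= utility(alt, col_move, 0)
            for alt in row_moves
        )
        col_ok = all(
            utility(row_move, col_move, 1) >= utility(row_move, alt, 1)
            for alt in col_moves
        )
        if row_ok and col_ok:
            equilibria.append((row_move, col_move))
    return equilibria


EXHIBIT_1 = {  # (Prisoner 1 sentence, Prisoner 2 sentence); lower is better
    ("Confess", "Confess"): (5, 5),
    ("Confess", "Don't Confess"): (0, 10),
    ("Don't Confess", "Confess"): (10, 0),
    ("Don't Confess", "Don't Confess"): (2, 2),
}

EXHIBIT_2 = {  # (NCR utility, BCP utility); higher is better
    ("Good Code", "Pay"): (3, 3),
    ("Good Code", "Don't Pay"): (0, 5),
    ("Bad Code", "Pay"): (5, 1),
    ("Bad Code", "Don't Pay"): (1, 2),
}

prisoners = pure_nash_equilibria(
    EXHIBIT_1,
    ["Confess", "Don't Confess"],
    ["Confess", "Don't Confess"],
    higher_is_better=False,
)
business = pure_nash_equilibria(
    EXHIBIT_2, ["Good Code", "Bad Code"], ["Pay", "Don't Pay"]
)

print(prisoners)  # [('Confess', 'Confess')]
print(business)   # [('Bad Code', "Don't Pay")]
```

In both matrices the single equilibrium is mutual defection, even though mutual cooperation (2, 2 in Exhibit 1 and 3, 3 in Exhibit 2) would leave both players better off.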