Static analysis and ROI


Author: Andrey Karpov
Date: 06.06.2011

I regularly communicate with potential users who are worried about errors in C++ programs. Their worry is expressed in the following way: they try the PVS-Studio tool and then write that it finds too few errors during tests. And although we feel that they find the tool interesting, their reaction is still quite skeptical.

Then a discussion starts where I try to convince them that PVS-Studio is a very good and useful product and that it could be very profitable for their company. In response they criticize my explanations and make caustic remarks about the analyzer's work and the false alarms it produces. The usual marketing work.

While communicating with one of these users I wrote a detailed answer, and my opponent suggested that I turn it into an article. That is what I am doing here. We are waiting for your comments on this estimate of the profit static code analysis tools may bring. Although I wrote the article with PVS-Studio in mind, the calculations given in it should be interesting regardless of which static analysis tool is under discussion.

The text cited below is an answer to the following fragment of a letter:

...
About 40 (forty) more real defects have been found - in most cases, this is a bad copy-paste.
The question is: what will we get from integrating an expensive program into the process of developing a software product in whose code the program detects so few defects? Yes, I understand that we will find a fresh error quicker, but there are not so many fresh errors.
...

So, let's have a look at static analysis tools from the viewpoint of ROI.

Let's take an average programmer who spends most of his working time developing C++ software. It is easy for me to imagine such a person since I myself have been programming a lot for a long time. Suppose that he runs a static code analyzer at night. Also suppose that the analyzer, used in this mode and at a medium programming rate, can find two defects per week in the code written by the programmer.

This is not abstract reasoning - I say this relying on my own experience. I am working on code at only half my usual effort now, but almost every week I notice a mistake in my own code thanks to the nightly analysis. Usually it is some trifle that would reveal itself when writing a test or running regression tests, but sometimes I find really serious things. Here is a sample of a defect PVS-Studio found in my own code quite recently:

    bool staticSpecification = IsStaticSpecification(sspec);
    bool virtualSpecification = IsVirtualSpecification(sspec);
    bool externSpecification = IsVirtualSpecification(sspec);

The fact that I write articles about the harm of Copy-Paste in no way prevents me from making such mistakes. I am human too: I copy code fragments and make mistakes. It is very difficult to catch an error like the one shown above. In practice it would cause the analyzer to generate a false alarm on some code constructs in certain cases. I would hardly manage to create a manual test for such a rare situation (to be exact, I did fail to create this test, since I had put this code into SVN). What is insidious about this error is that if some user complained about it, I would have to search for it for at least a couple of hours and also ask the user to send me the *.i file. But let's not get distracted.

If the programmer writes code more regularly than I do, 2 real warnings generated by the analyzer during a week is a natural quantity. Altogether the analyzer can produce 2*4*11 = 88 actual warnings during a year. We could neglect the time needed to fix such defects and suppress false alarms. But still, let's take it into account.

Suppose the programmer spends 20 minutes a week to fix 2 real errors and 2 false alarms. Altogether he will spend 20*4*11 = 880 minutes a year on handling the analyzer. In other words, about 15 hours. Does it seem a large waste of time? It is very little in comparison to what we will calculate further.

Now let's consider the price of eliminating the same defects if the analyzer does not detect them during night tests.

The programmer will find 60% of these errors himself a bit later while writing unit tests, during individual preliminary testing or while debugging other code fragments. Let's say that searching for such an error and fixing it will take about 15 minutes, since the person is handling recently written code and knows it well.

For example, the programmer might notice some text in a dialogue that should not be there and find out that yesterday he accidentally used x.empty() instead of x.clear():

    url_.empty();
    if (status_text) url_ = status_text;

And do not tell me that fixing such errors takes only 1-2 minutes. The correction itself takes just a few seconds. But you have to find the necessary fragment, compile the fixed code, check that your correction is right and probably commit the correction to SVN. So let's say it's 15 minutes.

I would like to note right away that errors of this kind are usually fixed by programmers mechanically and are usually not considered errors because they are not recorded anywhere.

35% of errors will be found at the testing stage. These errors have a longer life cycle. In the beginning, a tester locates and recalls an issue. Then he writes a description of the error and places it into the bug tracker. The programmer finds and fixes the error and asks the tester to check this fragment once again and close the error. The total time spent by the tester and programmer together is about 2 hours. Here is an example of such an error: incorrect handling of OPENFILENAME. The programmer might be lucky and never see the rubbish in the dialogue while the tester will, yet not every time (a Heisenbug):

    OPENFILENAME info = {0};
    ... = L"*.txt";

We have 5% of errors left unnoticed. That is, programmers and QA engineers cannot find them but a static code analyzer can.

If you take your current project and check it with PVS-Studio or some other static analyzer, you will see that very unnoticed 5% of errors. This 5% is those very 40 errors the potential user mentioned while trying PVS-Studio.

The remaining 95% of errors were fixed by yourself earlier while writing tests, using unit testing, manual testing and other methods.

So, we have 5% of errors we cannot find and they are hidden in the product we are releasing. 4% of them might never occur at all and we may ignore them. The remaining 1% of errors might reveal themselves unexpectedly on the user's side and cause him a lot of trouble. For instance, a client wants to write a plugin for your system and the program crashes because of this code:

    bool ChromeFrameNPAPI::Invoke(...)
    {
      ChromeFrameNPAPI* plugin_instance =
        ChromeFrameInstanceFromNPObject(header);
      if (!plugin_instance && (plugin_instance->automation_client_.get()))
        return false;

You never do that and always check external interfaces? Good for you. But Google Chromium failed here. So never make such promises.

If you value your client, you will have to spend many hours on finding the defect and corresponding with the client. After that you will have to additionally make a fix for him or release the next version ahead of time. You might easily spend 40 hours of working time of various people (not to speak of their nerves) on such errors.

What? Who said it's not true? You have never wasted a whole week on one insidious bug? Then you have never been a true programmer. :)
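
Two of the fragments quoted above can be sketched together with their likely fixes. This is only an illustration under stated assumptions: Client, Instance, ShouldBailOut and ResetUrl are hypothetical stand-ins, not Chromium code, and the assumed intent of the Chromium check is "bail out when the instance or its client is missing".

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins for the real types in the quoted fragment.
struct Client {};
struct Instance {
  std::unique_ptr<Client> automation_client_;
};

// The quoted condition was `!p && p->automation_client_.get()`.
// `&&` evaluates its right-hand side only when `!p` is true, that is,
// exactly when `p` is null, so the null pointer gets dereferenced.
// With `||` the right-hand side runs only when `p` is non-null.
bool ShouldBailOut(const Instance* p) {
  return !p || !p->automation_client_.get();
}

// `empty()` only reports whether the string is empty and changes nothing;
// `clear()` is the call that actually removes the contents.
std::string ResetUrl(std::string url, const char* status_text) {
  url.clear();  // the quoted fragment called empty() at this point
  if (status_text) url = status_text;
  return url;
}

// Small self-check showing how the two helpers behave.
void CheckExamples() {
  Instance without_client;
  assert(ShouldBailOut(nullptr));
  assert(ShouldBailOut(&without_client));
  assert(ResetUrl("stale text", nullptr).empty());
}
```

Both defects compile cleanly, which is why they tend to survive until somebody notices the symptom.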

Let's calculate how much time we could save during a year:

    88 * 0.60 * 15 = 792 minutes
    88 * 0.35 * 120 = 3696 minutes
    88 * 0.01 * 40 * 60 = 2112 minutes

Altogether the programmer spends (792 + 3696 + 2112) / 60 = 110 hours during one year to fix some subset of his own errors.

A team of 5 persons will spend about 550 hours, or roughly 69 working days, during a year on their own mistakes. Taking into account paid weekends, vacations and sick leave, we can say it is about 4 months of work for some abstract person.

Whether it is profitable to use a static analyzer depends on the salary of your employees. Since we are speaking about some abstract person (not only programmers are involved), let's take a salary of $6000 per month. Taking into account salary taxes, rent, computer purchase and depreciation, bonuses, Internet, juice, etc., we can easily double this number at least.

So the price of fixing errors (not all errors, but most of them) without using static analysis is $12 000 * 4 = $48 000.

If we find the same errors quickly using static code analysis, the price of fixing them will be 5 * (15 / 8) * $12 000 / 20, or about $5 600.

Let's add the price of purchasing the PVS-Studio license for a team of 5 persons to this figure.

The final price of fixing errors using a static analyzer will be (3 500 EUR) * 1.4 + $5 600 = $10 500.

Altogether the pure annual PROFIT from using the PVS-Studio static analyzer in a team of 5 programmers is:

$48 000 - $10 500 = $37 500.

The price of fixing some errors has decreased more than three times. It's up to you to think and decide if you should have this or not...

Yes, I also would like to note that I proceeded from rather conservative figures in my estimates. Actually I think that the investment will pay off much better. I just wanted to show you that you will get profit even with the most conservative estimates. And please do not try to reproach me for any figures saying that they are false. The article shows an approach to a qualitative profit estimate, not a quantitative one.
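
The estimate can be replayed as a short calculation, which makes it easy to plug in your own figures. This is a minimal sketch, not the author's tool: all function names are illustrative, and it assumes that the factors 4 and 11 mean weeks per month and working months per year, and that 8 and 20 mean hours per working day and working days per month (the article gives only the bare numbers).

```cpp
#include <cassert>

// Scales a per-week quantity (warnings or minutes) to a working year.
// Assumption: 4 weeks per month, 11 working months per year.
double PerYear(double per_week) {
  assert(per_week >= 0);
  return per_week * 4 * 11;
}

// Minutes one programmer spends per year on the errors the analyzer would
// have caught, when they are found later by other means instead.
double MinutesWithoutAnalyzer(double findings_per_year) {
  const double by_programmer = 0.60 * 15;       // 60% found by the author, 15 minutes each
  const double by_testers    = 0.35 * 120;      // 35% found in testing, 2 hours each
  const double by_users      = 0.01 * 40 * 60;  // 1% found by users, 40 hours each
  return findings_per_year * (by_programmer + by_testers + by_users);
}

// Yearly team cost in dollars with the analyzer in place.
// Assumption: 8 hours per working day, 20 working days per month.
double CostWithAnalyzer(int team_size, double hours_per_person,
                        double monthly_cost, double license_cost) {
  assert(team_size > 0);
  const double months_per_person = hours_per_person / 8.0 / 20.0;
  return team_size * months_per_person * monthly_cost + license_cost;
}
```

With the article's inputs, CostWithAnalyzer(5, 15, 12000, 3500 * 1.4) comes to 10 525 dollars before rounding, which the text rounds to $10 500. Changing the percentages or the per-error times shows quickly how sensitive the result is to each assumption.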