SlideShare a Scribd company logo
The Big Calculator Gone Crazy 
Author: Andrey Karpov 
Date: 05.09.2013 
In this article I'm going to discuss a problem few people think of. Computer simulation of various processes 
becomes more and more widespread. This technology is wonderful because it allows us to save time and 
materials which would be otherwise spent on senseless chemical, biological, physical and other kinds of 
experiments. A computer simulation model of a wing section flow may help significantly reduce the number 
of prototypes to be tested in a real wind tunnel. Numerical experiments are given more and more trust 
nowadays. However, dazzled by the triumph of computer simulation, nobody notices the problem of 
software complexity growth behind it. People treat computer and computer programs just as a means to 
obtain necessary results. I'm worried that very few know and care about the fact that software size growth 
leads to a non-linear growth of the number of software bugs. It's dangerous to exploit a computer treating it 
just as a big calculator. So, that's what I think - I need to share this idea with other people. 
The Big Calculator 
At first I intended to entitle this article something like "If programmers can't create medicines, why can 
medics create programs?" Take some imaginary programmer - he is not allowed to develop and prepare 
medicines. The reason is obvious: he doesn't have the necessary education for that. However, it's not that 
simple with programming. It may seem that an imaginary medic who has learned how to program will by 
default be a successful and useful programmer - especially given that a more or less acceptable 
programming skill is much easier to master than organic chemistry and principles of medicine preparation. 
Here lies a trap. A computer experiment requires as much care as a real one. Lab workers are taught to 
wash test tubes after experiments and make sure they are sterile. But few really care about the problem of 
some array accidentally remaining uninitialized.
Programmers are well aware that the more complex software is, the more complicated and subtle bugs 
occur in it. In other words, I'm speaking of non-linear growth of the number of bugs accompanying the code 
size growth. Programs performing chemical or any other scientific computations are far from being simple, 
aren't they? Here's where the danger is. It's OK that a medic-programmer makes mistakes. Any 
programmer, however skillful, makes them from time to time. What's not OK, people tend to trust these 
results more and more. You calculate something and go on with your business. 
Those engaged in programming as their professional activity know how dangerous this approach is. They 
know what an undefined behavior is and how a program may only pretend to work well. There are numbers 
of articles and books explaining how to correctly develop unit tests and ensure the correctness of 
computations. 
Such is the world of programmers. But the world of chemists/physicists/medics is not that way, I'm afraid. 
They never write a complex program - they simply don't think that way. They use the computer as if it were 
just a Big Calculator. This comparison was suggested by one of our readers. Let me quote his comment in 
full here, so that English-speaking readers could learn about it too, once the article is translated. 
I can tell you something on this subject from my own experience. Though being a professional programmer, I 
actually come of a family of physicists and have physics education. At the moment when I had to choose 
which university to enter, the call of blood was stronger than my faith in the bright future of IT. So, I entered 
a physics university, rather prestigious on the local scale, which in fact is a "kindergarten" supervised by a 
large research institute in my native city Nizhny Novgorod. People who know the subject will at once guess 
which research institute and which university I mean. 
While studying there, I quite naturally proved to be one of the best at programming (and mathematical 
methods of physical modeling in particular). It was at the same time I also figured out the following things: 
1. Physicists tend to view the computer as a large multi-functional calculator allowing you to draw a graph 
of Eta versus Theta with Gamma going to infinity. As one can naturally expect, they are mainly interested in 
the graph itself, not the program. 
2. As a consequence of the first fact, a programmer is not viewed as a profession. A programmer is just the 
guy who knows how to use the Big Calculator to draw the needed graph. They don't care which way it will be 
done - at all. Sorry, what did you say? Static analysis? Version control? Oh, come on, guys! C++ is the 
language of programmers; physicists write in FORTRAN! 
3. As a consequence of the previous fact, anyone who's going to devote his life to writing programs to do 
physical modeling, even all-purpose ones, even tough as hell ones, is but an appendix to the Big Calculator. 
He's not even a person - just a kind of... By the way, it was not only me treated in such a way by physicists (I 
was just an ordinary student, after all) - but even the best computer modeling specialist in the research 
institute who taught a computational methods course at our university and who, when I turned to him as my 
thesis adviser while writing my term paper, said to me almost straightforward, "They will despise you, so be 
prepared to tolerate that". 
I didn't want to tolerate that and after graduation left the computer modeling area for the field where 
programmers are not thought to be untermenschen. I hope this example will help you understand why
initiatives like introducing static analysis even into relatively large (about 20 or 30 developers) projects on 
computer modeling are a hopeless job. There simply might not be a person who knows what it is. And if such 
a person does happen to be in the team, they'll most likely trample him because they don't need no trendy 
programmer frills of yours. "We've been doing without them for a hundred years - and will do for more." 
Here's another story for those not bored yet. My father, though being a pensioner, still works at a very large 
defense engineering enterprise here, in Nyzhny Novgorod (it's the largest in our city and one of the largest 
ones across the country; again, those who know the subject will guess it ;) ). He's been programming in 
FORTRAN for his entire life. He started at the time when punched cards were in use. I don't blame him for not 
studying C++. It was already too late for him 10 years ago - and he's still goes on pretty well. However, there 
are certain security precautions at this enterprise 2/3 of the staff of which are engaged in programming one 
way or another: 
1. No Internet. At all. You need literature - you go to the library. Stack Overflow? What's that? If you need to 
send an e-mail, you have to submit a written request to the boss explaining whom and what for you want to 
send it. Only a few chosen ones can use the Internet "against a receipt". Thank God, they have an internal 
network at least. 
2. No administration rights on your computer. Perhaps this restriction makes sense for the white-collar mass, 
but I can't imagine a programmer feeling satisfied with it. 
3. (Doesn't relate to the subject; just an illustration.) You can't even bring a cell phone with an integrated 
camera (have you seen ones without it nowadays?). 
As a result, even young employees write code in FORTRAN, while really skilled programmers are very few. I 
know that for sure because I trained some guy of 25 whom my father had recommended as a promising 
programmer. 
Here's my verdict: they are stuck in the 80-s there. Even given that they have pretty good salaries, I wouldn't 
go there for the world. 
These are just two examples from the intellectual elite's life. I don't mean to discredit anyone - they do their 
job well enough, but my heart is bleeding as I watch which windmills my father has to fight sometimes. 
(Thank God, I've managed to persuade him to start using git recently.) No OOP in a million-line project, no 
static analysis - nothing. 
It's just has to do with the human's trait to be very conservative about areas which are not one's strong 
points. 
Ilja Mayzus. The original comment. 
The core of this story is the ideology of treating the computer as a Big Calculator. In that case, you don't 
need to know any more about it than its younger brother, the pocket calculator, deserves. And it's the way 
it is actually used in many areas. Let's digress for a while and have a look inside the world of physics. Let's 
see how another theory finds a confirmation. To do this, I'll again have to quote a large extract from Bryan 
Greene's book "The Elegant Universe: Superstrings, Hidden Dimensions, and the Quest for the Ultimate 
Theory" [1]:
We all huddled around Morrison's computer in the office he and I shared. Aspinwall told Morrison how to 
bring his program up on the screen and showed us the precise form for the required input. Morrison 
appropriately formatted the results we had generated the previous night, and we were set to go. 
The particular calculation we were performing amounts, roughly speaking, to determining the mass of a 
certain particle species—a specific vibrational pattern of a string—when moving through a universe whose 
Calabi-Yau component we had spent all fall identifying. We hoped, in line with the strategy discussed earlier, 
that this mass would agree identically with a similar calculation done on the Calabi-Yau shape emerging 
from the space-tearing flop transition. The latter was the relatively easy calculation, and we had completed 
it weeks before; the answer turned out to be 3, in the particular units we were using. Since we were now 
doing the purported mirror calculation numerically on a computer, we expected to get something extremely 
close to but not exactly 3, something like 3.000001 or 2.999999, with the tiny difference arising from 
rounding errors. 
Morrison sat at the computer with his finger hovering over the enter button. With the tension mounting he 
said, "Here goes," and set the calculation in motion. In a couple of seconds the computer returned its 
answer: 8.999999. My heart sank. Could it be that space-tearing flop transitions shatter the mirror relation, 
likely indicating that they cannot actually occur? Almost immediately, though, we all realized that something 
funny must be going on. If there was a real mismatch in the physics following from the two shapes, it was 
extremely unlikely that the computer calculation should yield an answer so close to a whole number. If our 
ideas were wrong, there was no reason in the world to expect anything but a random collection of digits. We 
had gotten a wrong answer, but one that suggested, perhaps, that we had just made some simple arithmetic 
error. Aspinwall and I went to the blackboard, and in a moment we found our mistake: we had dropped a 
factor of 3 in the "simpler" calculation we had done weeks before; the true result was 9. The computer 
answer was therefore just what we wanted. 
Of course, the after-the-fact agreement was only marginally convincing. When you know the answer you 
want, it is often all too easy to figure out a way of getting it. We needed to do another example. Having 
already written all of the necessary computer code, this was not hard to do. We calculated another particle 
mass on the upper Calabi-Yau shape, being careful this time to make no errors. We found the answer: 12. 
Once again, we huddled around the computer and set it on its way. Seconds later it returned 11.999999. 
Agreement. We had shown that the supposed mirror is the mirror, and hence space-tearing flop transitions 
are part of the physics of string theory. 
At this I jumped out of my chair and ran an unrestrained victory lap around the office. Morrison beamed 
from behind the computer. Aspinwall's reaction, though, was rather different. "That's great, but I knew it 
would work," he calmly said. "And where's my beer?" 
I truly believe they are geniuses. But let's imagine for a moment that it were some ordinary students who 
used this approach to calculate an integral. I don't think programmers would take it seriously then. And 
what if the program generated 3 right away? The bug would be taken as the final evidence? I think it would 
clear up later, during a re-check by themselves or their scientist colleagues. Still, the "ideal spherical 
programmer in vacuum" is scared to death by this fact. 
This is how things are in reality. It's not only personal computers used in such a way - it's also cluster 
systems exploited for scientific computations. And what's most scary, people trust the results produced by
programs. In the future we are going to deal with even more computations of this kind, and the price of 
having software bugs will become steeper too. 
Isn't it time to change something? 
Yes, nobody can forbid me to stick a plaster on a cut myself; I guess I can recommend some medicine to 
take when you've caught a cold. But not more than that. I cannot drill a tooth or write out a prescription. 
Don't you find it reasonable that developers creating a software system whose responsibility extends 
beyond certain scope should also confirm their skill? 
I know there exist various certifications. But I'm talking of a different thing now. Certification is intended to 
ensure that program code conforms to certain standards. It partially prevents slopwork, in an indirect way. 
But the range of areas where certification is a strict requirement is quite narrow. It obviously does not cover 
the whole set of areas and situations where careless use of the Big Calculator may do much harm. 
Example of the Danger 
I guess many of you find my worries too abstract. That's why I suggest examining a few real-life examples. 
There is the open-source package Trans-Proteomic Pipeline (TPP) designed to solve various tasks in biology. 
No doubt, it is used - by its developers and perhaps some third-party organizations. I believe that any bug in 
it is already a potential issue. And does it have bugs? Yes, it does; and still more are appearing. We checked 
this project one year ago and reported it in the blog-post "Analysis of the Trans-Proteomic Pipeline (TPP) 
project". 
What has changed since then? Nothing. The project is going on to develop and accumulate new bugs. The 
Big Calculator ideology has won. The developers are not writing a high-quality project with the minimum 
number of bugs possible. They simply solve their tasks; otherwise they would have reacted in some way to 
the last year article and considered introducing some static analysis tools. I don't mean they must 
necessarily choose PVS-Studio; there are numbers of other static code analyzers. The point is that their 
responsible application goes on collecting most trivial bugs. Let's see what fresh ones they've got. 
1. Some bungler goes on writing incorrect loops 
In the previous article I mentioned incorrect loop conditions. The new package version has them too. 
double SpectraSTPeakList::calcDot(SpectraSTPeakList* other) { 
.... 
for (i = this->m_bins->begin(), j = other->m_bins->begin(); 
i != this->m_bins->end(), j != other->m_bins->end(); 
i++, j++) { 
d = (*i) * (*j); 
dot += d; 
}
.... 
} 
PVS-Studio's diagnostic message: V521 Such expressions using the ',' operator are dangerous. Make sure the 
expression is correct. spectrastpeaklist.cpp 504 
In the check "i != this->m_bins->end(), j != other->m_bins->end()", the expression before the comma does 
not check anything. The ',' operator is used to execute expressions both to the right and to the left of it in 
the left-to-right order and return the value of the right expression. This is what the correct check should 
look like: 
i != this->m_bins->end() && j != other->m_bins->end() 
The same defect can be also found in the following fragments: 
• spectrastpeaklist.cpp 516 
• spectrastpeaklist.cpp 529 
• spectrastpeaklist.cpp 592 
• spectrastpeaklist.cpp 608 
• spectrastpeaklist.cpp 625 
• spectrastpeaklist.cpp 696 
2. Null pointer dereferencing 
This bug won't lead to outputting incorrect computation results - it will cause a crash instead, which is much 
better. However, it would be strange not to mention these bugs. 
void ASAPRatio_getDataStrctRatio(dataStrct *data, ....) 
{ 
.... 
int *outliers, *pepIndx=NULL; 
.... 
//pepIndx doesn't change 
.... 
if(data->dataCnts[i] == 1 && pepIndx[i] == 0) 
data->dataCnts[i] = 0; 
.... 
} 
PVS-Studio's diagnostic message: V522 Dereferencing of the null pointer 'pepIndx' might take place. 
asapcgidisplay2main.cxx 534
The same defect can be also found in the following fragments: 
• Pointer 'peptides'. asapcgidisplay2main.cxx 556 
• Pointer 'peptides'. asapcgidisplay2main.cxx 557 
• Pointer 'peptides'. asapcgidisplay2main.cxx 558 
• Pointer 'peptides'. asapcgidisplay2main.cxx 559 
• Pointer 'peptides'. asapcgidisplay2main.cxx 560 
• Pointer 'pepIndx'. asapcgidisplay2main.cxx 569 
3. Uncleared arrays 
static void clearTagNames() { 
std::vector<const char *>ptrs; 
for (tagname_set::iterator i = tagnames.begin(); 
i!=tagnames.end();i++) { 
ptrs.push_back(*i); 
} 
for (tagname_set::iterator j = attrnames.begin(); 
j!=attrnames.end();j++) { 
ptrs.push_back(*j); 
} 
tagnames.empty(); 
attrnames.empty(); 
for (size_t n=ptrs.size();n--;) { 
delete [] (char *)(ptrs[n]); // cast away const 
} 
} 
In this code the analyzer has caught two uncleared arrays at once: 
V530 The return value of function 'empty' is required to be utilized. tag.cxx 72 
V530 The return value of function 'empty' is required to be utilized. tag.cxx 73 
You should call the clear() function instead of empty(). 
4. Uninitialized class objects 
class ExperimentCycleRecord {
public: 
ExperimentCycleRecord() { 
ExperimentCycleRecord(0,0,0,True,False); 
} 
ExperimentCycleRecord(long lExperiment, long lCycleStart, 
long lCycleEnd, Boolean bSingleCycle, 
Boolean bRangleCycle) 
{ 
.... 
} 
.... 
} 
PVS-Studio's diagnostic message: V603 The object was created but it is not being used. If you wish to call 
constructor, 'this->ExperimentCycleRecord::ExperimentCycleRecord(....)' should be used. 
mascotconverter.cxx 101 
The ExperimentCycleRecord() constructor does not do what it is intended to; it doesn't initialize anything. 
The developer might be an excellent chemist, but if he doesn't know how to use the C++ language properly, 
his computations using uninitialized memory aren't worth a damn. It's like using a dirty test tube. 
Instead of calling another constructor, the line "ExperimentCycleRecord(0,0,0,True,False);" creates a 
temporary object which will be destroyed after that. This error pattern is discussed in detail in the article 
"Wade not in unknown waters. Part one". 
The same defect can be also found in the following fragments: 
• asapratiopeptideparser.cxx 57 
• asapratiopeptidecgidisplayparser.cxx 36 
• cruxdiscrimfunction.cxx 36 
• discrimvalmixturedistr.cxx 34 
• mascotdiscrimfunction.cxx 47 
• mascotscoreparser.cxx 37 
• tandemdiscrimfunction.cxx 35 
• tandemkscoredf.cxx 37 
• tandemnativedf.cxx 37 
5. Comments breaching execution logic 
int main(int argc, char** argv) {
.... 
if (getIsInteractiveMode()) 
//p->writePepSHTML(); 
//p->printResult(); 
// regression test? 
if (testType!=NO_TEST) { 
TagListComparator("InterProphetParser",testType, 
outfilename,testFileName); 
.... 
} 
PVS-Studio's diagnostic message: V628 It's possible that the line was commented out improperly, thus 
altering the program's operation logics. interprophetmain.cxx 175 
After the 'if' operator, a few lines executing some operations were commented out. As a result, the program 
logic changed quite differently than expected. The programmer didn't want any actions to be done after 
executing the condition. Instead, the 'if' operator affects the code below. As a consequence, the tests' 
output now depends not only on the "testType!=NO_TEST" condition, but on the "getIsInteractiveMode()" 
condition as well. That is, the test may not test anything. That's why I strongly recommend not to rely fully 
on one testing methodology alone (for example, TDD). 
6. Misprints 
Misprints are to be found everywhere and all the time. It's not that bad if you get fewer hit points after an 
explosion in a game than you ought to, because of such a bug. But what do incorrect data mean when 
computing chemical reactions? 
void ASAPRatio_getProDataStrct(proDataStrct *data, char **pepBofFiles) 
{ 
.... 
if (data->indx == -1) { 
data->ratio[0] = -2.; 
data->ratio[0] = 0.; 
data->inv_ratio[0] = -2.; 
data->inv_ratio[1] = 0.;
return; 
} 
.... 
} 
PVS-Studio's diagnostic message: V519 The 'data->ratio[0]' variable is assigned values twice successively. 
Perhaps this is a mistake. Check lines: 130, 131. asapcgidisplay2main.cxx 131 
One and the same variable is by mistake assigned two different values. The correct code is this one: 
data->ratio[0] = -2.; 
data->ratio[1] = 0.; 
This fragment was then copied-and-pasted into other parts of the program: 
• asapcgidisplay2main.cxx 338 
• asapcgidisplay2main.cxx 465 
• asapratioproteincgidisplayparser.cxx 393 
• asapratioproteincgidisplayparser.cxx 518 
7. Comparing signed and unsigned values 
Comparing signed and unsigned values properly requires some skill. Ordinary calculators don't deal with 
unsigned values, but the C++ language does. 
size_type size() const; 
void computeDegenWts() 
{ 
.... 
int have_cluster = 0; 
.... 
if ( have_cluster > 0 && ppw_ref.size() - have_cluster > 0 ) 
.... 
} 
PVS-Studio's diagnostic message: V555 The expression 'ppw_ref.size() - have_cluster > 0' will work as 
'ppw_ref.size() != have_cluster'. proteinprophet.cpp 6767 
The programmer wanted the check "ppw_ref.size() > have_cluster" to be executed. But he got something 
very different instead.
To make it clearer, let's assume we have the 'size_t' type which is 32-bit. Suppose the function 
"ppw_ref.size()" returns 10 while the variable have_cluster equals 15. The function ppw_ref.size() returns 
the unsigned type 'size_t'. According to the C++ rules, the right operand in the subtraction operation must 
also have the type 'size_t' before subtraction is executed. It's alright for now: we have 10u on the left and 
15u on the right. 
Here goes the subtraction: 
10u - 15u 
And here's where we get a trouble. Those very C++ rules tell us that the result of subtraction between two 
unsigned variables must also be unsigned. 
It means that 10u - 15u = FFFFFFFBu. As you know, 4294967291 is greater than 0. 
The Big Calculator Riot is successful. Writing a correct theoretical algorithm is only half of the job. You also 
need to write a correct code. 
A similar bug can be found in the following fragment: 
double SpectraSTPeakList::calcXCorr() { 
.... 
for (int tau = -75; tau <= 75; tau++) { 
float dot = 0.0; 
for (unsigned int b = 0; b < numBins; b++) { 
if (b + tau >= 0 && b + tau < (int)numBins) { 
dot += (*m_bins)[b] * theoBins[b + tau] / 10000.0; 
} 
} 
.... 
.... 
} 
PVS-Studio's diagnostic message: V547 Expression 'b + tau >= 0' is always true. Unsigned type value is 
always >= 0. spectrastpeaklist.cpp 2058 
As you can see, the variable 'tau' takes values within the range [-75, 75]. To avoid array overrun, the check b 
+ tau >= 0 is used. I guess you've already got the point that this check won't work. The variable 'b' has the
'unsigned' modifier. It means that the "b + tau" expression's result is unsigned too. And an unsigned value is 
always greater than or equal to 0. 
8. Strange loop 
const char* ResidueMass::getStdModResidues(....) { 
.... 
for (rmap::const_iterator i = p.first; i != p.second; ++i) { 
const cResidue &r = (*i).second; 
if (r.m_masses[0].m_nterm) { 
n_term_aa_mod = true; 
} else if (r.m_masses[0].m_cterm) { 
c_term_aa_mod = true; 
} 
return r.m_residue.c_str(); 
} 
if(! strcmp(mod, "+N-formyl-met (Protein)")) { 
return "n"; 
} if (! strcmp(mod, "13C6-15N2 (K)")) { 
return "K"; 
} if (! strcmp(mod, "13C6-15N4 (R)")) { 
return "R"; 
.... 
} 
PVS-Studio's diagnostic message: V612 An unconditional 'return' within a loop. residuemass.cxx 1442 
There is the 'return' operator inside the loop and it's called in any case. The loop can execute only once, 
after which the function terminates. It's either a misprint here or some condition is missing before the 
'return' operator. 
9. Rough calculations 
double RTCalculator::getUsedForGradientRate() { 
if (rts_.size() > 0)
return used_count_ / rts_.size(); 
return 0.; 
} 
PVS-Studio's diagnostic message: V636 The 'used_count_ / rts_.size()' expression was implicitly casted from 
'int' type to 'double' type. Consider utilizing an explicit type cast to avoid the loss of a fractional part. An 
example: double A = (double)(X) / Y;. rtcalculator.cxx 6406 
Since the function returns values of the double type, I find it reasonable to suppose the following. 
When the variable 'used_count_' is assigned the value 5 and the function rts_.size() returns 7, the 
approximate result is 0,714. However, the function getUsedForGradientRate() will return 0 in this case. 
The variable 'used_count_' has the 'int' type. The rts_.size() function also returns an 'int' value. An integer 
division occurs, and the result is obvious: it's zero. Then zero is implicitly cast to double, but it doesn't 
matter at this point. 
To fix the defect, the code should be rewritten in the following way: 
return static_cast<double>(used_count_) / rts_.size(); 
Other defects of this kind: 
• cgi_pep3d_xml.cxx 3203 
• cgi_pep3d_xml.cxx 3204 
• asapratiopeptideparser.cxx 4108 
10. Great and mighty Copy-Paste 
The function setPepMaxProb() contains a few large similarly looking blocks. In this fragment one can feel 
that specific smell of the Copy-Paste technique. Using it naturally results in an error. I had to SIGNIFICANTLY 
abridge the sample text. The bug is very noticeable in the abridged code, but it's almost impossible to see it 
in the original code. Yeah, it's an advertisement of static analysis tools in general and PVS-Studio in 
particular. 
void setPepMaxProb( bool use_nsp, bool use_fpkm, 
bool use_joint_probs, bool compute_spectrum_cnts ) 
{ 
double prob = 0.0; 
double max2 = 0.0; 
double max3 = 0.0; 
double max4 = 0.0; 
double max5 = 0.0;
double max6 = 0.0; 
double max7 = 0.0; 
.... 
if ( pep3 ) { ... if ( use_joint_probs && prob > max3 ) ... } 
.... 
if ( pep4 ) { ... if ( use_joint_probs && prob > max4 ) ... } 
.... 
if ( pep5 ) { ... if ( use_joint_probs && prob > max5 ) ... } 
.... 
if ( pep6 ) { ... if ( use_joint_probs && prob > max6 ) ... } 
.... 
if ( pep7 ) { ... if ( use_joint_probs && prob > max6 ) ... } 
.... 
} 
V525 The code containing the collection of similar blocks. Check items 'max3', 'max4', 'max5', 'max6', 'max6' 
in lines 4664, 4690, 4716, 4743, 4770. proteinprophet.cpp 4664 
PVS-Studio's diagnostic message: V525 The code containing the collection of similar blocks. Check items 
'max3', 'max4', 'max5', 'max6', 'max6' in lines 4664, 4690, 4716, 4743, 4770. proteinprophet.cpp 4664 
Unfortunately, the V525 diagnostic produces many false positives and therefore referred to the third-level 
warnings. But if one overcomes one's laziness and study this class of warnings, one may find numbers of 
such nice bugs. 
11. Pointer is not initialized sometimes 
int main(int argc, char** argv) { 
.... 
ramp_fileoffset_t *pScanIndex; 
.... 
if ( (pFI=rampOpenFile(mzXmlPath_.c_str()))==NULL) { 
.... 
} else {
.... 
pScanIndex = readIndex(pFI, indexOffset, &iAnalysisLastScan); 
.... 
} 
.... 
if (pScanIndex != NULL) 
free(pScanIndex); 
return 0; 
} 
PVS-Studio's diagnostic message: V614 Potentially uninitialized pointer 'pScanIndex' used. sqt2xml.cxx 476 
This program may crash at the end if the function rampOpenFile() returns NULL. It's not critical yet 
unpleasant. 
Here's another variable that may remain uninitialized: 
• Potentially uninitialized pointer 'fp_' used. dta-xml.cpp 307 
12. Virtual destructor missing 
class DiscriminantFunction { 
public: 
DiscriminantFunction(int charge); 
virtual Boolean isComputable(SearchResult* result) = 0; 
virtual double getDiscriminantScore(SearchResult* result) = 0; 
virtual void error(int charge); 
protected: 
int charge_; 
double const_; 
}; // class 
class CometDiscrimFunction : public DiscriminantFunction;
class CruxDiscrimFunction : public DiscriminantFunction; 
class InspectDiscrimFunction : public DiscriminantFunction; 
..... 
class DiscrimValMixtureDistr : public MixtureDistr { 
.... 
DiscriminantFunction* discrim_func_; 
.... 
}; 
DiscrimValMixtureDistr::~DiscrimValMixtureDistr() { 
delete[] posinit_; 
delete[] neginit_; 
delete discrim_func_; 
} 
PVS-Studio's diagnostic message: V599 The virtual destructor is not present, although the 
'DiscriminantFunction' class contains virtual functions. discrimvalmixturedistr.cxx 206 
A number of classes are inherited from the DiscriminantFunction class. For example, such is the class 
DiscrimValMixtureDistr. Its destructor frees memory; therefore, it's very desirable that you call it. 
Unfortunately, the DiscriminantFunction class's destructor is not declared as a virtual one - with all the 
ensuing consequences. 
13. Miscellaneous 
There are numbers of small defects which won't have serious consequences but are still not very pleasant to 
have in your code. There are also strange fragments, but I can't say for sure if they are incorrect. Here's one 
of them: 
Boolean MixtureModel::iterate(int counter) { 
.... 
if (done_[charge] < 0) { 
done_[charge]; 
}
else if (priors_[charge] > 0.0) { 
done_[charge] += extraitrs_; 
} 
.... 
} 
PVS-Studio's diagnostic message: V607 Ownerless expression 'done_[charge]'. mixturemodel.cxx 1558 
What is it? Incomplete code? Or maybe the programmer just wanted to point it out that nothing should be 
done if the "done_[charge] < 0" condition is true? 
And here you are an incorrect way of freeing memory. Any critical consequences are unlikely, but still the 
code smells. 
string Field::getText(....) 
{ 
.... 
char* pepString = new char[peplen + 1]; 
.... 
delete pepString; 
.... 
} 
PVS-Studio's diagnostic message: V611 The memory was allocated using 'new T[]' operator but was released 
using the 'delete' operator. Consider inspecting this code. It's probably better to use 'delete [] pepString;'. 
pepxfield.cxx 1023 
The correct way of doing this is to write "delete [] pepString". There are many other defects of this kind: 
• cruxdiscrimvalmixturedistr.cxx 705 
• cruxdiscrimvalmixturedistr.cxx 715 
• mascotdiscrimvalmixturedistr.cxx 426 
• mascotdiscrimvalmixturedistr.cxx 550 
• mascotdiscrimvalmixturedistr.cxx 624 
• phenyxdiscrimvalmixturedistr.cxx 692 
• probiddiscrimvalmixturedistr.cxx 487 
• probiddiscrimvalmixturedistr.cxx 659 
• tandemdiscrimvalmixturedistr.cxx 731 
• tandemdiscrimvalmixturedistr.cxx 741
And here's an incorrect implementation of the "--" operator. It doesn't seem to be used anywhere, 
otherwise the bug would quickly reveal itself. 
CharIndexedVectorIterator operator++(int) 
{ // postincrement 
CharIndexedVectorIterator _Tmp = *this; 
++m_itr; 
return (_Tmp); 
} 
CharIndexedVectorIterator& operator--() 
{ // predecrement 
++m_itr; 
return (*this); 
} 
PVS-Studio's diagnostic message: V524 It is odd that the body of '--' function is fully equivalent to the body 
of '++' function. charindexedvector.hpp 81 
The operators "--" and "++" are implemented in the same way. They must have been copied-and-pasted 
then: 
• charindexedvector.hpp 87 
• charindexedvector.hpp 159 
• charindexedvector.hpp 165 
Let's stop here. It all is not very interesting, and the article is big enough. As usual, I'm urging the developers 
not to limit themselves to fixing only the mentioned defects. Download and check the project with PVS-Studio 
yourself. I could have missed many errors. We can even grant you a free registration key for some 
time. 
Summary 
Unfortunately, the article has appeared a bit tangled. What did the author want to say, after all? I'll try to 
repeat in a very brief form my ideas I want to share with you. 
1. We are currently using more and more programs to perform scientific and engineering 
computations and simulate various processes, and we grow to trust them. 
2. Programs get very complicated. Professional programmers understand it very well that one cannot 
approach the task of creating a software package for computer simulation in the same way as using
a software calculator. The growth of software complexity leads to an exponential increase of the 
number of errors [2]. 
3. It appears that physicists/biologists/medics cannot simply calculate something in the usual manner. 
One cannot ignore the software complexity increase and the consequences of incorrect 
computations arising from imperfect knowledge of a programming language. 
4. In this article I've given arguments to prove that this is the real state of things. The first quotation 
tells us that people tend to treat the computer as an ordinary calculator. The second quotation just 
reaffirms this idea. The error samples discussed after that are meant to demonstrate that people 
really make mistakes when treating computer simulation software in such a way. So, my anxiety has 
solid ground. 
So, what shall we do? 
First of all, I'd like you to realize this problem and tell your colleagues from related areas. It's been clear to 
programmers for a long time that the software complexity growth and silly mistakes in large projects may 
easily turn into a source of great harm. On the other hand, those people who treat programming and 
computers just as a tool don't know that and don't bother to think about it. So, we need to draw their 
attention to this problem. 
Here you are an analogy. Imagine a man who has got him a cudgel and starts hunting some animals. The 
cudgel in his hands gradually turns into a stone axe, then a sword, and finally a gun. But he still uses it just to 
stun hares by hitting them on the head. It's not only that this way of using the weapon is absolutely 
inefficient; it has also become much more dangerous now (he can accidentally shoot himself or his fellow 
men). Hunters from the "programmers" tribe quickly adapt themselves to these changes. The rest don't 
have time for that - they are busy hunting hares. After all, it's all about the hares. We need to tell these 
people that they have to learn, whether they like it or not. It'll improve everyone's life. And waving your gun 
around is no good. 
References 
1. Bryan Greene "The Elegant Universe: Superstrings, Hidden Dimensions, and the Quest for the 
Ultimate Theory. ISBN 978-0375708114 
2. Andrey Karpov. Feelings confirmed by numbers. http://www.viva64.com/en/b/0158/

More Related Content

Viewers also liked

PVS-Studio Has Finally Got to Boost
PVS-Studio Has Finally Got to BoostPVS-Studio Has Finally Got to Boost
PVS-Studio Has Finally Got to Boost
Andrey Karpov
 
Firefox Easily Analyzed by PVS-Studio Standalone
Firefox Easily Analyzed by PVS-Studio StandaloneFirefox Easily Analyzed by PVS-Studio Standalone
Firefox Easily Analyzed by PVS-Studio Standalone
Andrey Karpov
 
Checking VirtualDub
Checking VirtualDubChecking VirtualDub
Checking VirtualDub
Andrey Karpov
 
Can We Trust the Libraries We Use?
Can We Trust the Libraries We Use?Can We Trust the Libraries We Use?
Can We Trust the Libraries We Use?
Andrey Karpov
 
PVS-Studio vs Chromium
PVS-Studio vs ChromiumPVS-Studio vs Chromium
PVS-Studio vs Chromium
Andrey Karpov
 
PVS-Studio documentation (version 4.54)
PVS-Studio documentation (version 4.54)PVS-Studio documentation (version 4.54)
PVS-Studio documentation (version 4.54)
Andrey Karpov
 
How to make fewer errors at the stage of code writing. Part N1
How to make fewer errors at the stage of code writing. Part N1How to make fewer errors at the stage of code writing. Part N1
How to make fewer errors at the stage of code writing. Part N1
Andrey Karpov
 
Consequences of using the Copy-Paste method in C++ programming and how to dea...
Consequences of using the Copy-Paste method in C++ programming and how to dea...Consequences of using the Copy-Paste method in C++ programming and how to dea...
Consequences of using the Copy-Paste method in C++ programming and how to dea...
Andrey Karpov
 

Viewers also liked (8)

PVS-Studio Has Finally Got to Boost
PVS-Studio Has Finally Got to BoostPVS-Studio Has Finally Got to Boost
PVS-Studio Has Finally Got to Boost
 
Firefox Easily Analyzed by PVS-Studio Standalone
Firefox Easily Analyzed by PVS-Studio StandaloneFirefox Easily Analyzed by PVS-Studio Standalone
Firefox Easily Analyzed by PVS-Studio Standalone
 
Checking VirtualDub
Checking VirtualDubChecking VirtualDub
Checking VirtualDub
 
Can We Trust the Libraries We Use?
Can We Trust the Libraries We Use?Can We Trust the Libraries We Use?
Can We Trust the Libraries We Use?
 
PVS-Studio vs Chromium
PVS-Studio vs ChromiumPVS-Studio vs Chromium
PVS-Studio vs Chromium
 
PVS-Studio documentation (version 4.54)
PVS-Studio documentation (version 4.54)PVS-Studio documentation (version 4.54)
PVS-Studio documentation (version 4.54)
 
How to make fewer errors at the stage of code writing. Part N1
How to make fewer errors at the stage of code writing. Part N1How to make fewer errors at the stage of code writing. Part N1
How to make fewer errors at the stage of code writing. Part N1
 
Consequences of using the Copy-Paste method in C++ programming and how to dea...
Consequences of using the Copy-Paste method in C++ programming and how to dea...Consequences of using the Copy-Paste method in C++ programming and how to dea...
Consequences of using the Copy-Paste method in C++ programming and how to dea...
 

Similar to The Big Calculator Gone Crazy

It is difficult
It is difficultIt is difficult
It is difficult
PVS-Studio
 
C programming guide new
C programming guide newC programming guide new
C programming guide new
Kuntal Bhowmick
 
Ai lecture1 final
Ai lecture1 finalAi lecture1 final
Ai lecture1 final
Shivam Agrawal
 
Flow based-1994
Flow based-1994Flow based-1994
Flow based-1994
getdownload
 
The year was 1967
The year was 1967The year was 1967
The year was 1967
John Thomas
 
Role Of Programmer On Telecom Industry
Role Of Programmer On Telecom IndustryRole Of Programmer On Telecom Industry
Role Of Programmer On Telecom Industry
Divya Watson
 
Computer vision lightning talk castaway week
Computer vision lightning talk castaway weekComputer vision lightning talk castaway week
Computer vision lightning talk castaway week
Christopher Decker
 
PPT slides - MACHINE PERCEPTION LABORATORY
PPT slides - MACHINE PERCEPTION LABORATORYPPT slides - MACHINE PERCEPTION LABORATORY
PPT slides - MACHINE PERCEPTION LABORATORYbutest
 
Notes on Simulation and GHDL
Notes on Simulation and GHDLNotes on Simulation and GHDL
Notes on Simulation and GHDLDIlawar Singh
 
Lecture 1 Slides -Introduction to algorithms.pdf
Lecture 1 Slides -Introduction to algorithms.pdfLecture 1 Slides -Introduction to algorithms.pdf
Lecture 1 Slides -Introduction to algorithms.pdf
RanvinuHewage
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
MoTechInc
 
Eid Short Essay In English
Eid Short Essay In EnglishEid Short Essay In English
Eid Short Essay In English
Adriana Lovato
 
(In)convenient truths about applied machine learning
(In)convenient truths about applied machine learning(In)convenient truths about applied machine learning
(In)convenient truths about applied machine learning
Max Pagels
 

Similar to The Big Calculator Gone Crazy (14)

It is difficult
It is difficultIt is difficult
It is difficult
 
CCFW.software
CCFW.softwareCCFW.software
CCFW.software
 
C programming guide new
C programming guide newC programming guide new
C programming guide new
 
Ai lecture1 final
Ai lecture1 finalAi lecture1 final
Ai lecture1 final
 
Flow based-1994
Flow based-1994Flow based-1994
Flow based-1994
 
The year was 1967
The year was 1967The year was 1967
The year was 1967
 
Role Of Programmer On Telecom Industry
Role Of Programmer On Telecom IndustryRole Of Programmer On Telecom Industry
Role Of Programmer On Telecom Industry
 
Computer vision lightning talk castaway week
Computer vision lightning talk castaway weekComputer vision lightning talk castaway week
Computer vision lightning talk castaway week
 
PPT slides - MACHINE PERCEPTION LABORATORY
PPT slides - MACHINE PERCEPTION LABORATORYPPT slides - MACHINE PERCEPTION LABORATORY
PPT slides - MACHINE PERCEPTION LABORATORY
 
Notes on Simulation and GHDL
Notes on Simulation and GHDLNotes on Simulation and GHDL
Notes on Simulation and GHDL
 
Lecture 1 Slides -Introduction to algorithms.pdf
Lecture 1 Slides -Introduction to algorithms.pdfLecture 1 Slides -Introduction to algorithms.pdf
Lecture 1 Slides -Introduction to algorithms.pdf
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
 
Eid Short Essay In English
Eid Short Essay In EnglishEid Short Essay In English
Eid Short Essay In English
 
(In)convenient truths about applied machine learning
(In)convenient truths about applied machine learning(In)convenient truths about applied machine learning
(In)convenient truths about applied machine learning
 

More from Andrey Karpov

60 антипаттернов для С++ программиста
60 антипаттернов для С++ программиста60 антипаттернов для С++ программиста
60 антипаттернов для С++ программиста
Andrey Karpov
 
60 terrible tips for a C++ developer
60 terrible tips for a C++ developer60 terrible tips for a C++ developer
60 terrible tips for a C++ developer
Andrey Karpov
 
Ошибки, которые сложно заметить на code review, но которые находятся статичес...
Ошибки, которые сложно заметить на code review, но которые находятся статичес...Ошибки, которые сложно заметить на code review, но которые находятся статичес...
Ошибки, которые сложно заметить на code review, но которые находятся статичес...
Andrey Karpov
 
PVS-Studio in 2021 - Error Examples
PVS-Studio in 2021 - Error ExamplesPVS-Studio in 2021 - Error Examples
PVS-Studio in 2021 - Error Examples
Andrey Karpov
 
PVS-Studio in 2021 - Feature Overview
PVS-Studio in 2021 - Feature OverviewPVS-Studio in 2021 - Feature Overview
PVS-Studio in 2021 - Feature Overview
Andrey Karpov
 
PVS-Studio в 2021 - Примеры ошибок
PVS-Studio в 2021 - Примеры ошибокPVS-Studio в 2021 - Примеры ошибок
PVS-Studio в 2021 - Примеры ошибок
Andrey Karpov
 
PVS-Studio в 2021
PVS-Studio в 2021PVS-Studio в 2021
PVS-Studio в 2021
Andrey Karpov
 
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
Andrey Karpov
 
Best Bugs from Games: Fellow Programmers' Mistakes
Best Bugs from Games: Fellow Programmers' MistakesBest Bugs from Games: Fellow Programmers' Mistakes
Best Bugs from Games: Fellow Programmers' Mistakes
Andrey Karpov
 
Does static analysis need machine learning?
Does static analysis need machine learning?Does static analysis need machine learning?
Does static analysis need machine learning?
Andrey Karpov
 
Typical errors in code on the example of C++, C#, and Java
Typical errors in code on the example of C++, C#, and JavaTypical errors in code on the example of C++, C#, and Java
Typical errors in code on the example of C++, C#, and Java
Andrey Karpov
 
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
Andrey Karpov
 
Game Engine Code Quality: Is Everything Really That Bad?
Game Engine Code Quality: Is Everything Really That Bad?Game Engine Code Quality: Is Everything Really That Bad?
Game Engine Code Quality: Is Everything Really That Bad?
Andrey Karpov
 
C++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical ReviewerC++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical Reviewer
Andrey Karpov
 
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source SoftwareThe Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
Andrey Karpov
 
Static Code Analysis for Projects, Built on Unreal Engine
Static Code Analysis for Projects, Built on Unreal EngineStatic Code Analysis for Projects, Built on Unreal Engine
Static Code Analysis for Projects, Built on Unreal Engine
Andrey Karpov
 
Safety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
Safety on the Max: How to Write Reliable C/C++ Code for Embedded SystemsSafety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
Safety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
Andrey Karpov
 
The Great and Mighty C++
The Great and Mighty C++The Great and Mighty C++
The Great and Mighty C++
Andrey Karpov
 
Static code analysis: what? how? why?
Static code analysis: what? how? why?Static code analysis: what? how? why?
Static code analysis: what? how? why?
Andrey Karpov
 
Zero, one, two, Freddy's coming for you
Zero, one, two, Freddy's coming for youZero, one, two, Freddy's coming for you
Zero, one, two, Freddy's coming for you
Andrey Karpov
 

More from Andrey Karpov (20)

60 антипаттернов для С++ программиста
60 антипаттернов для С++ программиста60 антипаттернов для С++ программиста
60 антипаттернов для С++ программиста
 
60 terrible tips for a C++ developer
60 terrible tips for a C++ developer60 terrible tips for a C++ developer
60 terrible tips for a C++ developer
 
Ошибки, которые сложно заметить на code review, но которые находятся статичес...
Ошибки, которые сложно заметить на code review, но которые находятся статичес...Ошибки, которые сложно заметить на code review, но которые находятся статичес...
Ошибки, которые сложно заметить на code review, но которые находятся статичес...
 
PVS-Studio in 2021 - Error Examples
PVS-Studio in 2021 - Error ExamplesPVS-Studio in 2021 - Error Examples
PVS-Studio in 2021 - Error Examples
 
PVS-Studio in 2021 - Feature Overview
PVS-Studio in 2021 - Feature OverviewPVS-Studio in 2021 - Feature Overview
PVS-Studio in 2021 - Feature Overview
 
PVS-Studio в 2021 - Примеры ошибок
PVS-Studio в 2021 - Примеры ошибокPVS-Studio в 2021 - Примеры ошибок
PVS-Studio в 2021 - Примеры ошибок
 
PVS-Studio в 2021
PVS-Studio в 2021PVS-Studio в 2021
PVS-Studio в 2021
 
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
 
Best Bugs from Games: Fellow Programmers' Mistakes
Best Bugs from Games: Fellow Programmers' MistakesBest Bugs from Games: Fellow Programmers' Mistakes
Best Bugs from Games: Fellow Programmers' Mistakes
 
Does static analysis need machine learning?
Does static analysis need machine learning?Does static analysis need machine learning?
Does static analysis need machine learning?
 
Typical errors in code on the example of C++, C#, and Java
Typical errors in code on the example of C++, C#, and JavaTypical errors in code on the example of C++, C#, and Java
Typical errors in code on the example of C++, C#, and Java
 
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
 
Game Engine Code Quality: Is Everything Really That Bad?
Game Engine Code Quality: Is Everything Really That Bad?Game Engine Code Quality: Is Everything Really That Bad?
Game Engine Code Quality: Is Everything Really That Bad?
 
C++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical ReviewerC++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical Reviewer
 
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source SoftwareThe Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
 
Static Code Analysis for Projects, Built on Unreal Engine
Static Code Analysis for Projects, Built on Unreal EngineStatic Code Analysis for Projects, Built on Unreal Engine
Static Code Analysis for Projects, Built on Unreal Engine
 
Safety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
Safety on the Max: How to Write Reliable C/C++ Code for Embedded SystemsSafety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
Safety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
 
The Great and Mighty C++
The Great and Mighty C++The Great and Mighty C++
The Great and Mighty C++
 
Static code analysis: what? how? why?
Static code analysis: what? how? why?Static code analysis: what? how? why?
Static code analysis: what? how? why?
 
Zero, one, two, Freddy's coming for you
Zero, one, two, Freddy's coming for youZero, one, two, Freddy's coming for you
Zero, one, two, Freddy's coming for you
 

Recently uploaded

Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Globus
 
A Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdfA Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdf
kalichargn70th171
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Natan Silnitsky
 
Graphic Design Crash Course for beginners
Graphic Design Crash Course for beginnersGraphic Design Crash Course for beginners
Graphic Design Crash Course for beginners
e20449
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Globus
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
Google
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Globus
 
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Mind IT Systems
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
Globus
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
wottaspaceseo
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
Max Andersen
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
abdulrafaychaudhry
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
Adele Miller
 
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
Tier1 app
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
NYGGS Automation Suite
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar
 
Corporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMSCorporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMS
Tendenci - The Open Source AMS (Association Management Software)
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
Globus
 

Recently uploaded (20)

Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
 
A Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdfA Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdf
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
 
Graphic Design Crash Course for beginners
Graphic Design Crash Course for beginnersGraphic Design Crash Course for beginners
Graphic Design Crash Course for beginners
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
 
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
 
Corporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMSCorporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMS
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
 

The Big Calculator Gone Crazy

  • 1. The Big Calculator Gone Crazy Author: Andrey Karpov Date: 05.09.2013 In this article I'm going to discuss a problem few people think of. Computer simulation of various processes becomes more and more widespread. This technology is wonderful because it allows us to save time and materials which would be otherwise spent on senseless chemical, biological, physical and other kinds of experiments. A computer simulation model of a wing section flow may help significantly reduce the number of prototypes to be tested in a real wind tunnel. Numerical experiments are given more and more trust nowadays. However, dazzled by the triumph of computer simulation, nobody notices the problem of software complexity growth behind it. People treat computer and computer programs just as a means to obtain necessary results. I'm worried that very few know and care about the fact that software size growth leads to a non-linear growth of the number of software bugs. It's dangerous to exploit a computer treating it just as a big calculator. So, that's what I think - I need to share this idea with other people. The Big Calculator At first I intended to entitle this article something like "If programmers can't create medicines, why can medics create programs?" Take some imaginary programmer - he is not allowed to develop and prepare medicines. The reason is obvious: he doesn't have the necessary education for that. However, it's not that simple with programming. It may seem that an imaginary medic who has learned how to program will by default be a successful and useful programmer - especially given that a more or less acceptable programming skill is much easier to master than organic chemistry and principles of medicine preparation. Here lies a trap. A computer experiment requires as much care as a real one. Lab workers are taught to wash test tubes after experiments and make sure they are sterile. But few really care about the problem of some array accidentally remaining uninitialized.
  • 2. Programmers are well aware that the more complex software is, the more complicated and subtle bugs occur in it. In other words, I'm speaking of non-linear growth of the number of bugs accompanying the code size growth. Programs performing chemical or any other scientific computations are far from being simple, aren't they? Here's where the danger is. It's OK that a medic-programmer makes mistakes. Any programmer, however skillful, makes them from time to time. What's not OK, people tend to trust these results more and more. You calculate something and go on with your business. Those engaged in programming as their professional activity know how dangerous this approach is. They know what an undefined behavior is and how a program may only pretend to work well. There are numbers of articles and books explaining how to correctly develop unit tests and ensure the correctness of computations. Such is the world of programmers. But the world of chemists/physicists/medics is not that way, I'm afraid. They never write a complex program - they simply don't think that way. They use the computer as if it were just a Big Calculator. This comparison was suggested by one of our readers. Let me quote his comment in full here, so that English-speaking readers could learn about it too, once the article is translated. I can tell you something on this subject from my own experience. Though being a professional programmer, I actually come of a family of physicists and have physics education. At the moment when I had to choose which university to enter, the call of blood was stronger than my faith in the bright future of IT. So, I entered a physics university, rather prestigious on the local scale, which in fact is a "kindergarten" supervised by a large research institute in my native city Nizhny Novgorod. People who know the subject will at once guess which research institute and which university I mean. While studying there, I quite naturally proved to be one of the best at programming (and mathematical methods of physical modeling in particular). It was at the same time I also figured out the following things: 1. Physicists tend to view the computer as a large multi-functional calculator allowing you to draw a graph of Eta versus Theta with Gamma going to infinity. As one can naturally expect, they are mainly interested in the graph itself, not the program. 2. As a consequence of the first fact, a programmer is not viewed as a profession. A programmer is just the guy who knows how to use the Big Calculator to draw the needed graph. They don't care which way it will be done - at all. Sorry, what did you say? Static analysis? Version control? Oh, come on, guys! C++ is the language of programmers; physicists write in FORTRAN! 3. As a consequence of the previous fact, anyone who's going to devote his life to writing programs to do physical modeling, even all-purpose ones, even tough as hell ones, is but an appendix to the Big Calculator. He's not even a person - just a kind of... By the way, it was not only me treated in such a way by physicists (I was just an ordinary student, after all) - but even the best computer modeling specialist in the research institute who taught a computational methods course at our university and who, when I turned to him as my thesis adviser while writing my term paper, said to me almost straightforward, "They will despise you, so be prepared to tolerate that". I didn't want to tolerate that and after graduation left the computer modeling area for the field where programmers are not thought to be untermenschen. I hope this example will help you understand why
  • 3. initiatives like introducing static analysis even into relatively large (about 20 or 30 developers) projects on computer modeling are a hopeless job. There simply might not be a person who knows what it is. And if such a person does happen to be in the team, they'll most likely trample him because they don't need no trendy programmer frills of yours. "We've been doing without them for a hundred years - and will do for more." Here's another story for those not bored yet. My father, though being a pensioner, still works at a very large defense engineering enterprise here, in Nyzhny Novgorod (it's the largest in our city and one of the largest ones across the country; again, those who know the subject will guess it ;) ). He's been programming in FORTRAN for his entire life. He started at the time when punched cards were in use. I don't blame him for not studying C++. It was already too late for him 10 years ago - and he's still goes on pretty well. However, there are certain security precautions at this enterprise 2/3 of the staff of which are engaged in programming one way or another: 1. No Internet. At all. You need literature - you go to the library. Stack Overflow? What's that? If you need to send an e-mail, you have to submit a written request to the boss explaining whom and what for you want to send it. Only a few chosen ones can use the Internet "against a receipt". Thank God, they have an internal network at least. 2. No administration rights on your computer. Perhaps this restriction makes sense for the white-collar mass, but I can't imagine a programmer feeling satisfied with it. 3. (Doesn't relate to the subject; just an illustration.) You can't even bring a cell phone with an integrated camera (have you seen ones without it nowadays?). As a result, even young employees write code in FORTRAN, while really skilled programmers are very few. I know that for sure because I trained some guy of 25 whom my father had recommended as a promising programmer. Here's my verdict: they are stuck in the 80-s there. Even given that they have pretty good salaries, I wouldn't go there for the world. These are just two examples from the intellectual elite's life. I don't mean to discredit anyone - they do their job well enough, but my heart is bleeding as I watch which windmills my father has to fight sometimes. (Thank God, I've managed to persuade him to start using git recently.) No OOP in a million-line project, no static analysis - nothing. It's just has to do with the human's trait to be very conservative about areas which are not one's strong points. Ilja Mayzus. The original comment. The core of this story is the ideology of treating the computer as a Big Calculator. In that case, you don't need to know any more about it than its younger brother, the pocket calculator, deserves. And it's the way it is actually used in many areas. Let's digress for a while and have a look inside the world of physics. Let's see how another theory finds a confirmation. To do this, I'll again have to quote a large extract from Bryan Greene's book "The Elegant Universe: Superstrings, Hidden Dimensions, and the Quest for the Ultimate Theory" [1]:
  • 4. We all huddled around Morrison's computer in the office he and I shared. Aspinwall told Morrison how to bring his program up on the screen and showed us the precise form for the required input. Morrison appropriately formatted the results we had generated the previous night, and we were set to go. The particular calculation we were performing amounts, roughly speaking, to determining the mass of a certain particle species—a specific vibrational pattern of a string—when moving through a universe whose Calabi-Yau component we had spent all fall identifying. We hoped, in line with the strategy discussed earlier, that this mass would agree identically with a similar calculation done on the Calabi-Yau shape emerging from the space-tearing flop transition. The latter was the relatively easy calculation, and we had completed it weeks before; the answer turned out to be 3, in the particular units we were using. Since we were now doing the purported mirror calculation numerically on a computer, we expected to get something extremely close to but not exactly 3, something like 3.000001 or 2.999999, with the tiny difference arising from rounding errors. Morrison sat at the computer with his finger hovering over the enter button. With the tension mounting he said, "Here goes," and set the calculation in motion. In a couple of seconds the computer returned its answer: 8.999999. My heart sank. Could it be that space-tearing flop transitions shatter the mirror relation, likely indicating that they cannot actually occur? Almost immediately, though, we all realized that something funny must be going on. If there was a real mismatch in the physics following from the two shapes, it was extremely unlikely that the computer calculation should yield an answer so close to a whole number. If our ideas were wrong, there was no reason in the world to expect anything but a random collection of digits. We had gotten a wrong answer, but one that suggested, perhaps, that we had just made some simple arithmetic error. Aspinwall and I went to the blackboard, and in a moment we found our mistake: we had dropped a factor of 3 in the "simpler" calculation we had done weeks before; the true result was 9. The computer answer was therefore just what we wanted. Of course, the after-the-fact agreement was only marginally convincing. When you know the answer you want, it is often all too easy to figure out a way of getting it. We needed to do another example. Having already written all of the necessary computer code, this was not hard to do. We calculated another particle mass on the upper Calabi-Yau shape, being careful this time to make no errors. We found the answer: 12. Once again, we huddled around the computer and set it on its way. Seconds later it returned 11.999999. Agreement. We had shown that the supposed mirror is the mirror, and hence space-tearing flop transitions are part of the physics of string theory. At this I jumped out of my chair and ran an unrestrained victory lap around the office. Morrison beamed from behind the computer. Aspinwall's reaction, though, was rather different. "That's great, but I knew it would work," he calmly said. "And where's my beer?" I truly believe they are geniuses. But let's imagine for a moment that it were some ordinary students who used this approach to calculate an integral. I don't think programmers would take it seriously then. And what if the program generated 3 right away? The bug would be taken as the final evidence? I think it would clear up later, during a re-check by themselves or their scientist colleagues. Still, the "ideal spherical programmer in vacuum" is scared to death by this fact. This is how things are in reality. It's not only personal computers used in such a way - it's also cluster systems exploited for scientific computations. And what's most scary, people trust the results produced by
  • 5. programs. In the future we are going to deal with even more computations of this kind, and the price of having software bugs will become steeper too. Isn't it time to change something? Yes, nobody can forbid me to stick a plaster on a cut myself; I guess I can recommend some medicine to take when you've caught a cold. But not more than that. I cannot drill a tooth or write out a prescription. Don't you find it reasonable that developers creating a software system whose responsibility extends beyond certain scope should also confirm their skill? I know there exist various certifications. But I'm talking of a different thing now. Certification is intended to ensure that program code conforms to certain standards. It partially prevents slopwork, in an indirect way. But the range of areas where certification is a strict requirement is quite narrow. It obviously does not cover the whole set of areas and situations where careless use of the Big Calculator may do much harm. Example of the Danger I guess many of you find my worries too abstract. That's why I suggest examining a few real-life examples. There is the open-source package Trans-Proteomic Pipeline (TPP) designed to solve various tasks in biology. No doubt, it is used - by its developers and perhaps some third-party organizations. I believe that any bug in it is already a potential issue. And does it have bugs? Yes, it does; and still more are appearing. We checked this project one year ago and reported it in the blog-post "Analysis of the Trans-Proteomic Pipeline (TPP) project". What has changed since then? Nothing. The project is going on to develop and accumulate new bugs. The Big Calculator ideology has won. The developers are not writing a high-quality project with the minimum number of bugs possible. They simply solve their tasks; otherwise they would have reacted in some way to the last year article and considered introducing some static analysis tools. I don't mean they must necessarily choose PVS-Studio; there are numbers of other static code analyzers. The point is that their responsible application goes on collecting most trivial bugs. Let's see what fresh ones they've got. 1. Some bungler goes on writing incorrect loops In the previous article I mentioned incorrect loop conditions. The new package version has them too. double SpectraSTPeakList::calcDot(SpectraSTPeakList* other) { .... for (i = this->m_bins->begin(), j = other->m_bins->begin(); i != this->m_bins->end(), j != other->m_bins->end(); i++, j++) { d = (*i) * (*j); dot += d; }
  • 6. .... } PVS-Studio's diagnostic message: V521 Such expressions using the ',' operator are dangerous. Make sure the expression is correct. spectrastpeaklist.cpp 504 In the check "i != this->m_bins->end(), j != other->m_bins->end()", the expression before the comma does not check anything. The ',' operator is used to execute expressions both to the right and to the left of it in the left-to-right order and return the value of the right expression. This is what the correct check should look like: i != this->m_bins->end() && j != other->m_bins->end() The same defect can be also found in the following fragments: • spectrastpeaklist.cpp 516 • spectrastpeaklist.cpp 529 • spectrastpeaklist.cpp 592 • spectrastpeaklist.cpp 608 • spectrastpeaklist.cpp 625 • spectrastpeaklist.cpp 696 2. Null pointer dereferencing This bug won't lead to outputting incorrect computation results - it will cause a crash instead, which is much better. However, it would be strange not to mention these bugs. void ASAPRatio_getDataStrctRatio(dataStrct *data, ....) { .... int *outliers, *pepIndx=NULL; .... //pepIndx doesn't change .... if(data->dataCnts[i] == 1 && pepIndx[i] == 0) data->dataCnts[i] = 0; .... } PVS-Studio's diagnostic message: V522 Dereferencing of the null pointer 'pepIndx' might take place. asapcgidisplay2main.cxx 534
  • 7. The same defect can be also found in the following fragments: • Pointer 'peptides'. asapcgidisplay2main.cxx 556 • Pointer 'peptides'. asapcgidisplay2main.cxx 557 • Pointer 'peptides'. asapcgidisplay2main.cxx 558 • Pointer 'peptides'. asapcgidisplay2main.cxx 559 • Pointer 'peptides'. asapcgidisplay2main.cxx 560 • Pointer 'pepIndx'. asapcgidisplay2main.cxx 569 3. Uncleared arrays static void clearTagNames() { std::vector<const char *>ptrs; for (tagname_set::iterator i = tagnames.begin(); i!=tagnames.end();i++) { ptrs.push_back(*i); } for (tagname_set::iterator j = attrnames.begin(); j!=attrnames.end();j++) { ptrs.push_back(*j); } tagnames.empty(); attrnames.empty(); for (size_t n=ptrs.size();n--;) { delete [] (char *)(ptrs[n]); // cast away const } } In this code the analyzer has caught two uncleared arrays at once: V530 The return value of function 'empty' is required to be utilized. tag.cxx 72 V530 The return value of function 'empty' is required to be utilized. tag.cxx 73 You should call the clear() function instead of empty(). 4. Uninitialized class objects class ExperimentCycleRecord {
  • 8. public: ExperimentCycleRecord() { ExperimentCycleRecord(0,0,0,True,False); } ExperimentCycleRecord(long lExperiment, long lCycleStart, long lCycleEnd, Boolean bSingleCycle, Boolean bRangleCycle) { .... } .... } PVS-Studio's diagnostic message: V603 The object was created but it is not being used. If you wish to call constructor, 'this->ExperimentCycleRecord::ExperimentCycleRecord(....)' should be used. mascotconverter.cxx 101 The ExperimentCycleRecord() constructor does not do what it is intended to; it doesn't initialize anything. The developer might be an excellent chemist, but if he doesn't know how to use the C++ language properly, his computations using uninitialized memory aren't worth a damn. It's like using a dirty test tube. Instead of calling another constructor, the line "ExperimentCycleRecord(0,0,0,True,False);" creates a temporary object which will be destroyed after that. This error pattern is discussed in detail in the article "Wade not in unknown waters. Part one". The same defect can be also found in the following fragments: • asapratiopeptideparser.cxx 57 • asapratiopeptidecgidisplayparser.cxx 36 • cruxdiscrimfunction.cxx 36 • discrimvalmixturedistr.cxx 34 • mascotdiscrimfunction.cxx 47 • mascotscoreparser.cxx 37 • tandemdiscrimfunction.cxx 35 • tandemkscoredf.cxx 37 • tandemnativedf.cxx 37 5. Comments breaching execution logic int main(int argc, char** argv) {
  • 9. .... if (getIsInteractiveMode()) //p->writePepSHTML(); //p->printResult(); // regression test? if (testType!=NO_TEST) { TagListComparator("InterProphetParser",testType, outfilename,testFileName); .... } PVS-Studio's diagnostic message: V628 It's possible that the line was commented out improperly, thus altering the program's operation logics. interprophetmain.cxx 175 After the 'if' operator, a few lines executing some operations were commented out. As a result, the program logic changed quite differently than expected. The programmer didn't want any actions to be done after executing the condition. Instead, the 'if' operator affects the code below. As a consequence, the tests' output now depends not only on the "testType!=NO_TEST" condition, but on the "getIsInteractiveMode()" condition as well. That is, the test may not test anything. That's why I strongly recommend not to rely fully on one testing methodology alone (for example, TDD). 6. Misprints Misprints are to be found everywhere and all the time. It's not that bad if you get fewer hit points after an explosion in a game than you ought to, because of such a bug. But what do incorrect data mean when computing chemical reactions? void ASAPRatio_getProDataStrct(proDataStrct *data, char **pepBofFiles) { .... if (data->indx == -1) { data->ratio[0] = -2.; data->ratio[0] = 0.; data->inv_ratio[0] = -2.; data->inv_ratio[1] = 0.;
  • 10. return; } .... } PVS-Studio's diagnostic message: V519 The 'data->ratio[0]' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 130, 131. asapcgidisplay2main.cxx 131 One and the same variable is by mistake assigned two different values. The correct code is this one: data->ratio[0] = -2.; data->ratio[1] = 0.; This fragment was then copied-and-pasted into other parts of the program: • asapcgidisplay2main.cxx 338 • asapcgidisplay2main.cxx 465 • asapratioproteincgidisplayparser.cxx 393 • asapratioproteincgidisplayparser.cxx 518 7. Comparing signed and unsigned values Comparing signed and unsigned values properly requires some skill. Ordinary calculators don't deal with unsigned values, but the C++ language does. size_type size() const; void computeDegenWts() { .... int have_cluster = 0; .... if ( have_cluster > 0 && ppw_ref.size() - have_cluster > 0 ) .... } PVS-Studio's diagnostic message: V555 The expression 'ppw_ref.size() - have_cluster > 0' will work as 'ppw_ref.size() != have_cluster'. proteinprophet.cpp 6767 The programmer wanted the check "ppw_ref.size() > have_cluster" to be executed. But he got something very different instead.
  • 11. To make it clearer, let's assume we have the 'size_t' type which is 32-bit. Suppose the function "ppw_ref.size()" returns 10 while the variable have_cluster equals 15. The function ppw_ref.size() returns the unsigned type 'size_t'. According to the C++ rules, the right operand in the subtraction operation must also have the type 'size_t' before subtraction is executed. It's alright for now: we have 10u on the left and 15u on the right. Here goes the subtraction: 10u - 15u And here's where we get a trouble. Those very C++ rules tell us that the result of subtraction between two unsigned variables must also be unsigned. It means that 10u - 15u = FFFFFFFBu. As you know, 4294967291 is greater than 0. The Big Calculator Riot is successful. Writing a correct theoretical algorithm is only half of the job. You also need to write a correct code. A similar bug can be found in the following fragment: double SpectraSTPeakList::calcXCorr() { .... for (int tau = -75; tau <= 75; tau++) { float dot = 0.0; for (unsigned int b = 0; b < numBins; b++) { if (b + tau >= 0 && b + tau < (int)numBins) { dot += (*m_bins)[b] * theoBins[b + tau] / 10000.0; } } .... .... } PVS-Studio's diagnostic message: V547 Expression 'b + tau >= 0' is always true. Unsigned type value is always >= 0. spectrastpeaklist.cpp 2058 As you can see, the variable 'tau' takes values within the range [-75, 75]. To avoid array overrun, the check b + tau >= 0 is used. I guess you've already got the point that this check won't work. The variable 'b' has the
  • 12. 'unsigned' modifier. It means that the "b + tau" expression's result is unsigned too. And an unsigned value is always greater than or equal to 0. 8. Strange loop const char* ResidueMass::getStdModResidues(....) { .... for (rmap::const_iterator i = p.first; i != p.second; ++i) { const cResidue &r = (*i).second; if (r.m_masses[0].m_nterm) { n_term_aa_mod = true; } else if (r.m_masses[0].m_cterm) { c_term_aa_mod = true; } return r.m_residue.c_str(); } if(! strcmp(mod, "+N-formyl-met (Protein)")) { return "n"; } if (! strcmp(mod, "13C6-15N2 (K)")) { return "K"; } if (! strcmp(mod, "13C6-15N4 (R)")) { return "R"; .... } PVS-Studio's diagnostic message: V612 An unconditional 'return' within a loop. residuemass.cxx 1442 There is the 'return' operator inside the loop and it's called in any case. The loop can execute only once, after which the function terminates. It's either a misprint here or some condition is missing before the 'return' operator. 9. Rough calculations double RTCalculator::getUsedForGradientRate() { if (rts_.size() > 0)
  • 13. return used_count_ / rts_.size(); return 0.; } PVS-Studio's diagnostic message: V636 The 'used_count_ / rts_.size()' expression was implicitly casted from 'int' type to 'double' type. Consider utilizing an explicit type cast to avoid the loss of a fractional part. An example: double A = (double)(X) / Y;. rtcalculator.cxx 6406 Since the function returns values of the double type, I find it reasonable to suppose the following. When the variable 'used_count_' is assigned the value 5 and the function rts_.size() returns 7, the approximate result is 0,714. However, the function getUsedForGradientRate() will return 0 in this case. The variable 'used_count_' has the 'int' type. The rts_.size() function also returns an 'int' value. An integer division occurs, and the result is obvious: it's zero. Then zero is implicitly cast to double, but it doesn't matter at this point. To fix the defect, the code should be rewritten in the following way: return static_cast<double>(used_count_) / rts_.size(); Other defects of this kind: • cgi_pep3d_xml.cxx 3203 • cgi_pep3d_xml.cxx 3204 • asapratiopeptideparser.cxx 4108 10. Great and mighty Copy-Paste The function setPepMaxProb() contains a few large similarly looking blocks. In this fragment one can feel that specific smell of the Copy-Paste technique. Using it naturally results in an error. I had to SIGNIFICANTLY abridge the sample text. The bug is very noticeable in the abridged code, but it's almost impossible to see it in the original code. Yeah, it's an advertisement of static analysis tools in general and PVS-Studio in particular. void setPepMaxProb( bool use_nsp, bool use_fpkm, bool use_joint_probs, bool compute_spectrum_cnts ) { double prob = 0.0; double max2 = 0.0; double max3 = 0.0; double max4 = 0.0; double max5 = 0.0;
  • 14. double max6 = 0.0; double max7 = 0.0; .... if ( pep3 ) { ... if ( use_joint_probs && prob > max3 ) ... } .... if ( pep4 ) { ... if ( use_joint_probs && prob > max4 ) ... } .... if ( pep5 ) { ... if ( use_joint_probs && prob > max5 ) ... } .... if ( pep6 ) { ... if ( use_joint_probs && prob > max6 ) ... } .... if ( pep7 ) { ... if ( use_joint_probs && prob > max6 ) ... } .... } V525 The code containing the collection of similar blocks. Check items 'max3', 'max4', 'max5', 'max6', 'max6' in lines 4664, 4690, 4716, 4743, 4770. proteinprophet.cpp 4664 PVS-Studio's diagnostic message: V525 The code containing the collection of similar blocks. Check items 'max3', 'max4', 'max5', 'max6', 'max6' in lines 4664, 4690, 4716, 4743, 4770. proteinprophet.cpp 4664 Unfortunately, the V525 diagnostic produces many false positives and therefore referred to the third-level warnings. But if one overcomes one's laziness and study this class of warnings, one may find numbers of such nice bugs. 11. Pointer is not initialized sometimes int main(int argc, char** argv) { .... ramp_fileoffset_t *pScanIndex; .... if ( (pFI=rampOpenFile(mzXmlPath_.c_str()))==NULL) { .... } else {
  • 15. .... pScanIndex = readIndex(pFI, indexOffset, &iAnalysisLastScan); .... } .... if (pScanIndex != NULL) free(pScanIndex); return 0; } PVS-Studio's diagnostic message: V614 Potentially uninitialized pointer 'pScanIndex' used. sqt2xml.cxx 476 This program may crash at the end if the function rampOpenFile() returns NULL. It's not critical yet unpleasant. Here's another variable that may remain uninitialized: • Potentially uninitialized pointer 'fp_' used. dta-xml.cpp 307 12. Virtual destructor missing class DiscriminantFunction { public: DiscriminantFunction(int charge); virtual Boolean isComputable(SearchResult* result) = 0; virtual double getDiscriminantScore(SearchResult* result) = 0; virtual void error(int charge); protected: int charge_; double const_; }; // class class CometDiscrimFunction : public DiscriminantFunction;
  • 16. class CruxDiscrimFunction : public DiscriminantFunction; class InspectDiscrimFunction : public DiscriminantFunction; ..... class DiscrimValMixtureDistr : public MixtureDistr { .... DiscriminantFunction* discrim_func_; .... }; DiscrimValMixtureDistr::~DiscrimValMixtureDistr() { delete[] posinit_; delete[] neginit_; delete discrim_func_; } PVS-Studio's diagnostic message: V599 The virtual destructor is not present, although the 'DiscriminantFunction' class contains virtual functions. discrimvalmixturedistr.cxx 206 A number of classes are inherited from the DiscriminantFunction class. For example, such is the class DiscrimValMixtureDistr. Its destructor frees memory; therefore, it's very desirable that you call it. Unfortunately, the DiscriminantFunction class's destructor is not declared as a virtual one - with all the ensuing consequences. 13. Miscellaneous There are numbers of small defects which won't have serious consequences but are still not very pleasant to have in your code. There are also strange fragments, but I can't say for sure if they are incorrect. Here's one of them: Boolean MixtureModel::iterate(int counter) { .... if (done_[charge] < 0) { done_[charge]; }
  • 17. else if (priors_[charge] > 0.0) { done_[charge] += extraitrs_; } .... } PVS-Studio's diagnostic message: V607 Ownerless expression 'done_[charge]'. mixturemodel.cxx 1558 What is it? Incomplete code? Or maybe the programmer just wanted to point it out that nothing should be done if the "done_[charge] < 0" condition is true? And here you are an incorrect way of freeing memory. Any critical consequences are unlikely, but still the code smells. string Field::getText(....) { .... char* pepString = new char[peplen + 1]; .... delete pepString; .... } PVS-Studio's diagnostic message: V611 The memory was allocated using 'new T[]' operator but was released using the 'delete' operator. Consider inspecting this code. It's probably better to use 'delete [] pepString;'. pepxfield.cxx 1023 The correct way of doing this is to write "delete [] pepString". There are many other defects of this kind: • cruxdiscrimvalmixturedistr.cxx 705 • cruxdiscrimvalmixturedistr.cxx 715 • mascotdiscrimvalmixturedistr.cxx 426 • mascotdiscrimvalmixturedistr.cxx 550 • mascotdiscrimvalmixturedistr.cxx 624 • phenyxdiscrimvalmixturedistr.cxx 692 • probiddiscrimvalmixturedistr.cxx 487 • probiddiscrimvalmixturedistr.cxx 659 • tandemdiscrimvalmixturedistr.cxx 731 • tandemdiscrimvalmixturedistr.cxx 741
  • 18. And here's an incorrect implementation of the "--" operator. It doesn't seem to be used anywhere, otherwise the bug would quickly reveal itself. CharIndexedVectorIterator operator++(int) { // postincrement CharIndexedVectorIterator _Tmp = *this; ++m_itr; return (_Tmp); } CharIndexedVectorIterator& operator--() { // predecrement ++m_itr; return (*this); } PVS-Studio's diagnostic message: V524 It is odd that the body of '--' function is fully equivalent to the body of '++' function. charindexedvector.hpp 81 The operators "--" and "++" are implemented in the same way. They must have been copied-and-pasted then: • charindexedvector.hpp 87 • charindexedvector.hpp 159 • charindexedvector.hpp 165 Let's stop here. It all is not very interesting, and the article is big enough. As usual, I'm urging the developers not to limit themselves to fixing only the mentioned defects. Download and check the project with PVS-Studio yourself. I could have missed many errors. We can even grant you a free registration key for some time. Summary Unfortunately, the article has appeared a bit tangled. What did the author want to say, after all? I'll try to repeat in a very brief form my ideas I want to share with you. 1. We are currently using more and more programs to perform scientific and engineering computations and simulate various processes, and we grow to trust them. 2. Programs get very complicated. Professional programmers understand it very well that one cannot approach the task of creating a software package for computer simulation in the same way as using
  • 19. a software calculator. The growth of software complexity leads to an exponential increase of the number of errors [2]. 3. It appears that physicists/biologists/medics cannot simply calculate something in the usual manner. One cannot ignore the software complexity increase and the consequences of incorrect computations arising from imperfect knowledge of a programming language. 4. In this article I've given arguments to prove that this is the real state of things. The first quotation tells us that people tend to treat the computer as an ordinary calculator. The second quotation just reaffirms this idea. The error samples discussed after that are meant to demonstrate that people really make mistakes when treating computer simulation software in such a way. So, my anxiety has solid ground. So, what shall we do? First of all, I'd like you to realize this problem and tell your colleagues from related areas. It's been clear to programmers for a long time that the software complexity growth and silly mistakes in large projects may easily turn into a source of great harm. On the other hand, those people who treat programming and computers just as a tool don't know that and don't bother to think about it. So, we need to draw their attention to this problem. Here you are an analogy. Imagine a man who has got him a cudgel and starts hunting some animals. The cudgel in his hands gradually turns into a stone axe, then a sword, and finally a gun. But he still uses it just to stun hares by hitting them on the head. It's not only that this way of using the weapon is absolutely inefficient; it has also become much more dangerous now (he can accidentally shoot himself or his fellow men). Hunters from the "programmers" tribe quickly adapt themselves to these changes. The rest don't have time for that - they are busy hunting hares. After all, it's all about the hares. We need to tell these people that they have to learn, whether they like it or not. It'll improve everyone's life. And waving your gun around is no good. References 1. Bryan Greene "The Elegant Universe: Superstrings, Hidden Dimensions, and the Quest for the Ultimate Theory. ISBN 978-0375708114 2. Andrey Karpov. Feelings confirmed by numbers. http://www.viva64.com/en/b/0158/