PVS-Studio vs Chromium


Good has won this time. To be more exact, the source code of the Chromium project has won. Chromium is one of the best projects we have checked with PVS-Studio.

Published in: Technology

Author: Andrey Karpov
Date: 23.05.2011

Abstract

Good has won this time. To be more exact, the source code of the Chromium project has won. Chromium is one of the best projects we have checked with PVS-Studio.

Chromium is an open-source web browser developed by Google and intended to provide users with fast and safe Internet access. Chromium serves as the base for the Google Chrome browser. Moreover, Chromium is a preliminary version of Google Chrome as well as of some other alternative web browsers.

From the programming viewpoint, Chromium is a solution consisting of 473 projects. The total size of the C/C++ source code is about 460 Mbytes, and the number of lines is difficult to count.

These 460 Mbytes include a lot of various libraries. If you exclude them, about 155 Mbytes remain. That is much less, but still a lot of lines. Moreover, everything is relative, you know. Many of these libraries were created by the Chromium developers as part of the work on Chromium itself. Although such libraries live by themselves, we may still count them as part of the browser.

Chromium has become the largest and highest-quality project I have studied while testing PVS-Studio. While handling the Chromium project it was not actually clear to us what was checking what: we found and fixed several errors in PVS-Studio related to C++ file analysis and to support of a specific project structure.

Many aspects and methods used in Chromium show the quality of its source code. For instance, most programmers determine the number of items in an array using the following construct:

    int XX[] = { 1, 2, 3, 4 };
    size_t N = sizeof(XX) / sizeof(XX[0]);

Usually it is arranged as a macro of this kind:

    #define count_of(arg) (sizeof(arg) / sizeof(arg[0]))

This is quite an efficient and useful macro. To be honest, I have always used this very macro myself. However, it might lead to an error, because you may accidentally pass a simple pointer to it and it will not mind. Let me explain this with the following example:

    void Test(int C[3])
    {
      int A[3];
      int *B = Foo();
      size_t x = count_of(A); // Ok
      x = count_of(B); // Error
      x = count_of(C); // Error
    }

The count_of(A) construct works correctly and returns the number of items in the A array, which is three here.

But if you accidentally apply count_of() to a pointer, the result will be a meaningless value. The issue is that the macro will not produce any warning for the programmer about a strange construct of the count_of(B) sort. This situation seems far-fetched and artificial, but I have encountered it in various applications. For example, consider this code from the Miranda IM project:

    #define SIZEOF(X) (sizeof(X)/sizeof(X[0]))
    int Cache_GetLineText(..., LPTSTR text, int text_size, ...)
    {
      ...
      tmi.printDateTime(pdnce->hTimeZone, _T("t"), text, SIZEOF(text), 0);
      ...
    }

So, such errors may well exist in your code, and you'd better have something to protect yourself against them. It is even easier to make a mistake when trying to calculate the size of an array passed as an argument:

    void Test(int C[3])
    {
      size_t x = count_of(C); // Error
    }
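
The three cases above can be reproduced in a few lines. This is only a minimal sketch: the function names are hypothetical and chosen for illustration, and the values returned by the two incorrect variants depend on the platform's pointer and int sizes. Note also that some current compilers may warn about the last two functions, although the macro itself still compiles.

```cpp
#include <cstddef>

// The macro from the text above.
#define count_of(arg) (sizeof(arg) / sizeof(arg[0]))

// Correct use: `a` is a real array here, so the result is 3.
std::size_t CountLocalArray() {
    int a[3] = {1, 2, 3};
    return count_of(a);
}

// Incorrect use: `p` is a pointer, so this evaluates to
// sizeof(int*) / sizeof(int), whatever the pointed-to array holds.
std::size_t CountThroughPointer(int* p) {
    return count_of(p);
}

// Incorrect use: despite the [3], the parameter type is adjusted to int*,
// so the result is the same as in CountThroughPointer.
std::size_t CountArrayParameter(int c[3]) {
    return count_of(c);
}
```

On a platform with 8-byte pointers and 4-byte ints, the two incorrect variants return 2 regardless of the real array length, which is exactly the kind of silent wrong answer described above.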

According to the C++ standard, the C variable is a simple pointer, not an array. As a result, you may often see programs that process only a part of the array passed to them.

Since we have started speaking of such errors, let me tell you about a method that will help you find the size of the array passed. You should pass it by reference:

    void Test(int (&C)[3])
    {
      size_t x = count_of(C); // Ok
    }

Now the result of the count_of(C) expression is 3.

Let's return to Chromium. It uses a macro that allows you to avoid the errors described above. This is how it is implemented:

    template <typename T, size_t N>
    char (&ArraySizeHelper(T (&array)[N]))[N];
    #define arraysize(array) (sizeof(ArraySizeHelper(array)))

The idea of this magic spell is the following: the template function ArraySizeHelper receives an array of an arbitrary type with length N. The function returns a reference to an array of N char items. There is no implementation for this function because we do not need one. For the sizeof() operator it is quite enough to declare the ArraySizeHelper function. The arraysize macro calculates the size of the array of bytes returned by the ArraySizeHelper function. This size is the number of items in the array whose length we want to calculate.

If you have gone crazy because of all this, just take my word for it: it works. And it works much better than the count_of() macro we have discussed above. Since the ArraySizeHelper function takes an array by reference, you cannot pass a simple pointer to it. Let's write some test code:

    template <typename T, size_t N>
    char (&ArraySizeHelper(T (&array)[N]))[N];
    #define arraysize(array) (sizeof(ArraySizeHelper(array)))

    void Test(int C[3])
    {
      int A[3];
      int *B = Foo();
      size_t x = arraysize(A); // Ok
      x = arraysize(B); // Compilation error
      x = arraysize(C); // Compilation error
    }

The incorrect code simply will not compile. I think it's cool when you can prevent a potential error already at the compilation stage. This is a nice sample reflecting the quality of this programming approach. My respect goes to the Google developers.

Let me give you one more sample, which is of a different sort yet shows the quality of the code as well.

    if (!file_util::Delete(db_name, false) &&
        !file_util::Delete(db_name, false)) {
      // Try to delete twice. If we can't, fail.
      LOG(ERROR) << "unable to delete old TopSites file";
      return false;
    }

Many programmers might find this code strange. What is the sense in trying to remove a file twice? There is a sense. The one who wrote it has reached Enlightenment and comprehended the essence of software existence. Only in textbooks and in some abstract world can a file either be definitely removed or not be removed at all. In a real system it often happens that a file cannot be removed right now but can be removed an instant later. There may be many reasons for that: antivirus software, viruses, version control systems and whatever. Programmers often do not think of such cases. They believe that when you cannot remove a file, you cannot remove it at all. But if you want to do everything well and avoid littering directories, you should take these extraneous factors into account. I encountered quite the same situation when a file would fail to get removed once in 1000 runs. The solution was also the same, except that I placed Sleep(0) in the middle just in case.

Well, and what about the check by PVS-Studio? Chromium's code is perhaps the highest-quality code I've ever seen. This is confirmed by the low density of errors we've managed to find. If you take their total quantity, there are certainly plenty of them. But if you divide the number of errors by the amount of code, it turns out that there are almost no errors. What are these errors? They are the most ordinary ones.

Here are several samples:

V512 A call of the 'memset' function will lead to underflow of the buffer '(exploded)'. platform time_win.cc 116

    void NaCl::Time::Explode(bool is_local, Exploded* exploded) const {
      ...
      ZeroMemory(exploded, sizeof(exploded));
      ...
    }
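
The effect behind this warning can be reproduced in isolation. The sketch below is only an illustration: the field list of the structure and both function names are assumptions, and std::memset stands in for ZeroMemory. Because the argument is a pointer, sizeof(exploded) yields the size of the pointer, so only the first few bytes of the object get cleared.

```cpp
#include <cstring>

// Hypothetical stand-in for the Exploded structure; the field list is an
// assumption made only for this sketch.
struct Exploded {
    int year, month, day_of_week, day_of_month;
    int hour, minute, second, millisecond;
};

static_assert(sizeof(Exploded) > sizeof(Exploded*),
              "the sketch assumes the structure is larger than a pointer");

// Mirrors the flagged call: clears only sizeof(Exploded*) bytes.
void ClearWithPointerSize(Exploded* exploded) {
    std::memset(exploded, 0, sizeof(exploded));
}

// Clears the whole object, because sizeof is applied to the pointee.
void ClearWholeObject(Exploded* exploded) {
    std::memset(exploded, 0, sizeof(*exploded));
}
```

After ClearWithPointerSize the trailing fields still hold whatever was there before the call, which is why the diagnostic speaks of a buffer underflow.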

Everybody makes misprints. In this case, an asterisk is missing. It must be sizeof(*exploded).

V502 Perhaps the '?:' operator works in a different way than it was expected. The '?:' operator has a lower priority than the '-' operator. views custom_frame_view.cc 400

    static const int kClientEdgeThickness;
    int height() const;
    bool ShouldShowClientEdge() const;
    void CustomFrameView::PaintMaximizedFrameBorder(gfx::Canvas* canvas) {
      ...
      int edge_height = titlebar_bottom->height() -
          ShouldShowClientEdge() ? kClientEdgeThickness : 0;
      ...
    }

The insidious "?:" operator has a lower priority than subtraction. There must be additional parentheses here:

    int edge_height = titlebar_bottom->height() -
        (ShouldShowClientEdge() ? kClientEdgeThickness : 0);

A meaningless check.

V547 Expression 'count < 0' is always false. Unsigned type value is never < 0. ncdecode_tablegen ncdecode_tablegen.c 197

    static void CharAdvance(char** buffer, size_t* buffer_size,
                            size_t count) {
      if (count < 0) {
        NaClFatal("Unable to advance buffer by count!");
      } else {
      ...
    }

The "count < 0" condition is always false. The protection does not work, and some buffer might get overflowed. By the way, this is an example of how static analyzers might be used to search for vulnerabilities. An intruder can quickly find code fragments that contain errors for further thorough investigation. Here is another code sample related to the safety issue:

V511 The sizeof() operator returns size of the pointer, and not of the array, in 'sizeof (salt)' expression. common visitedlink_common.cc 84

    void MD5Update(MD5Context* context, const void* buf, size_t len);

    VisitedLinkCommon::Fingerprint
    VisitedLinkCommon::ComputeURLFingerprint(
      ...
      const uint8 salt[LINK_SALT_LENGTH])
    {
      ...
      MD5Update(&ctx, salt, sizeof(salt));
      ...
    }

The MD5Update() function will process only as many bytes as the pointer occupies. This is a potential loophole in the data encryption system, isn't it? I do not know whether it implies any danger; however, from the viewpoint of intruders, this is a fragment for thorough analysis.

The correct code should look this way:

    MD5Update(&ctx, salt, sizeof(salt[0]) * LINK_SALT_LENGTH);

Or this way:

    VisitedLinkCommon::Fingerprint
    VisitedLinkCommon::ComputeURLFingerprint(
      ...
      const uint8 (&salt)[LINK_SALT_LENGTH])
    {
      ...
      MD5Update(&ctx, salt, sizeof(salt));
      ...
    }

One more sample with a misprint:

V501 There are identical sub-expressions 'host != buzz::XmlConstants::str_empty ()' to the left and to the right of the '&&' operator. chromoting_jingle_glue iq_request.cc 248

    void JingleInfoRequest::OnResponse(const buzz::XmlElement* stanza) {
      ...
      std::string host = server->Attr(buzz::QN_JINGLE_INFO_HOST);
      std::string port_str = server->Attr(buzz::QN_JINGLE_INFO_UDP);
      if (host != buzz::STR_EMPTY && host != buzz::STR_EMPTY) {
      ...
    }

The port_str variable must actually be checked as well:

    if (host != buzz::STR_EMPTY && port_str != buzz::STR_EMPTY) {

A bit of classics:

V530 The return value of function 'empty' is required to be utilized. chrome_frame_npapi np_proxy_service.cc 293

    bool NpProxyService::GetProxyValueJSONString(std::string* output) {
      DCHECK(output);
      output->empty();
      ...
    }

It must be: output->clear();
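
The difference between the two calls is easy to confirm on a plain std::string. This is a minimal sketch with hypothetical function names: empty() only reports whether the string is empty, while clear() actually erases the contents.

```cpp
#include <string>

// Mirrors the flagged pattern: the result of empty() is thrown away and the
// string is left untouched. The cast to void only silences a possible
// unused-result warning in this sketch.
void ResetWithEmpty(std::string* output) {
    (void)output->empty();
}

// What the code was meant to do: erase the contents.
void ResetWithClear(std::string* output) {
    output->clear();
}
```

A caller that expects a fresh buffer after ResetWithEmpty would keep appending to stale data, which is why discarding the return value of empty() is almost always a misprint.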

And here is even the handling of a null pointer:

V522 Dereferencing of the null pointer 'plugin_instance' might take place. Check the logical condition. chrome_frame_npapi chrome_frame_npapi.cc 517

    bool ChromeFrameNPAPI::Invoke(...)
    {
      ChromeFrameNPAPI* plugin_instance =
        ChromeFrameInstanceFromNPObject(header);
      if (!plugin_instance && (plugin_instance->automation_client_.get()))
        return false;
      ...
    }

One more example of a check that will never work:

V547 Expression 'current_idle_time < 0' is always false. Unsigned type value is never < 0. browser idle_win.cc 23

    IdleState CalculateIdleState(unsigned int idle_threshold) {
      ...
      DWORD current_idle_time = 0;
      ...
      // Will go -ve if we have been idle for a long time (2gb seconds).
      if (current_idle_time < 0)
        current_idle_time = INT_MAX;
      ...
    }
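
The dead branch can be demonstrated with a few lines. This is a sketch under stated assumptions: std::uint32_t stands in for DWORD, int is assumed to be 32 bits wide, the function names are hypothetical, and the second function is only one guess at what the original comment intended, not the fix actually applied in Chromium.

```cpp
#include <climits>
#include <cstdint>

// Mirrors the flagged check: an unsigned value is never below zero, so the
// assignment is unreachable and the input is returned unchanged.
std::uint32_t ClampAsWritten(std::uint32_t idle_time) {
    if (idle_time < 0)
        idle_time = INT_MAX;
    return idle_time;
}

// One possible rewrite (an assumption about intent): cap every value that
// would look negative if it were reinterpreted as a signed 32-bit int.
std::uint32_t ClampAssumedIntent(std::uint32_t idle_time) {
    if (idle_time > static_cast<std::uint32_t>(INT_MAX))
        idle_time = static_cast<std::uint32_t>(INT_MAX);
    return idle_time;
}
```

Passing a value above INT_MAX shows the difference: the first function returns it as is, while the second caps it.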

Well, we should stop here. I could continue, but it's starting to get boring. Remember that all this concerns only Chromium itself. But there are also tests with errors like this:

V554 Incorrect use of auto_ptr. The memory allocated with 'new []' will be cleaned using 'delete'. interactive_ui_tests accessibility_win_browsertest.cc 306

    void AccessibleChecker::CheckAccessibleChildren(IAccessible* parent) {
      ...
      auto_ptr<VARIANT> child_array(new VARIANT[child_count]);
      ...
    }

There are also plenty of libraries Chromium is actually based on, and their total size is much larger than that of Chromium itself. They also have a lot of interesting fragments. It is clear that code containing errors might not be used anywhere, but they are errors nonetheless. Consider one of the examples (the ICU library):

V547 Expression '* string != 0 || * string != '_'' is always true. Probably the '&&' operator should be used here. icui18n ucol_sit.cpp 242

    U_CDECL_BEGIN static const char* U_CALLCONV
    _processVariableTop(...)
    {
      ...
      if(i == locElementCapacity && (*string != 0 || *string != '_')) {
        *status = U_BUFFER_OVERFLOW_ERROR;
      }
      ...
    }

The "(*string != 0 || *string != '_')" expression is always true. Perhaps it must be: (*string == 0 || *string == '_').
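
Why the condition is always true can be checked directly: no character equals both 0 and '_', so at least one of the two inequalities always holds. The sketch below uses hypothetical function names; the second function follows the alternative suggested above and the third follows the '&&' hint from the diagnostic. Which of them matches the intent of the ICU code is not something this sketch can decide.

```cpp
// As flagged: true for every possible character.
bool ConditionAsWritten(char c) {
    return c != 0 || c != '_';
}

// The alternative suggested in the text: true only for the terminator
// or an underscore.
bool ConditionWithEquality(char c) {
    return c == 0 || c == '_';
}

// The alternative hinted at by the diagnostic: true for everything except
// the terminator and an underscore.
bool ConditionWithAnd(char c) {
    return c != 0 && c != '_';
}
```

The two alternatives are exact opposites of each other, so choosing between them requires knowing what the surrounding parser expects at that point.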

Conclusion

PVS-Studio was defeated. Chromium's source code is one of the best we have ever analyzed. We have found almost nothing in Chromium. To be more exact, we have found a lot of errors, and this article demonstrates only a few of them. But if we keep in mind that all these errors are spread throughout 460 Mbytes of source code, it turns out that there are almost no errors at all.

P.S.

To answer the question of whether we will inform the Chromium developers of the errors we've found: no, we won't. It is a very large amount of work and we cannot afford to do it for free. Checking Chromium is far from checking Miranda IM or Ultimate Toolbox. This is hard work: we have to study all of the messages and decide whether there is an error in every particular case. To do that, we must be knowledgeable about the project. We will send this article to the Chromium developers, and should they find it interesting, they will be able to analyze the project themselves and study all the diagnostic messages. Yes, they will have to purchase PVS-Studio for this purpose. But any Google department can easily afford this.