The forgotten problems of 64-bit programs development


Author: Andrey Karpov
Date: 11.08.2007

Abstract

Although 64-bit systems have been under development for more than a decade, the arrival of 64-bit versions of Windows raised new problems in developing and testing applications. This article examines some mistakes related to developing 64-bit C/C++ code for Windows. It also explains why these mistakes were not covered in the articles devoted to migration tasks and why most static analyzers detect them poorly.

Introduction

The history of 64-bit programs is not new; it already spans more than a decade [1]. The first 64-bit microprocessor, the MIPS R4000, was released in 1991 [2, 3]. Since then, porting programs to 64-bit systems has been discussed in forums and articles, and so have the problems of developing 64-bit programs in C. The questions included which data model is better, what long long is, and many others. Here, for example, is an interesting collection of messages [4] from the comp.lang.c newsgroup about using the long long type in C, a topic that was itself prompted by the appearance of 64-bit systems.

C is one of the most widespread languages, and it is sensitive to changes in the bit width of data types. Because of its low-level features, a C program ported to a new system must be checked for correctness constantly. Naturally, when 64-bit systems appeared, developers all over the world again faced the problem of keeping old source code compatible with new systems. One indirect sign of how difficult program migration is, is the large number of data models that must constantly be taken into account. A data model is the correlation of the sizes of base types in a programming language. Picture 1 shows the bit widths of types in different data models, which we will refer to further on.
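Because a data model is described by the sizes of a few base types, a program can report which model it was built for. The sketch below is only an illustration: the function name data_model_name is hypothetical, the model names are the conventional ones (ILP32, LLP64, LP64, ILP64) rather than a reproduction of Picture 1, and 8-bit bytes are assumed.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: names the data model from the sizes of int, long and
// pointers. Assumes 8-bit bytes; any other combination is reported as "other".
std::string data_model_name() {
    const std::size_t i = sizeof(int);
    const std::size_t l = sizeof(long);
    const std::size_t p = sizeof(void*);
    if (i == 4 && l == 4 && p == 4) return "ILP32";  // typical 32-bit systems
    if (i == 4 && l == 4 && p == 8) return "LLP64";  // 64-bit Windows
    if (i == 4 && l == 8 && p == 8) return "LP64";   // most 64-bit Unix-like systems
    if (i == 8 && l == 8 && p == 8) return "ILP64";
    return "other";
}
```

The only difference between LLP64 and LP64 in this check is the size of long, which is why code that stores pointer-sized values in long behaves differently on 64-bit Windows and on 64-bit Unix-like systems.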
Picture 1. Data Models.

Existing Publications and Tools in the Sphere of Verification of 64-bit Applications

Of course, this was not the first change of bit width. It is enough to recall the transition from 16-bit to 32-bit systems. Naturally, the experience gained then served the migration to 64-bit systems well.

But the migration to 64-bit systems had its own peculiarities, which gave rise to a number of studies and publications on these problems, for example [5, 6, 7].

The authors of that period pointed out errors of the following kinds:

1. Packing pointers into types of a smaller bit width. For example, placing a pointer into the int type on a system with the LP64 data model truncates the pointer value and makes it impossible to use later.
2. Using magic numbers. The danger lies in using numbers such as 4, 32, 0x80000000 and some others instead of special constants or the sizeof() operator.
3. Some shift operations that do not take into account the increased bit width of some types.
4. Using incorrect unions or structures that do not take into account alignment on systems of different bit width.
5. Incorrect usage of bit fields.
6. Some arithmetic expressions. For example:

    int x = 100000, y = 100000, z = 100000;
    long long s = x * y * x;

Some other, rarer mistakes were also considered, but the main ones are mentioned in the list.
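The arithmetic example in item 6 can be made concrete. In x * y * x all operands are int, so with a 32-bit int the product is evaluated in int and overflows before the result is converted to long long. A minimal sketch of one possible fix follows; the function name product_widened is hypothetical, and widening the first operand is just one way to force 64-bit arithmetic.

```cpp
// Hypothetical helper: multiplies three int values without intermediate
// overflow. Converting the first operand to long long makes the usual
// arithmetic conversions promote the remaining operands as well, so every
// multiplication is carried out in long long.
long long product_widened(int a, int b, int c) {
    return static_cast<long long>(a) * b * c;
}
```

With the values from the example, product_widened(100000, 100000, 100000) yields 1000000000000000, which fits comfortably in long long.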
Studies of 64-bit code verification led to solutions that diagnose dangerous constructions. For example, such verification was implemented in the Gimpel Software PC-Lint and Parasoft C++test static analyzers.

The following question arises: if 64-bit systems have existed for such a long time, along with articles on this problem and even tools that check code for dangerous constructions, should we return to this problem?

Unfortunately, yes, we should. The reason is the progress of information technology, and the urgency of the question comes from the rapid spread of 64-bit versions of Windows.

The existing information and tools in the field of 64-bit development have gone out of date and need fundamental revision. You may object that there are many modern articles (2005-2007) on the Internet about developing 64-bit applications in C/C++. Unfortunately, they turn out to be no more than retellings of older articles applied to the new 64-bit Windows, without taking into account its peculiarities and the changes in technology.

The Untouched Problems of 64-bit Programs Development

Let us start at the beginning. The authors of some articles do not take into account the large amount of memory that has become available to modern applications. Of course, pointers were already 64-bit back then, but such programs had no chance to use arrays several gigabytes in size. As a result, both old and new articles overlook a whole class of errors related to incorrect indexing of big arrays.
It is practically impossible to find a description of an error like the following:

    for (int x = 0; x != width; ++x)
      for (int y = 0; y != height; ++y)
        for (int z = 0; z != depth; ++z)
          BigArray[z * width * height + y * width + x] = InitValue;

In this example the expression "z * width * height + y * width + x", which is used for addressing, has the int type, which means that the code will be incorrect if the array contains more than about two billion elements (the range of a 32-bit int). On 64-bit systems one should use types like ptrdiff_t and size_t for safer indexing of large arrays. The absence of such errors from the articles is easy to explain. When the articles were written, machines with enough memory to store such arrays were practically unavailable. Now this is a common programming task, and we can watch with great surprise how code that has served faithfully for many years stops working correctly when it deals with big data arrays on 64-bit systems.

The other class of problems that has not been touched consists of errors related to the capabilities and peculiarities of the C++ language. It is also easy to explain why this happened. When the first 64-bit systems were introduced, C++ either did not exist for them or was not widespread. That is why practically all the articles deal with problems of the C language. Modern authors replaced "C" with "C/C++" but added nothing new.

But the absence of C++-specific errors from the articles does not mean that they do not exist. There are errors that show up when programs are migrated to 64-bit systems. They are related to virtual functions, exceptions, overloaded functions and so on. You can read about such mistakes in more detail in the article [8]. Let us give an example related to the use of virtual functions.

    class CWinApp {
      ...
      virtual void WinHelp(DWORD_PTR dwData, UINT nCmd);
    };
    class CSampleApp : public CWinApp {
      ...
      virtual void WinHelp(DWORD dwData, UINT nCmd);
    };

Let us follow the life cycle of a certain application. Suppose it was first developed in Microsoft Visual C++ 6.0, when the WinHelp function in the CWinApp class had the following prototype:

    virtual void WinHelp(DWORD dwData, UINT nCmd = HELP_CONTEXT);

It was correct to override the virtual function in the CSampleApp class as shown in the example. Then the project was ported to Microsoft Visual C++ 2005, where the prototype of the function in the CWinApp class was changed so that the DWORD type became DWORD_PTR. The program will continue to work correctly on a 32-bit system, because the DWORD and DWORD_PTR types coincide there. The problem shows up when the code is compiled for a 64-bit platform. The result is two functions with the same name but different parameters, so the user's code will never be called.

Besides the peculiarities of 64-bit development from the point of view of the C++ language, there are other points that deserve attention, for example the peculiarities of the architecture of 64-bit versions of Windows. We would like to make developers aware of the possible problems and to recommend paying more attention to testing 64-bit software.

Now let us get back to verifying source code with static analyzers. I think you have already guessed that things are not as good here as they may seem. Despite the declared support for diagnosing the peculiarities of 64-bit code, this support does not currently meet the necessary requirements.
The diagnostic rules of these analyzers were written from the very articles that take into account neither the peculiarities of the C++ language nor the processing of data arrays larger than 2 GB.

For Windows developers the case is somewhat worse. The main static analyzers are designed to diagnose 64-bit errors for the LP64 data model, while Windows uses the LLP64 data model [10]. The reason is that 64-bit versions of Windows are young, and older 64-bit systems were Unix-like systems with the LP64 data model.

As an example, let us consider the diagnostic message 3264bit_IntToLongPointerCast (port-10), which is generated by the Parasoft C++test analyzer:

    int *intPointer;
    long *longPointer;
    longPointer = (long *)intPointer; //-ERR port-10

C++test assumes that this construction is incorrect from the point of view of the LP64 model. But within the data model adopted in Windows this construction is safe, since int and long are both 32 bits wide in LLP64.

Recommendations on Verification of 64-bit Programs

OK, you will say, the problems of 64-bit programs are pressing. But how does one detect all the errors?

It is impossible to give an exhaustive answer, but it is quite possible to give a number of recommendations that help make the migration to 64-bit systems safe and provide the necessary level of reliability.

• Introduce the following articles to your colleagues who deal with 64-bit application development: [7, 8, 9, 10, 11, 12, 13, 14, 15].
• Introduce your colleagues to the methodology of static code analysis [16, 17, 18]. Static code verification is one of the best ways of detecting errors of this type. It makes it possible to check even those parts of the code whose behavior at large data volumes is difficult to model, for example with unit tests.
• It will be useful for developers to get acquainted with such static analyzers as Parasoft C++test, Gimpel Software PC-lint and Abraxas Software CodeCheck.
• For developers of Windows applications it will be especially useful to get acquainted with the specialized static analyzer Viva64, designed for the LLP64 data model [19].
• Upgrade the unit-testing system so that the test set includes processing of large arrays. You can find more detailed information about why this is necessary at large data volumes, and learn how to organize the testing better, in [9].
• Test the ported code manually on real, difficult tasks that use the capabilities of 64-bit systems. The change of architecture is too considerable to rely completely on automated testing systems.

References

1. John R. Mashey. The Long Road to 64 Bits.
2. Wikipedia: MIPS architecture.
3. John R. Mashey. 64 bit processors: history and rationale.
4. John R. Mashey. The 64-bit integer type "long long": arguments and history.
5. 64-bit and Data Size Neutrality.
6. 64-Bit Programming Models: Why LP64?
7. Hewlett-Packard. Transitioning C and C++ programs to the 64-bit data model.
8. Andrey Karpov, Evgeniy Ryzhkov. 20 issues of porting C++ code on the 64-bit platform.
9. Andrey Karpov, Evgeniy Ryzhkov. Problems of testing 64-bit applications.
10. The Old New Thing: Why did the Win64 team choose the LLP64 model?
11. Brad Martin, Anita Rettinger, and Jasmit Singh. Multiplatform Porting to 64 Bits.
12. Migrating 32-bit Managed Code to 64-bit.
13. Matt Pietrek. Everything You Need To Know To Start Programming 64-Bit Windows Systems.
14. Microsoft Game Technology Group. 64-bit programming for Game Developers.
15. John Paul Mueller. 24 Considerations for Moving Your Application to a 64-bit Platform.
16. Wikipedia: Static code analysis.
17. Sergei Sokolov. Bulletproofing C++ Code.
18. Walter W. Schilling, Jr. and Mansoor Alam. Integrate Static Analysis Into a Software Development Process.
19. Evgeniy Ryzhkov. Viva64: what is it and for whom is it meant?