Software diseases: memset
Upcoming SlideShare
Loading in...5
×
 

Software diseases: memset

on

  • 149 views

While analyzing the source codes of various programs I can't help creating associations that each program has a tendency to certain diseases. In many projects you can easily make out patterns of ...

While analyzing the source codes of various programs I can't help creating associations that each program has a tendency to certain diseases. In many projects you can easily make out patterns of incorrect code that can be found in different project files. In some programs these are Copy-Paste errors, while in others it's "unsigned_integer < 0"-like checks. Each project has its own disease. The sore of the next project (called MAME) we have checked is the memset() function.

Statistics

Views

Total Views
149
Views on SlideShare
149
Embed Views
0

Actions

Likes
0
Downloads
0
Comments
0

0 Embeds 0

No embeds

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

Software diseases: memset Software diseases: memset Document Transcript

  • Software diseases: memset Author: Andrey Karpov Date: 23.07.2012 While analyzing the source codes of various programs I can't help creating associations that each program has a tendency to certain diseases. In many projects you can easily make out patterns of incorrect code that can be found in different project files. In some programs these are Copy-Paste errors, while in others it's "unsigned_integer < 0"-like checks. Each project has its own disease. The sore of the next project (called MAME) we have checked is the memset() function. MAME is an emulator application designed to recreate the hardware of arcade game systems in software to preserve gaming history by preventing vintage games from being lost or forgotten [1]. Although almost all the project files have the ".c" extension, MAME is actually a C++ project. The source code's size is rather large - 110 Mbytes. Checking MAME with PVS-Studio was impossible before because it is built with MinGW on Windows. MinGW is a native software port of the GNU Compiler Collection (GCC) under Microsoft Windows [2]. It means that PVS-Studio has to provide correct support of the special features of the GCC syntax and special keywords. Support of MinGW has been available in PVS-Studio since version 4.70. It is not full yet, but it is enough to check most projects. MAME was one of the first projects to be analyzed. Note. While performing the analysis, there will be a lot of similar false reports. The odd code fragments are located in several macros widely used in various project parts. It seems at first that there are only false positives - scattered useful messages just get lost among them. However, you can easily fix it by adding just a few comments to suppress the warnings triggered by the macros. See the "Suppression of false alarms" section in the documentation to find out how to do that. Now let's study the errors we have detected. Incompletely cleared arrays As we have already said, you can find a lot of fragments in the MAME project where the memset function is used incorrectly. A typical mistake is filling only a part of an array. Consider a simple example: UINT32 m_pstars_regs[16]; static DRIVER_INIT( pstar ) { ... memset(state->m_pstars_regs, 0, 16); ... }
  • PVS-Studio: V512 A call of the 'memset' function will lead to underflow of the buffer 'state- >m_pstars_regs'. pgm.c 4458 Number 16 means the number of items in the "m_pstars_regs" array. But it is the number of bytes being filled in the buffer that should be passed into the memset function. As a result, only a part of the array is filled with zeroes. This is the correct code: memset(state->m_pstars_regs, 0, 16 * sizeof(UINT32)); The mistake is trivial. Programmers often think that there are few trivial errors in their programs (see the second myth [3]). It's not so. It is very simple and silly mistakes that make up the largest part of errors found in programs. Do you think the error shown above is a single one? No. Here you are at least 8 other fragments where instances of the same mistake can be found: • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_kb_regs'. pgm.c 4975 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_kb_regs'. pgm.c 4996 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_kb_regs'. pgm.c 5056 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state- >m_oldsplus_ram'. pgm.c 5780 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state- >m_oldsplus_regs'. pgm.c 5781 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_sysreg'. rungun.c 399 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_ttl_vram'. rungun.c 400 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state- >m_playfield_code'. malzak.c 392 In the example above, the number of items was defined by an absolute number. It is bad. You'd better calculate the array size instead of using constants. Unfortunately, it doesn't help to avoid the error we're discussing. UINT16 m_control_0[8]; #define ARRAY_LENGTH(x) (sizeof(x) / sizeof(x[0])) static MACHINE_RESET( tumbleb ) { ... memset(state->m_control_0, 0,
  • ARRAY_LENGTH(state->m_control_0)); } PVS-Studio: V512 A call of the 'memset' function will lead to underflow of the buffer 'state- >m_control_0'. tumbleb.c 2065 The ARRAY_LENGTH macro is used to calculate the number of array items. Again, it's incorrect. The programmer should have calculated the array size, not number of items it contains. There are two ways to fix it. The first: memset(state->m_control_0, 0, sizeof(state->m_control_0)); The second: memset(state->m_control_0, 0, ARRAY_LENGTH(state->m_control_0) * sizeof(UINT16)); These are some other fragments where arrays are not filled correctly in the same way: • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_pmac_read'. megadriv.c 7156 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_pmac_write'. megadriv.c 7157 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state- >m_cart_is_genesis'. megatech.c 426 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_vol_ctrl'. nycaptor.c 841 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_rotate_ctrl'. wgp.c 949 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_vreg'. othldrby.c 237 Misfortunes with the memset() function are over here, though I may have missed some mistakes. But it's time for another, as scary, function memcpy(). Incorrect use of the memcpy() function Let's look at a code that causes an array overrun: #define CHD_SHA1_BYTES 20 #define CHD_V4_HEADER_SIZE 108 #define CHD_MAX_HEADER_SIZE CHD_V4_HEADER_SIZE
  • static chd_error header_read(...., chd_header *header) { UINT8 rawheader[CHD_MAX_HEADER_SIZE]; ... memcpy(header->parentsha1, &rawheader[100], CHD_SHA1_BYTES); ... } PVS-Studio: V512 A call of the 'memcpy' function will lead to the '& rawheader[100]' buffer becoming out of range. chd.c 1870 The 'rawheader' array consists of 108 bytes. We want to copy its contents from byte 100 on. The trouble is we will reach outside the array boundaries. We can copy only 8 bytes, yet 20 bytes are actually copied. Unfortunately, I don't know how to fix this code, since I'm not familiar with the program logic. When using the memset() function, it often happens that only a part of an array is filled. Correspondingly, when you use the memset() function, there may often be errors causing only a part of an array to be copied. Consider the following sample: UINT16 m_spriteram16[0x1000]; UINT16 m_spriteram16_buffered[0x1000]; static WRITE32_HANDLER( deco32_buffer_spriteram_w ) { deco32_state *state = space->machine().driver_data<deco32_state>(); memcpy(state->m_spriteram16_buffered, state->m_spriteram16, 0x1000); } PVS-Studio: V512 A call of the 'memcpy' function will lead to underflow of the buffer 'state- >m_spriteram16_buffered'. deco32.c 706 That's a small function. But it has an error. I think you have already guessed that multiplication by sizeof(UINT16) is missing. This is the correct code:
  • memcpy(state->m_spriteram16_buffered, state->m_spriteram16, 0x1000 * sizeof(UINT16)); The same error here: V512 A call of the 'memcpy' function will lead to underflow of the buffer 'state- >m_spriteram16_2_buffered'. deco32.c 726 Misprints and Copy-Paste In any project you can see misprints and errors caused by using the Copy-Paste technology. There are few of them in some projects and quite many in others. In MAME these errors are not numerous, yet they are there. Let's study some of them. static WRITE8_HANDLER( tms70x0_pf_w ) { ... if( ((cpustate->pf[0x03] & 0x80) == 0) && ((data & 0x80) == 0x80 ) ) { ... } else if( ((data & 0x80) == 0x80 ) && ((cpustate->pf[0x03] & 0x80) == 0) ) { ... } ... } PVS-Studio: V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 577, 584. tms7000.c 577 If you look close, you will notice that the first and the second conditions are identical. They have different order of comparisons, but it doesn't influence the result in any way. Consider the following example. class device_debug
  • { device_disasm_interface *m_disasm; ... int min_opcode_bytes() const { return (m_disasm != NULL) ? m_disasm->max_opcode_bytes() : 1; } int max_opcode_bytes() const { return (m_disasm != NULL) ? m_disasm->max_opcode_bytes() : 1; } } PVS-Studio: V524 It is odd that the body of 'max_opcode_bytes' function is fully equivalent to the body of 'min_opcode_bytes' function (debugcpu.h, line 150). debugcpu.h 151 The max_opcode_bytes() function is identical to the min_opcode_bytes() function. This is most likely incorrect. I suppose that the min_opcode_bytes() function was intended to be written as follows: int min_opcode_bytes() const { return (m_disasm != NULL) ? m_disasm->min_opcode_bytes() : 1; } Here are some other code fragments which are most probably misprints: • V583 The '?:' operator, regardless of its conditional expression, always returns one and the same value: ",(%d,". 9900dasm.c 670 • V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 549, 579. cdrom.c 549 • V501 There are identical sub-expressions 'offset != (0x370 >> 1)' to the left and to the right of the '&&' operator. decoprot.c 118 • V501 There are identical sub-expressions 'offset != (0x3c0 >> 1)' to the left and to the right of the '&&' operator. decoprot.c 118 • V501 There are identical sub-expressions 'offset != 0x2c / 2' to the left and to the right of the '&&' operator. decoprot.c 240
  • • V501 There are identical sub-expressions 'offset != 0xe' to the left and to the right of the '&&' operator. decoprot.c 447 Undefined behavior Quite a number of warnings generated by PVS-Studio for this project refer to shift operations. These operations lead to undefined behavior. Of course, when you use particular compilers, your code can work properly for many years. That's why we can call these errors potential. They may reveal themselves when moving to a different platform, compilers or optimization switches. To learn more about it, please see the article: "Wade not in unknown waters. Part three." [4]. Consider a couple of samples causing undefined behavior. The first sample: #define ATARIRLE_PRIORITY_SHIFT 12 #define ATARIRLE_PRIORITY_MASK ((~0 << ATARIRLE_PRIORITY_SHIFT) & 0xffff) PVS-Studio: V610 Undefined behavior. Check the shift operator '<<. The left operand '~0' is negative. atarig42.c 220 Any code fragment using the ATARIRLE_PRIORITY_MASK macro leads to undefined behavior. Negative numbers cannot be shifted. This macro should be rewritten in the following way: #define ATARIRLE_PRIORITY_MASK ((~(0u) << ATARIRLE_PRIORITY_SHIFT) & 0xffff) Here is another, larger, sample: UINT32 m_color1_mask; #define ARRAY_LENGTH(x) (sizeof(x) / sizeof(x[0])) PALETTE_INIT( montecar ) { static const UINT8 colortable_source[] = { 0x00, 0x00, 0x00, 0x01, 0x00, 0x02, 0x00, 0x03, 0x03, 0x03, 0x03, 0x02, 0x03, 0x01, 0x03, 0x00, 0x00, 0x00, 0x02, 0x00, 0x02, 0x01, 0x02, 0x02, 0x00, 0x10, 0x20, 0x30, 0x00, 0x04, 0x08, 0x0c, 0x00, 0x44, 0x48, 0x4c, 0x00, 0x84, 0x88, 0x8c, 0x00, 0xc4, 0xc8, 0xcc
  • }; ... for (i = 0; i < ARRAY_LENGTH(colortable_source); i++) { UINT8 color = colortable_source[i]; if (color == 1) state->m_color1_mask |= 1 << i; ... } ... } PVS-Studio: V610 Undefined behavior. Check the shift operator '<<. The right operand ('i' = [0..43]) is greater than or equal to the length in bits of the promoted left operand. firetrk.c 111 The 'colortable_source' array contains 44 items. Therefore, the 'i' loop counter takes values from 0 to 43. Number '1' has the int type - it cannot be shifted more than by 31 bits. If you shift it by more bits, it will cause undefined behavior according to the language standard. Since there are quite few warnings related to shifts, we won't cite them in the article. You can look through the list of these messages in the text file: mame-shift-ub.txt. Other errors Besides the functions memset() and memcpy(), there is memcmp() I've almost forgotten about. This function is from the same gang. Fortunately, I've found only one error related to use of this function in MAME. static const char *apr_magic = "ACT Apricot disk imagex1ax04"; FLOPPY_IDENTIFY( apridisk_identify ) { UINT8 header[APR_HEADER_SIZE]; floppy_image_read(floppy, &header, 0, sizeof(header)); if (memcmp(header, apr_magic, sizeof(apr_magic)) == 0) ... }
  • PVS-Studio: V579 The memcmp function receives the pointer and its size as arguments. It is possibly a mistake. Inspect the third argument. apridisk.c 128 The sizeof() operator calculates the pointer size instead of the number of bytes in a string. As a result, only the first several bytes are compared. We can fix it by defining the 'apr_magic' variable as an array: static const char apr_magic[] = "ACT Apricot disk imagex1ax04"; This is an example of an expression which is always true: int m_led_extender; #define CARD_A 1 #define NO_EXTENDER 0 static WRITE8_DEVICE_HANDLER( pia_ic5_porta_w ) { ... else if ((state->m_led_extender != CARD_A)|| (state->m_led_extender != NO_EXTENDER)) ... } PVS-Studio: V547 Expression is always true. Probably the '&&' operator should be used here. mpu4.c 934 The "X != 1 || X != 0" condition is always true. Most probably, the '&&' operator should be written instead of the '||' operator. Use of a pointer before a check. I will cite only one example of this. I saw other V595 messages too but didn't note them down. In many cases the code works well, as the pointer never equals zero in those fragments. Here is an example of odd code: static void stv_vdp2_drawgfxzoom(..., const gfx_element *gfx, ...) { ... if (gfx->pen_usage && transparency == STV_TRANSPARENCY_PEN)
  • { ... } if( gfx ) { ... } ... } PVS-Studio: V595 The 'gfx' pointer was utilized before it was verified against nullptr. Check lines: 2457, 2483. stvvdp2.c 2457 At times I come across some strange code about which I cannot tell for sure if it has an error or not. Maybe there is a Copy-Paste mistake. And maybe everything is correct and the two code branches are actually intended to be identical. Here is an example: static DEVICE_START( deco16ic ) { ... if (intf->split) deco16ic->pf2_tilemap_16x16 = tilemap_create_device(device, get_pf2_tile_info, deco16_scan_rows, 16, 16, fullwidth ? 64 : 32, fullheight ? 64 : 32); else deco16ic->pf2_tilemap_16x16 = tilemap_create_device(device, get_pf2_tile_info, deco16_scan_rows, 16, 16, fullwidth ? 64 : 32, fullheight ? 64 : 32); ... }
  • PVS-Studio: V523 The 'then' statement is equivalent to the 'else' statement. deco16ic.c 943 Regardless of the condition, one and the same action is performed. Here is another similar sample: int compute_res_net(int inputs, int channel, const res_net_info *di) { ... if (OpenCol) { rTotal += 1.0 / di->rgb[channel].R[i]; v += vOL / di->rgb[channel].R[i]; } else { rTotal += 1.0 / di->rgb[channel].R[i]; v += vOL / di->rgb[channel].R[i]; } ... } PVS-Studio: V523 The 'then' statement is equivalent to the 'else' statement. resnet.c 628 Conclusion As usual, I will stress that these are probably quite not all the errors that PVS-Studio can find in MAME. The task of this article is to show that PVS-Studio is learning to check crossplatform projects. To know how exactly you can integrate into the make-file, please see the documentation. You may also ask us if you have any troubles analyzing projects built with MinGW. P.S. Review of analysis results currently implies that you need the Visual Studio environment where you can open the report and study it. Manual analysis of the report is very effortful. Maybe we will make a special tool in future that will allow you to conveniently review the report and perform code navigation without having Visual Studio installed. References 1. Wikipedia. MAME. http://www.viva64.com/go.php?url=900 2. Wikipedia. MinGW. http://www.viva64.com/go.php?url=901 3. Myths about static analysis. The second myth - expert developers do not make silly mistakes. http://www.viva64.com/en/b/0116/ 4. Wade not in unknown waters. Part three. http://www.viva64.com/en/b/0142/