Interesting Observations (7 Sins of Programmers); The compiler is to blame; Archeological strata; The last line effect; Programmers are the smartest; Security, security! But do you test it?; You can’t know everything; Seeking a silver bullet.
3. A few words about the speaker
• Andrey Nikolaevich Karpov, candidate of physical
and mathematical sciences
• CTO at OOO «Program Verification Systems»
• Microsoft MVP for Visual C++
• Intel Black Belt Software Developer
• One of the PVS-Studio project founders
(a static code analyzer for C/C++).
www.viva64.com
4. 212 open-source and a few proprietary
projects
• CoreCLR
• LibreOffice
• Qt
• Chromium
• Tor
• Linux kernel
• Oracle VM VirtualBox
• Wine
• TortoiseGit
• PostgreSQL
• Firefox
• Clang
• Haiku OS
• Tesseract
• Unreal Engine
• Scilab
• Miranda NG
• ….
6. Interesting Observations
(7 Sins of Programmers)
1. The compiler is to blame
2. Archeological strata
3. The last line effect
4. Programmers are the smartest
5. Security, security! But do you test it?
6. You can’t know everything
7. Seeking a silver bullet
7. Observation No. 1
• Programmers sometimes can’t resist the urge to blame the compiler
for their own mistakes.
8. «The Compiler Is to Blame for Everything»
Ffdshow
TprintPrefs::TprintPrefs(....)
{
memset(this, 0, sizeof(this)); // This doesn't seem to
// help after optimization.
dx = dy = 0;
isOSD = false;
xpos = ypos = 0;
align = 0;
linespacing = 0;
sizeDx = 0;
sizeDy = 0;
...
}
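The comment in the fragment blames the optimizer, but the call passes sizeof(this), which is the size of a pointer, so only the first few bytes of the object are zeroed. A minimal sketch of the difference, using a hypothetical Prefs struct rather than the real ffdshow class:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the real class; it is trivially copyable,
// so clearing it with memset is permitted.
struct Prefs {
    int dx, dy, xpos, ypos, align, linespacing, sizeDx, sizeDy;

    // What the buggy call passes to memset: the size of the pointer `this`.
    std::size_t buggySize() const { return sizeof(this); }

    // What was intended: the size of the object itself.
    std::size_t intendedSize() const { return sizeof(*this); }

    // Clears the whole object.
    void clear() { std::memset(this, 0, sizeof(*this)); }
};
```

With eight int fields, intendedSize() is several times larger than buggySize() on common platforms, which explains why the author had to assign every field by hand afterwards.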
9. Observation No. 2
• Source code sometimes bears traces of large-scale modifications
that introduced hidden bugs
• Replacement: char → TCHAR / wchar_t
• Replacement: malloc → new
• Migration: 32-bit → 64-bit
10. char → TCHAR / wchar_t
WinMerge
int iconvert_new(LPCTSTR source, .....)
{
LPTSTR dest = (LPTSTR) malloc(_tcslen (source) + 1 + 10);
int result = -3;
if (dest)
{
_tcscpy (dest, source);
....
}
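In the fragment, malloc() receives a character count, which was also a byte count while the code used char. After the switch to TCHAR the buffer is too small in Unicode builds. A sketch of the corrected allocation, assuming wchar_t as the character type and a hypothetical helper name duplicateWide:

```cpp
#include <cstddef>
#include <cstdlib>
#include <cwchar>

// Hypothetical helper: copies `source` into a new buffer that has room for
// `extra` additional characters. The character count is multiplied by
// sizeof(wchar_t) to obtain the size in bytes that malloc expects.
wchar_t* duplicateWide(const wchar_t* source, std::size_t extra) {
    const std::size_t count = std::wcslen(source) + 1 + extra;  // in characters
    wchar_t* dest = static_cast<wchar_t*>(std::malloc(count * sizeof(wchar_t)));
    if (dest != nullptr) {
        std::wcscpy(dest, source);
    }
    return dest;  // the caller releases the buffer with std::free
}
```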
11. malloc → new
V8
void ChoiceFormat::applyPattern(....)
{
....
UnicodeString *newFormats = new UnicodeString[count];
if (newFormats == 0) {
status = U_MEMORY_ALLOCATION_ERROR;
uprv_free(newLimits);
uprv_free(newClosures);
return;
}
....
}
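A plain new expression reports failure by throwing std::bad_alloc, so the null check in the fragment never fires and the cleanup code is skipped. One possible fix, sketched here with std::string in place of UnicodeString and a hypothetical helper name, is the nothrow form of new:

```cpp
#include <cstddef>
#include <new>
#include <string>

// Hypothetical helper: returns nullptr when the allocation fails instead of
// throwing, so a null check after the call is meaningful again.
std::string* allocateFormats(std::size_t count) {
    return new (std::nothrow) std::string[count];
}
```

The other option is to keep the throwing form and release the earlier buffers in a catch block or through RAII wrappers.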
13. Observation No. 3. The Last Line Effect
• An analogy with mountaineers: attention relaxes on the last steps of
a climb, and that is where falls happen;
• Statistics collected from the database when it
included about 1500 code samples.
• 84 relevant fragments found.
• In 43 of them, the error was found in the last line.
TrinityCore
inline Vector3int32& operator+=(const Vector3int32& other) {
x += other.x;
y += other.y;
z += other.y;
return *this;
}
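In the TrinityCore fragment the last line was copied from the one above and only half edited: z is increased by other.y. A corrected sketch, using a hypothetical Vec3i type instead of the project's Vector3int32:

```cpp
// Hypothetical three-component integer vector.
struct Vec3i {
    int x, y, z;

    Vec3i& operator+=(const Vec3i& other) {
        x += other.x;
        y += other.y;
        z += other.z;  // the fragment on the slide reads other.y here
        return *this;
    }
};
```

A test that uses different values for y and z is enough to expose the original mistake.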
14. The Last Line Effect
Source Engine SDK
inline void Init(
float ix=0,
float iy=0,
float iz=0,
float iw = 0 )
{
SetX( ix );
SetY( iy );
SetZ( iz );
SetZ( iw );
}
Chromium
if (access & FILE_WRITE_ATTRIBUTES)
output.append(ASCIIToUTF16("\tFILE_WRITE_ATTRIBUTES\n"));
if (access & FILE_WRITE_DATA)
output.append(ASCIIToUTF16("\tFILE_WRITE_DATA\n"));
if (access & FILE_WRITE_EA)
output.append(ASCIIToUTF16("\tFILE_WRITE_EA\n"));
if (access & FILE_WRITE_EA)
output.append(ASCIIToUTF16("\tFILE_WRITE_EA\n"));
break;
15. The Last Line Effect
Qt
qreal x = ctx->callData->args[0].toNumber();
qreal y = ctx->callData->args[1].toNumber();
qreal w = ctx->callData->args[2].toNumber();
qreal h = ctx->callData->args[3].toNumber();
if (!qIsFinite(x) || !qIsFinite(y) ||
!qIsFinite(w) || !qIsFinite(w))
Miranda IM
minX=max(0, minX+mcLeftStart-2);
minY=max(0, minY+mcTopStart-2);
maxX=min((int)width, maxX+mcRightEnd-1);
maxY=min((int)height, maxX+mcBottomEnd-1);
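The Qt check tests w twice and never tests h. One way to leave less room for this kind of slip is to stop repeating the call by hand. A sketch under that assumption, using the standard std::isfinite instead of qIsFinite and a hypothetical helper name allFinite:

```cpp
#include <algorithm>
#include <cmath>
#include <initializer_list>

// Hypothetical helper: true only when every listed value is finite.
// Each variable is named once, so a copy-paste slip is easier to spot.
bool allFinite(std::initializer_list<double> values) {
    return std::all_of(values.begin(), values.end(),
                       [](double v) { return std::isfinite(v); });
}
```

The call site then becomes a single line such as if (!allFinite({x, y, w, h})).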
16. The Last Line Effect
[Bar chart: number of errors (vertical axis, 0 to 50) by line position 1 to 5; bar heights not recoverable]
17. Observation No. 4.
Programmers are the Smartest
• Programmers are really very smart, and are right almost all
the time
• Consequence 1: when they are occasionally wrong, it’s very
hard to convince them
• Consequence 2: programmers refuse to read and investigate
the warnings a code analyzer produces
18. A comment on our article
Wolfenstein 3D
ID_INLINE mat3_t::mat3_t( float src[ 3 ][ 3 ] ) {
memcpy( mat, src, sizeof( src ) );
}
Diagnostic message V511: The sizeof() operator returns size
of the pointer, and not of the array, in 'sizeof(src)'
expression.
Except it doesn't. The sizeof() operator returns the size of the object, and src is
not a pointer - it is a float[3][3]. sizeof() correctly returns 36 on my machine.
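The reader's claim can be checked by the compiler itself. A function parameter declared as float src[3][3] is adjusted to the type float (*)[3], a pointer to a row, so sizeof(src) inside the function is the size of a pointer. A small sketch with hypothetical function names:

```cpp
#include <cstddef>
#include <type_traits>

// The parameter looks like an array, but its real type is a pointer to a row.
inline std::size_t sizeSeenInside(float src[3][3]) {
    static_assert(std::is_same<decltype(src), float (*)[3]>::value,
                  "array parameters are adjusted to pointers");
    return sizeof(float (*)[3]);  // the value sizeof(src) yields in here
}

// Passing the array by reference keeps its full type, and therefore its size.
inline std::size_t sizeByReference(float (&src)[3][3]) {
    return sizeof(src);  // 9 * sizeof(float)
}
```

The static_assert compiles, which settles the argument: the diagnostic message is right.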
19. One more example of an argument
>> And the last code fragment on the subject.
>> Only one byte is cleared here.
>> memset ( m_buffer, 0, sizeof (*m_buffer) );
Wrong. In this line, the same number of bytes is cleared as stored in the first
array item.
We do face issues like this
quite often.
20. Observation No. 5. Security, security!
But do you test it?
The example is similar to the one on the previous slide. SMTP Client.
typedef unsigned char uint1;
void MD5::finalize () {
...
uint1 buffer[64];
...
// Zeroize sensitive information
memset (buffer, 0, sizeof(*buffer));
...
}
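Here sizeof(*buffer) is the size of one uint1 element, that is 1 byte, so 63 of the 64 bytes keep their contents. A sketch that counts the leftover bytes for both variants (the function name is hypothetical):

```cpp
#include <cstddef>
#include <cstring>

typedef unsigned char uint1;

// Fills a 64-byte buffer with a nonzero pattern, clears it with either
// sizeof(buffer) or sizeof(*buffer), and returns how many bytes stay nonzero.
std::size_t nonZeroAfterClear(bool useWholeArray) {
    uint1 buffer[64];
    std::memset(buffer, 0xAB, sizeof(buffer));

    const std::size_t size = useWholeArray ? sizeof(buffer) : sizeof(*buffer);
    std::memset(buffer, 0, size);

    std::size_t left = 0;
    for (uint1 b : buffer) {
        if (b != 0) ++left;
    }
    return left;
}
```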
21. Security, security! But do you test it?
• The compiler is allowed to remove a memset() whose result is never read, and an optimizing build usually does.
• See for details:
• http://www.viva64.com/en/d/0208/
• http://www.viva64.com/en/k/0041/
void Foo()
{
TCHAR buf[100];
_stprintf(buf, _T("%d"), 123);
MessageBox(
NULL, buf, NULL, MB_OK);
memset(buf, 0, sizeof(buf));
}
22. Security, security! But do you test it?
php
char* php_md5_crypt_r(const char *pw,const char *salt, char *out)
{
static char passwd[MD5_HASH_MAX_LEN], *p;
unsigned char final[16];
....
/* Don't leave anything around in vm they could use. */
memset(final, 0, sizeof(final));
return (passwd);
}
23. Security, security! But do you test it?
Linux-3.18.1
int E_md4hash(....)
{
int rc;
int len;
__le16 wpwd[129];
....
memset(wpwd, 0, 129 * sizeof(__le16));
return rc;
}
After our article, the memset() function was
replaced with memzero_explicit().
Note: usually using memset() is just fine (!), but
in cases where clearing out _local_ data at the
end of a scope is necessary, memzero_explicit()
should be used instead in order to prevent the
compiler from optimizing away zeroing.
24. Security, security! But do you test it?
void Foo()
{
TCHAR buf[100];
_stprintf(buf, _T("%d"), 123);
MessageBox(
NULL, buf, NULL, MB_OK);
RtlSecureZeroMemory(buf, sizeof(buf));
}
• RtlSecureZeroMemory()
• Similar functions
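Where no platform function is at hand, a widely used fallback is to write the zeros through a volatile pointer. The sketch below shows that idiom under the assumption that the compiler treats every volatile write as an observable side effect. It is a fallback, not a replacement for RtlSecureZeroMemory(), memzero_explicit(), or the optional C11 memset_s(). The name secureZero is hypothetical.

```cpp
#include <cstddef>

// Hypothetical fallback: each byte is written through a volatile pointer,
// so the optimizer is not supposed to discard the stores as dead code.
void secureZero(void* data, std::size_t size) {
    volatile unsigned char* p = static_cast<volatile unsigned char*>(data);
    while (size--) {
        *p++ = 0;
    }
}
```

In the Foo() example it would be called as secureZero(buf, sizeof(buf)) in place of memset().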
25. Security, security! But do you test it?
• PVS-Studio generates warning V597 on memset()
• We found this error in a huge number of projects:
• In total, we have found 169 instances of this error pattern in open-
source projects by now!
• eMulePlus
• Crypto++
• Dolphin
• UCSniff
• CamStudio
• Tor
• NetXMS
• TortoiseSVN
• NSS
• Apache HTTP Server
• Poco
• PostgreSQL
• Qt
• Asterisk
• Php
• Miranda NG
• LibreOffice
• Linux
• …
26. Observation No. 6. You Can’t Know Everything
• You can’t know everything. But ignorance is no excuse
• Since you’ve set about writing safe and reliable software, you
must constantly learn, learn, and learn again
• And also use tools like PVS-Studio
• Analyzers know of defects programmers aren’t even aware of!
• P.S. One of the examples with memset() was discussed earlier
27. Errors programmers aren’t aware of: strncat
char *strncat(
char *strDest,
const char *strSource,
size_t count
);
MSDN: strncat does not check for
sufficient space in strDest; it
is therefore a potential cause
of buffer overruns. Keep in mind
that count limits the number of
characters appended; it is not a
limit on the size of strDest.
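The typical mistake is to pass the total size of the destination as count. The value to pass is the free space that remains, minus one byte for the terminator. A minimal sketch with a hypothetical helper appendBounded:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: appends `src` to the NUL-terminated string in `dest`
// without writing past `destSize` bytes. Assumes `dest` already holds a
// terminated string that fits in the buffer.
void appendBounded(char* dest, std::size_t destSize, const char* src) {
    const std::size_t used = std::strlen(dest);
    if (used + 1 >= destSize) {
        return;  // no room left for even one more character
    }
    // count = free space minus the terminator that strncat always writes
    std::strncat(dest, src, destSize - used - 1);
}
```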
29. Errors programmers aren’t aware of: char c = memcmp()
This error caused a severe vulnerability in MySQL/MariaDB up to versions 5.1.61, 5.2.11, 5.3.5, 5.5.22.
When a user connects to MySQL/MariaDB, a token (SHA over the password and a random scramble
string) is calculated and compared to the expected value by the 'memcmp' function. On some
platforms, the return value may fall outside the [-128..127] range, and converting it to 'char' can
turn a nonzero result into zero. As a result, in about 1 case out of 256 the comparison reports a
match regardless of the hash, and an intruder can use a simple bash command to gain root access
to the vulnerable MySQL server even if they don’t know the password.
typedef char my_bool;
...
my_bool check(...) {
return memcmp(...);
}
Find out more: Security vulnerability in MySQL/MariaDB - http://seclists.org/oss-sec/2012/q2/493
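The truncation can be reproduced without a real memcmp() call. The sketch below feeds in a value that memcmp() is allowed to return and narrows it the way the fragment does. It assumes an 8-bit char and the usual conversion that keeps the low bits (guaranteed since C++20, implementation-defined before that). The function names are hypothetical.

```cpp
typedef char my_bool;

// Mirrors the buggy pattern: an int result is narrowed to a char,
// so a value such as 0x100 ends up as 0, which reads as "equal".
inline my_bool narrowedResult(int memcmpResult) {
    return static_cast<my_bool>(memcmpResult);
}

// Safer pattern: collapse the result to a boolean before it is stored.
inline bool differs(int memcmpResult) {
    return memcmpResult != 0;
}
```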
30. Observation No. 7.
Seeking a Silver Bullet
• TDD, code reviews, dynamic analysis, static analysis …
• Every method has its own pros and cons
• Don’t seek just one single methodology or tool to make your code
safe
31. Weaknesses of unit tests
• There might be mistakes in tests, too
• Example. A test is run only when getIsInteractiveMode() returns true:
Trans-Proteomic Pipeline
if (getIsInteractiveMode())
//p->writePepSHTML();
//p->printResult();
// regression test?
if (testType!=NO_TEST) {
TagListComparator("InterProphetParser",
testType,outfilename,testFileName);
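Because the statements after the first if are commented out, the regression test itself becomes the body of that if. A reduced sketch of the same shape, with hypothetical names, shows the effect:

```cpp
// Mirrors the layout of the fragment: the original body of the first `if`
// is commented out, so the next statement silently becomes its body.
inline bool regressionTestRuns(bool interactive, bool testRequested) {
    bool ran = false;
    if (interactive)
        // writeReport();   (commented out)
    if (testRequested) {
        ran = true;
    }
    return ran;
}
```

In non-interactive mode the function returns false even when a test was requested, so the test never executes and cannot fail.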
32. Weaknesses of code review
• The reviewer gets tired very quickly
• It’s too expensive
OpenSSL
if (!strncmp(vstart, "ASCII", 5))
arg->format = ASN1_GEN_FORMAT_ASCII;
else if (!strncmp(vstart, "UTF8", 4))
arg->format = ASN1_GEN_FORMAT_UTF8;
else if (!strncmp(vstart, "HEX", 3))
arg->format = ASN1_GEN_FORMAT_HEX;
else if (!strncmp(vstart, "BITLIST", 3))
arg->format = ASN1_GEN_FORMAT_BITLIST;
else
....
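The last comparison checks only 3 characters of "BITLIST", so any string that starts with "BIT" matches. A reviewer easily misses a magic number like this. One way to remove the number altogether, sketched with a hypothetical helper hasPrefix, is to let the compiler deduce the literal's length:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: the prefix length comes from the literal itself,
// so it cannot drift out of sync with the text being compared.
template <std::size_t N>
bool hasPrefix(const char* text, const char (&prefix)[N]) {
    return std::strncmp(text, prefix, N - 1) == 0;  // N includes the terminator
}
```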
34. Something dynamic analysis is bad at
const unsigned char stopSgn[2] = {0x04, 0x66};
....
if (memcmp(stopSgn, answer, sizeof(stopSgn) != 0))
return ERR_UNRECOGNIZED_ANSWER;
A closing parenthesis is misplaced: the third argument is the expression
sizeof(stopSgn) != 0, which equals 1, so only 1 byte is compared instead of 2.
The intended condition:
if (memcmp(stopSgn, answer, sizeof(stopSgn)) != 0)
Dynamic analyzers see no error here, because no memory is accessed
incorrectly. They just can’t help you find it.
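The effect is easy to reproduce with an answer whose first byte matches and whose second byte does not. The sketch below wraps both variants in hypothetical functions:

```cpp
#include <cstring>

const unsigned char stopSgn[2] = {0x04, 0x66};

// Misplaced parenthesis: the length argument is (sizeof(stopSgn) != 0),
// a boolean that converts to 1, so only the first byte is compared.
inline bool mismatchBuggy(const unsigned char* answer) {
    if (std::memcmp(stopSgn, answer, sizeof(stopSgn) != 0))
        return true;
    return false;
}

// Intended form: both bytes are compared.
inline bool mismatchFixed(const unsigned char* answer) {
    return std::memcmp(stopSgn, answer, sizeof(stopSgn)) != 0;
}
```

Both functions stay inside the array bounds, which is why a dynamic analyzer has nothing to report.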
35. Something static analysis is bad at
unsigned nCount;
fscanf_s(stream, "%u", &nCount);
int array[10];
memset(array, 0, nCount * sizeof(int));
Is there an error in this code or not?
You can only find out after running the program.
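Whether the memset() overruns the array depends on the value read at run time, so the practical defence is an explicit check before the call. A sketch with a hypothetical helper clearPrefix:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical guard: clears the first nCount elements only when nCount fits
// into the array, and reports whether the request was accepted.
template <std::size_t N>
bool clearPrefix(int (&array)[N], unsigned nCount) {
    if (nCount > N) {
        return false;  // the unchecked memset would write past the array
    }
    std::memset(array, 0, nCount * sizeof(int));
    return true;
}
```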
36. Conclusion
• All tools are necessary, all tools are important
• The PVS-Studio static code analyzer is one of them
http://www.viva64.com/en/pvs-studio/
• Other static code analyzers:
http://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis