Static Code Analysis and AutoLint


Slides and notes presented on Thursday, January 23, 2014, covering static code analysis and AutoLint, an internal Perl tool that automates Gimpel PC-Lint runs over large legacy C/C++ codebases. (The per-slide notes contain most of the spoken content.)

Published in: Technology

  • Hello -- I’m Leander. I’ve worn a few hats for the last 8 years at 1st Playable Productions, a videogame developer around the corner. Previously I’ve been at Electronic Arts Los Angeles and Taldren in Costa Mesa.
    (Feel free to interrupt with questions as we go. I usually only bring talking points, but decided to script this out, so this format is a bit new to me; please excuse any rough edges.)
    We’ll start with a little bit on static code analysis in general, then dive into a Perl tool I keep finding myself resurrecting for various projects and workplaces.
    Most of my static code analysis experience has been on largeish C-language-family codebases, but the bulk of this should be applicable to other languages and different project sizes.
    The quote here is from a fellow who ports a huge number of games to Linux. He’s a driving force behind the Simple DirectMedia Layer (SDL), which is a huge part of Valve’s Linux efforts and hence the SteamBox launch, among other things.
  • Static code analysis can appear as a standalone tool, or as part of the compiler or interpreter. (I’ll just say “compiler” from here out.) It can work on raw source code or machine code, and any level in-between -- for example, preprocessed source or IL.
    Even syntax highlighting and “IntelliSense” style systems in IDEs and editors can -- and perhaps should -- be viewed as a form of static code analysis.
    Many compilers are pulling in more SCA-style features over time, either directly as warnings, or indirectly in the form of plugin support for separate passes. Supporting this type of use case was a major force behind the development of LLVM and clang, from what I understand.
    Two points worth bringing up here:
    You’d expect the compiler to catch syntax errors, but not all of them do! Some will let flaky syntax pass, which can be a portability concern. It’s nice to have someone saying “the syntax is nonstandard, I do not think that means what you think it means” occasionally as well.
    Some of these systems can enforce custom rules, like local coding standards, if you choose.
  • John Carmack, of id Software fame, wrote a detailed and eloquent article on static code analysis in general. It’s worth a read; there’s a link at the end. He covers many of these points in more detail. (As an aside, Carmack has left id Software, and is working with Oculus; if you haven’t played with an Oculus Rift VR development kit yet, you should seek out a demo.)
    Carmack makes a great point re code quality: at the end of the day, it’s less important than value. Both are nebulous terms, but I don’t think anyone here would disagree that code quality has an awful lot to do with value over the long run: it has a direct impact on maintenance costs, defect rates, and the like.
    Metrics may or may not be 100% accurate, but tracking something is the first step toward improvement. See CMU SEI’s Capability Maturity Model and the Personal Software Process.
    I can’t emphasize the expertise part enough. It’s like having the brains of 10+ year veteran developers in jars nearby. (Admittedly very chatty and slightly neurotic brains, but it’ll do.) Having the system around talking to you accelerates learning. I’ve heard many co-ops, interns, and coworkers praise it for these reasons.
    Regarding the chattiness: more verbose systems will sometimes generate spontaneous discussions. While these do occasionally devolve into bikeshedding, the majority of them have been beneficial; they seem to help the group become more introspective over time.
    As a side effect, someone occasionally creates an incredibly useful bit of code that makes it easier to avoid angering the Lint Gods.
  • When we first started analysis at my latest job, on a ~300k-500k LOC codebase, we had over 120 megabytes of text output from PC-Lint. Ouch. (Carmack uses the phrase “reams of commentary” here.)
    A lot of the noise and false positives come down to fixing configuration issues -- things like include search paths, making sure all compiler-internal defines are known to the lint process, and importing defines from a platform-specific build process.
    Some types of lint can automatically configure themselves -- if they’re built into the compiler, they can often be turned on just by flipping a switch. Others are more manual, but sometimes make up for it in flexibility. “Your Mileage May Vary” per tool.
    False negatives tend to decrease over time as compilers and linters get smarter. Or you may be able to detect them with some configuration changes or source code annotation.
  • Some of these aren’t as much of a problem for smaller projects or smaller teams.
    Regarding hyperlinks, having a reference nearby allows people to learn and turn around fixes a lot more quickly, especially if they’re new to lint. HTML+JavaScript can help here -- a description pane or popup or the like. This goes a long way to encourage spontaneous strafing runs at the issue list.
    Keeping the results of all prior runs is key. Even if you’re keeping these results mostly hidden, you enable many useful tricks. Being able to search through history after you fix a tough bug is amazing: you may find the five other instances of the bug lurking elsewhere in the codebase immediately. (I can’t count how many times I’ve done this.)
    That history can also be used for fun things like automatic #include / dependency reduction tools. Once the data is there, you’re only really limited by imagination.
    What’s “important” to draw attention to will vary per organization and per codebase. Find some way to emphasize that stuff, and ideally allow highlight rules to be changed easily over time. We’ll talk about AutoLint’s approach in a bit.
    Sometimes it is time to bite the bullet and suppress a message outright. Try to keep this obvious, and scoped as narrowly as possible: line of code, scope, or module… Plan on revisiting this list once or twice a year.
    Don’t give up if you have bad experiences with one tool. There are others, and they vary a lot. For example clang /analyze versus PC-Lint: the former may report on 2-3 issues in one of our games, the latter is still producing ~12 megabytes of text. But, hey: text is compressible!
    Annotations can get messy, but applied judiciously I think they’re a great thing. Sometimes minor refactoring can help both readability and lint; the “cooperate” quote there is from Carmack’s article.
  • This is by no means a complete list. Groupings and ordering are not significant.
    Each of these varies considerably in its “feel” and verbosity, as well as in how it’s configured.
    Carmack discusses Coverity, MS /analyze, PVS-Studio, and PC-Lint in his article.
    When creating this slide, I nearly forgot SWELL (which stands for SWELL Will Examine Lua Listings, in the great tradition of GNU and such acronyms). It is an internal tool that just uses Perl to parse the output of “luac -l”, which dumps a bytecode listing. It reports typos and other undeclared variable usage, after comparing against a whitelist. It’s a great example of an extremely tiny lint tool, and would never have been created if I hadn’t been influenced by prior tools.
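To show how small such a tool can be, here is a rough Python sketch of the SWELL idea (the real tool is Perl and is not reproduced in these notes). It assumes a Lua 5.1-style `luac -l` listing, in which global accesses show up as GETGLOBAL / SETGLOBAL instructions with the global's name after a semicolon. The sample listing, the whitelist, and the function name `undeclared_globals` are all hypothetical.

```python
import re

# Assumed Lua 5.1-style listing line, e.g.:
#   "\t1\t[3]\tGETGLOBAL\t0 -1\t; print"
# where [3] is the source line and the name follows the semicolon.
GLOBAL_ACCESS = re.compile(
    r"^\s*\d+\s+\[(?P<line>\d+)\]\s+(?P<op>[GS]ETGLOBAL)\s+[^;]*;\s*(?P<name>\S+)"
)


def undeclared_globals(listing, whitelist):
    """Return (source_line, opcode, name) for each global access not in whitelist."""
    hits = []
    for text in listing.splitlines():
        m = GLOBAL_ACCESS.match(text)
        if m and m.group("name") not in whitelist:
            hits.append((int(m.group("line")), m.group("op"), m.group("name")))
    return hits


# Hypothetical listing text; a real run would capture the output of "luac -l".
SAMPLE_LISTING = "\n".join([
    "main <demo.lua:0,0> (6 instructions, 24 bytes at 0x0)",
    "\t1\t[1]\tGETGLOBAL\t0 -1\t; print",
    "\t2\t[1]\tLOADK    \t1 -2\t; \"hello\"",
    "\t3\t[1]\tCALL     \t0 2 1",
    "\t4\t[2]\tGETGLOBAL\t0 -3\t; scroe",
    "\t5\t[2]\tSETGLOBAL\t0 -4\t; total",
    "\t6\t[2]\tRETURN   \t0 1",
])

if __name__ == "__main__":
    # Hypothetical whitelist of globals the project is allowed to touch.
    allowed = {"print", "total"}
    for line, op, name in undeclared_globals(SAMPLE_LISTING, allowed):
        print(f"line {line}: {op} of undeclared global '{name}'")
```

Later Lua versions emit different opcodes for global access, so the pattern would need adjusting there; the point is only that a regex plus a whitelist already catches typos such as `scroe`.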
  • Just a quick divergence: if you haven’t tried runtime checking, you should! Of these, I can highly recommend valgrind (free) if you can use it; Application Verifier is also free and has been immensely useful in debugging Windows toolchain issues. The glibc debug switches (or equivalent for proprietary compilers) are worth turning on for at least one build of a project…
    Also, I still long for the days when we developed Game Boy Advance games under VisualBoyAdvance. Some of the nicest debugging and visualization tools out there in an ostensibly-for-homebrew-development project.
  • AutoLint is a set of Perl programs and tools that runs lint incrementally, once per checkin, and mails out the results.
    You’re seeing a snapshot of a mailout generated by the system. A prior version was a prettier and better organized XHTML+JavaScript page, but I haven’t resurrected that yet.
    A few things to point out:
    Summary statistics are right up front.
    “Clusters” denote categories of error we found important -- things that “just won’t die” and keep coming back to plague us.
    More on the next slide...
  • You can see some of PC-Lint’s verbosity here.
    Nevertheless, for a fairly significant commit, there are only ~7 new issues here. Most of these are probably OK, but could use double-checking. Some could disappear with minor style changes. Chances are most of this report will be ignored, but if it was a new file the author might tackle them a little more aggressively.
    This example doesn’t have anything seriously scary -- for example, “use after free” or “use uninitialized” -- but some do =)
  • AutoLint filters the mail pretty heavily, deduping, dropping some issues, hiding things you’ve seen before, etc. They’re still there if needed in the full report.
    We push results to users ASAP; less time between commit and report is best. They can go pull full reports (per-module or overall), but the push is the first layer. We explicitly don’t make this optional; people forget to go looking in a pull model very quickly.
    The “Clusters” are lists of message numbers that we consider important, based mostly on painful past experience. They get pulled to the top of the stats and reports.
    Ideally we set up so it’s run automatically as a part of build, before commit; some systems are easier to do this with than others. I really like IDEs that do this sort of thing continuously in the background, which seems to be getting more feasible these days.
    We’re still getting there on the “low maintenance” front, but it’s not too bad once it has been established for a given platform and iterated on a bit.
    Let’s dive into the structure and dependencies a bit...
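The dedupe and “Clusters” ideas above can be sketched in a few lines. This is a Python illustration under stated assumptions, not AutoLint's actual Perl: it assumes the one-error-per-line `file(line): error N: (Type -- message)` format seen in the mailout, and the cluster table, sample report, and function names are hypothetical (the talk does not list which message numbers belong to which cluster).

```python
import re
from collections import Counter

# One-error-per-line format assumed from the mailout:
#   file(line): error N: (Type -- message)
ENTRY = re.compile(
    r"^(?P<file>.+?)\((?P<line>\d+)\): error (?P<num>\d+): "
    r"\((?P<kind>\w+) -- (?P<msg>.*)\)$"
)

# Hypothetical cluster table: cluster name -> lint message numbers.
CLUSTERS = {
    "const correctness": {1762},
    "ignored return value": {534},
}


def parse(report):
    """Parse report text into a sorted, de-duplicated list of entry tuples."""
    entries = set()
    for text in report.splitlines():
        m = ENTRY.match(text.strip())
        if m:
            entries.add(
                (m["file"], int(m["line"]), int(m["num"]), m["kind"], m["msg"])
            )
    return sorted(entries)


def summarize(entries):
    """Count entries per message type and per cluster."""
    by_kind = Counter(kind for _, _, _, kind, _ in entries)
    by_cluster = Counter()
    for _, _, num, _, _ in entries:
        for name, numbers in CLUSTERS.items():
            if num in numbers:
                by_cluster[name] += 1
    return by_kind, by_cluster


# Hypothetical report; the second line is a duplicate that dedupe drops.
SAMPLE_REPORT = """\
src/Spot.cpp(693): error 534: (Warning -- Ignoring return value of function 'switchState')
src/Spot.cpp(693): error 534: (Warning -- Ignoring return value of function 'switchState')
src/Spot.h(143): error 1762: (Info -- Member function 'isBumped' could be made const)
src/Spot.cpp(646): error 641: (Warning -- Converting enum 'DanceStates' to 'int')
"""

if __name__ == "__main__":
    kinds, clusters = summarize(parse(SAMPLE_REPORT))
    for name, count in clusters.most_common():
        print(f'cluster "{name}": {count}')
    for kind, count in kinds.most_common():
        print(f"{count} {kind}s.")
```

Keeping the cluster table as plain data is one way to let highlight rules change over time without touching the counting code.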
  • All the “components” are Perl, with a few supporting outside tools where appropriate.
    The runner coordinates everything and is sort of the “main loop”: it deals with source control, calls the other modules as needed, and actually runs lint. (Lint currently runs via make, which handles incremental lint a lot better; PC-Lint can produce .lob files, or lint objects, which makes it very amenable to traditional incremental build tools.) The runner also does user lookup and the final mailout.
    Stats builds a few hash tables with counts: overall, per file, per error type, etc.
    Sort and dedupe pulls the error list down in size a bit; we get a lot of duplicates from local lint (and even some in global). No sense in reporting the same error on the same line more than once. This is all just little regexes and such.
    Diff is a bit special: we cheat to avoid noise from line number changes. Strip out anything that looks like a line number, diff using GNU diff (much faster than anything internal), then parse the diff and re-inject the line numbers. We don’t get it wrong often, and when we do we’re only off by lines in a file, rather than a whole file.
    (We do force PC-Lint to cooperate a bit by putting it in one-error-per-line mode, coincidentally in Visual Studio error message format. XML mode would also probably have been possible.)
    Dep generation just parses prior lint reports to give us a gcc -MD or -MMD style depfile, so make can do its job incrementally.
    LDAP lookup just converts svn usernames to mail addresses for us.
    Finally, we have Template::Toolkit templates for stats and mailout.
    In case you’re curious, line counts (approximate). This stuff is all pretty verbose and relatively high comment density.
    runner = 575
    stats = 200
    sort and dedupe = 35 (yay perl)
    diff (scrubbed) = 220
    dep gen = 75
    ldap = 30
    Makefile = 110
    templates = 40 (stats) and 30 (mail)
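The “scrubbed” diff trick described above (strip line numbers, diff, then report the original lines) can be sketched as follows. This is a Python illustration only: it swaps in the standard library's `difflib` where AutoLint shells out to GNU diff, and the regex, function names, and sample entries are assumptions rather than the tool's own code.

```python
import difflib
import re

# The "(123): " line-number field of a one-error-per-line entry.
LINE_NO = re.compile(r"\((\d+)\): ")


def scrub(entry):
    """Replace the line number with a placeholder so shifted code still matches."""
    return LINE_NO.sub("(#): ", entry, count=1)


def scrubbed_diff(old_entries, new_entries):
    """Return (new, fixed) entries, comparing with line numbers ignored.

    Both inputs are lists of report lines, assumed already sorted and
    de-duplicated. The diff runs on scrubbed keys, but the ORIGINAL lines
    (line numbers intact) are what get reported.
    """
    old_keys = [scrub(e) for e in old_entries]
    new_keys = [scrub(e) for e in new_entries]
    matcher = difflib.SequenceMatcher(a=old_keys, b=new_keys, autojunk=False)
    new, fixed = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            fixed.extend(old_entries[i1:i2])
        if tag in ("insert", "replace"):
            new.extend(new_entries[j1:j2])
    return new, fixed


# Hypothetical reports: two lines were inserted near the top of a.cpp,
# shifting both old entries, and one genuinely new issue appeared.
OLD = [
    "src/a.cpp(10): error 534: (Warning -- Ignoring return value of function 'f')",
    "src/a.cpp(40): error 641: (Warning -- Converting enum 'E' to 'int')",
]
NEW = [
    "src/a.cpp(12): error 534: (Warning -- Ignoring return value of function 'f')",
    "src/a.cpp(30): error 732: (Info -- Loss of sign (assignment) (long to unsigned long))",
    "src/a.cpp(42): error 641: (Warning -- Converting enum 'E' to 'int')",
]

if __name__ == "__main__":
    new, fixed = scrubbed_diff(OLD, NEW)
    print("NEW ENTRIES:")
    for entry in new:
        print(" ", entry)
    print("FIXED ENTRIES:")
    for entry in fixed:
        print(" ", entry)
```

Because the comparison is sequence-based, an edit that reorders two otherwise unchanged entries can show up as a spurious fixed/new pair, which matches the “we don’t get it wrong often” caveat in the notes.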
  • Jenkins is great. It suffers a little from feature bloat and longstanding bugs (they need lint, hah), but it is greatly preferred to rolling your own. Eventually it might be able to take over mailout duties, and we could drop the LDAP requirement and just generate logs. It’s just not that great about single-rev-at-a-time bumps.
    For “build incrementally with deps”, make is very terse. All of this stuff works fairly well as a traditional Directed Acyclic Graph: lobs and logs from lint, deps from logs, combined log, sorted log, diffs, stats, diffstats...
    We use GNU diff because the Perl diff solutions were just too slow and too memory intensive. It’s not that much trouble working with tempfiles and command-line diff, and it’s _much_ faster; also more amenable to parallelization via make -j.
    It’s a no-brainer to move to direct SVN API use, but the last time we tried was back in SVN 1.5 or earlier, and extern management was abysmal. There was a similar reason for falling back to GNU find, but that is just laziness; File::Find can do this easily, we just have to be careful about depth-first traversal.
    Cygwin is a love/hate relationship for me. It’s so nice to have the Unix environment around, but for tools that need to be stable, it makes much more sense to use MSYS, GnuWin32, or the like. Cygwin just seems to trade one race condition for another, and it’s really easy to trigger these when you’re constantly running parallel makes.
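The “deps from logs” edge of the DAG mentioned above is what lets make run lint incrementally. A minimal Python sketch of that step follows. It assumes the lint log carries verbosity lines of the form `--- Module: <file>` and `Including file <file>`; that exact log format, the `.lob` target naming, the sample log, and the function names are all assumptions made for illustration, not details taken from AutoLint.

```python
import re
from pathlib import PurePosixPath

# Assumed verbosity lines in the lint log (paths without spaces).
MODULE = re.compile(r"^--- Module:\s+(?P<path>\S+)")
INCLUDE = re.compile(r"^\s*Including file\s+(?P<path>\S+)")


def deps_from_log(log_text):
    """Map each module to the ordered, de-duplicated list of files it included."""
    deps = {}
    current = None
    for text in log_text.splitlines():
        m = MODULE.match(text)
        if m:
            current = m["path"]
            deps.setdefault(current, [])
            continue
        m = INCLUDE.match(text)
        if m and current is not None and m["path"] not in deps[current]:
            deps[current].append(m["path"])
    return deps


def to_depfile(deps):
    """Render make rules in the style of gcc -MMD output, one rule per module."""
    rules = []
    for module, includes in deps.items():
        # Hypothetical naming: the lint object sits next to the source file.
        target = PurePosixPath(module).with_suffix(".lob")
        rules.append(f"{target}: {' '.join([module, *includes])}")
    return "\n".join(rules) + "\n"


# Hypothetical log excerpt.
SAMPLE_LOG = """\
--- Module:   src/a.cpp (C++)
    Including file src/a.h
    Including file src/common.h
src/a.cpp(10): error 534: (Warning -- Ignoring return value of function 'f')
--- Module:   src/b.cpp (C++)
    Including file src/common.h
"""

if __name__ == "__main__":
    print(to_depfile(deps_from_log(SAMPLE_LOG)), end="")
```

A Makefile could then include the generated rules so that touching `src/common.h` re-lints both modules, while touching `src/a.h` re-lints only `src/a.cpp`.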
  • I so love Template Toolkit… There are a few other similar packages out there, but TT has fit our needs, even with the larger XHTML frontend.
    There was a tiny cygwin run0 bug that needed to be worked around even with IPC::Run3.
  • PC-Lint has a lot of existing .lnt files for various compilers, libraries, and even policies (a whole “Scott Meyers” set corresponding to the Effective C++ series, for example, and another for MISRA). When we are missing something (e.g. Metrowerks) we can do a pretty good job of scraping together just the missing .lnt file. It does take a little iteration to get right, but isn’t too terrible.
    Many of the lint programs can operate in an all-seeing “global” mode which attempts to see everything across all modules at once (and sometimes crashes or runs very slowly). If global is too slow or breaks, consider local per-module runs; conversely, if local is too noisy, global may work. With AutoLint’s dedupe, the local output is not too different from global; we just miss some useful “unused symbol” warnings, since they get very spammy.
    If you have externals, be prepared for a little pain re people who e.g. check into project before checking in to external. We have a little logic for fudging timestamps around, or just ignoring a particularly bad run if it looks like the next run might fix it. (Although “culprits” gets a bit hard in that case; we’d like to mail only you with your messages, not two of you with both your messages…) There are “best practice” solutions for this on the source control system side, but we don’t always have the luxury of being able to implement those, especially when working with clients.
    It’s very nice if this is as low maintenance as possible, but there also has to be a way to say “this all went to hell; start again from scratch as of this new rev” easily.
    Honestly: the new platform does take a while to spin up. We haven’t had this stuff running since DS development (ouch), despite the new platforms arguably being _easier_ to set up (gcc/clang based). I’d estimate a solid day or two of setup for this tool at the moment, for a new platform, then 1-2 weeks of hand-holding of an hour or so each day. I’m confident we can get these numbers down with iteration...
  • While having the text versions around is great (and nicely composable), having a nicer-looking frontend with color is very helpful. Easier to draw the eye to relevant info, and much easier to hyperlink off to “howto”s.
    A DB might provide more functionality, over time. Flatfiles are working pretty well at the moment, though.
    As touched on in the last slide: new platform setup costs are too high. Something that can learn from build logs would be a great step up, even if you had to sort through the autogenerated config. There’s still the issue of compiler-internal defines, but you can often coax those out.
    We’d like more facilities for suppressing or hiding at minimum scope. PC-Lint is actually pretty awesome at this -- wildcards for paths and symbols for specific suppressions, “scope” support for suppressions, etc. But we need to use it more, and come up with some best practices here. Ideally a user who knew what they were doing could alter some personal suppressions, which wouldn’t affect the master history but might keep them more sane; more than a few folks end up just not being able to deal with some of the verbosity.
  • Read that first link, you won’t regret it. =)
    Also, Perl::Critic is a very good thing. I use it from inside Padre all the time… It took a little while to get used to, but I’m much happier with my newer code now that I’ve adapted.
Static Code Analysis and AutoLint

    1. Static Code Analysis (& AutoLint)
       Leander Hasty <leander at>
       “The more I push code through static analysis, the more I'm amazed that computers boot at all.” - Ryan C. Gordon (@icculus)
    2. Static Code Analysis
       Essentially, pre-runtime code checking. Identifies:
         • syntax errors
         • likely logic / semantic errors
         • “code smells” and best practices violations
         • style violations
    3. Benefits - Why should I care?
       “The most important thing I have done as a programmer in recent years is to aggressively pursue static code analysis. Even more valuable than the hundreds of serious bugs I have prevented with it is the change in mindset about the way I view software reliability and code quality.” - John Carmack
         • covers language weaknesses
             o often before the language can evolve
             o distilled experience of veteran devs
         • offsets human error
         • code quality
         • metrics (personal & organization)
         • canned expertise
    4. Problems and Annoyances
         • verbosity
         • erroneous warnings (false positives), due to:
             o misconfiguration
                 - include paths
                 - defines
                 - language variants
             o code style
         • missed problems (false negatives)
       “[...] if you have a large enough codebase, any class of error that is syntactically legal probably exists there.” - Carmack
    5. Approaches and Solutions
         1. Hyperlink to “what is this” and “how to fix”.
             o Helps with learning and burndown.
         2. Keep a searchable, unfiltered history.
         3. Promote important messages; hide (but don’t lose) less important ones.
             o If you must suppress, do so at narrowest level.
         4. Try different lint tools.
         5. Customize configuration: per project, per platform, and per team.
         6. Annotate source as needed; “cooperate as much as possible” with lint.
    6. A Few Examples
         • C/C++: Coverity ($$$$$), Insure++ ($$$$-dyn too), MS /analyze (free/$$$$), PVS Studio ($$$$), Gimpel PC-Lint ($$$), clang /analyze (free), Visual Assist ($$-IDE)
         • C#: Coverity, ReSharper, FxCop
         • Obj-C: OCLint
         • Bash: shellcheck
         • Lua: metalua, lualint, luachecker, SWELL*
         • Perl: Perl::Critic, B::Lint
         • Python: pychecker, pyflakes, pylint, pep8
         • ...and many others...
    7. Aside: Dynamic (Runtime) Checks
         • BoundsChecker
         • Insure++
         • IBM Rational
         • valgrind
             o cachegrind
             o memcheck
             o helgrind
         • MS Application Verifier
         • electricfence
         • dmalloc
         • dlmalloc
         • glibc debug
             o malloc
             o stl
       Also, VMs and emulators!
    8. AutoLint
       AutoLint for rev 2635 of [URL]
       committed by [name] at 2012-08-28 17:11:28 -0400 (Tue, 28 Aug 2012)
       See [URL] for full lint results.

       Overall:
         Scanned at least 187 files.
         Total 3412 errors. (change: -25)
       By type:
         37 Errors.
         1595 Warnings. (change: -6)
         1111 Infos. (change: -4)
         669 Notes. (change: -15)
       By cluster:
         cluster "const correctness": 171 (change: -8)
         cluster "missing in initializer list": 57
         cluster "uninitialized members": 3
    9. Mailout, continued...
       diff of lint-s-rev2609.txt versus lint-s-rev2635.txt
       ---------------------------------------------------------------------------
       NEW ENTRIES:
       --------------
       srcstatesSpotlightState.cpp(693): error 534: (Warning -- Ignoring return value of function 'game::ActorModelImpl::switchState(unsigned long)' (compare with line 72, file src3dActorModelImpl.h))
       srcstatesSpotlightState.cpp(886): error 534: (Warning -- Ignoring return value of function 'game::ActorModelImpl::switchState(unsigned long)' (compare with line 72, file src3dActorModelImpl.h))
       srcstatesSpotlightState.cpp(997): error 732: (Info -- Loss of sign (arg. no. 3) (long to unsigned long))
       srcstatesSpotlightState.cpp(640): error 732: (Info -- Loss of sign (assignment) (long to unsigned long))
       srcstatesSpotlightState.h(143): error 1762: (Info -- Member function 'game::SpotlightState::isTrinaBumped(void)' could be made const --- Eff. C++ 3rd Ed. item 3)
       srcstatesSpotlightState.cpp(646): error 641: (Warning -- Converting enum 'game::DanceStates' to 'int')
       srcstatesSpotlightState.cpp(886): error 641: (Warning -- Converting enum 'game::DanceStates' to 'int')
       ---------------------------------------------------------------------------
       OLD ENTRIES:
       --------------
       srcstatesSpotlightState.cpp(733): error 788: (Info -- enum constant 'game::SpotlightState::ActorStates::State::ENDWIN' not used within defaulted switch)
       srcstatesSpotlightState.cpp(323): error 534: (Warning -- Ignoring return value of function 'game::ActorModelImpl::switchState(unsigned long)' (compare with line 72, file src3dActorModelImpl.h))
       [...]
    10. AutoLint Goals
         • Have a full history for mining.
         • Filter and hide, don’t suppress.
         • Push important data.
             o Allow pull / query, but don’t wait for it.
             o Clusters to emphasize important issues.
         • Lint incrementally:
             o decrease time to report, and
             o provide tailored reports, covering a single commit.
         • Try to keep maintenance cost low.
         • Decouple tools - “unix philosophy”.
         • Very few external dependencies.
    11. Structure
       Outputs are stats, diffstats, diff, and full report.
       Mail includes the full diff + diffstats header.
       Components:
         • runner
         • stats
         • sort and dedupe
         • diff (scrubbed)
         • dep generation
         • ldap lookup
       Additional:
         • Makefile
         • templates
    12. External Dependencies
         • Jenkins (CI)
             o kick off jobs as needed
         • GNU make
             o why? complexity / code size / maintenance
         • GNU diff
             o why? speed and memory (Algorithm::Diff etc didn’t cut it)
         • svn, find, cygwin - these are problematic:
             o parsing “svn” and “svnversion” is error-prone
             o File::Find is far better in Perl
             o cygwin increases software stack depth, crashes
    13. Modules - a big part of “Why Perl?”
         • Template toolkit
             o so much less code for output generation…
         • IPC::Run3
             o subtools can return a lot of different ways
             o tools use stdout and stderr
         • Mail::Sendmail
             o ...there’s probably a better choice here eventually...
         • Net::LDAP
         • XML / XSLT (prior versions)
         • ...and others: Carp, English, Fcntl, File::Temp, IO::File, Readonly, ...
    14. New Platforms and Problems
         • .lnt config “include” / hierarchy helps…
             o compilers
             o libs / SDKs
             o policies (site, platform, project)
         • local vs global analysis
         • externals / submodules = pain
         • “culprits”
             o if the lint run hasn’t succeeded in a few revs, who gets the mailout?
         • manual recovery path
             o “start again here”
    15. Future of AutoLint
         • HTML/CSS+JS and hyperlinks
             o search interface, reference pane
         • database versus flatfile, if needed?
         • return to global mode
         • “learning” mode:
             o watch this build, set yourself up based on the log
             o perhaps gather deps from file open activity
         • easier per-file / per-dir / per-lib customization
         • per-user mailout customization
         • integrate into local builds, pre-commit
    16. Select References
         • “Static Code Analysis”, John Carmack:
         • various, Bruce Dawson:
         • “Bugs Considered Harmful with Douglas Crockford” (Hanselminutes podcast)
         • Gimpel PC-Lint:
         • Perl::Critic:
       Questions?
       Code may be available on request (we’ve given it out before, but it is a little specific to our use-case, and the CEO likes to know about interest).