
The Psychology of C# Analysis


Our C# expert Eric Lippert provides his take on the psychology of C# analysis, including the business case for C#, developer characteristics and analysis tools.

Published in: Technology

  1. The Psychology of C# Analysis • Eric Lippert, C# Analysis Architect, Coverity
  2. Intro
  3. Intro • Psychological factors in language design… • … and compiler error messages… • … and static analysis tools… • … and funny pictures of cats.
  4. Who is this guy? • Compiler developer / language designer at Microsoft from 1996 through 2012 • Visual Basic, VBScript, JScript, VS Tools for Office, C# / Roslyn • Static analysis architect for C# at Coverity since January • I will use “we” totally inconsistently • I have no formal background in static analysis • I take an engineering rather than academic approach
  5. This guy is you, not me
  6. Body
  7. The business case for C#
  8. The business case for C# • Productive, successful professional developers who target Microsoft platforms make those platforms more attractive to Microsoft’s customers • Original design goal was “a simple, modern, general-purpose language” • Any language with an 800-page specification is no longer simple, but modern and general-purpose still apply • Understanding developer psychology is key to achieving wide adoption of any developer tool
  9. Target C# Developer Characteristics • Professionals, not amateurs • Engineers, not hackers • Programming experts, not line-of-business experts • Pragmatists, not academics • Skeptics, not true believers • Conservatives, not radicals
  10. Conservatism
  11. Conservatism • C# developers hate breaking changes imposed by tools • Even trivial breaking changes are agonized over • In 11 years and 6 releases C# has never added a new reserved keyword • New keywords are contextual so as to not be breaking • This imposes considerable restrictions on new syntaxes • For example, consider iterator blocks: double yield = 123.4; yield return yield;
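A minimal sketch of why contextual keywords avoid breaking changes: inside an iterator block, `yield` acts as a keyword only when it directly precedes `return` or `break`, so code that already used `yield` as an identifier keeps compiling. The class and method names here are illustrative and do not come from the slides.

```csharp
using System;
using System.Collections.Generic;

public static class ContextualKeywordDemo
{
    // 'yield' is an ordinary local variable name here. It is only treated
    // as a keyword when it appears immediately before 'return' or 'break'.
    public static IEnumerable<double> Values()
    {
        double yield = 123.4;
        yield return yield;
    }

    public static void Main()
    {
        foreach (double value in Values())
        {
            Console.WriteLine(value);
        }
    }
}
```

Had `yield` been made a reserved keyword instead, the declaration `double yield = 123.4;` would have stopped compiling in every existing program that used that name.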
  12. Conservatism • C# app developers also hate breaking their users • Facilitating versionable components was a priority-one design goal • Numerous seemingly-counterintuitive features actually mitigate brittle-base-class failures: class Base { public void M(int x) { } } class Derived : Base { public void M(double x) { } } ... derived.M(123); // Base.M or Derived.M?
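The question on the slide can be answered with a small runnable sketch. In C#, if any method declared on the more derived type is applicable, methods declared on base types are dropped from the candidate set, so the call binds to `Derived.M(double)` even though `Base.M(int)` is the closer match for the argument. The return strings and the `OverloadDemo` wrapper are additions for illustration only.

```csharp
using System;

public class Base
{
    public string M(int x) { return "Base.M(int)"; }
}

public class Derived : Base
{
    public string M(double x) { return "Derived.M(double)"; }
}

public static class OverloadDemo
{
    public static string Call()
    {
        var derived = new Derived();
        // 123 converts implicitly to double, so Derived.M is applicable
        // and the base-class overload is never considered.
        return derived.M(123);
    }

    public static void Main()
    {
        Console.WriteLine(Call()); // prints "Derived.M(double)"
    }
}
```

The design choice protects the author of `Derived`: if a later version of `Base` adds `M(int)`, existing calls through `Derived` keep their meaning instead of silently rebinding to the new base method.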
  13. Conservatism
  14. Conservatism: C# 4.0 added dynamic dispatch to facilitate interoperability with dynamic languages and “legacy” object models • Enormous MVP community pushback • “I will use this feature correctly but my coworkers are going to abuse it and then I’m going to have to fix their god-awful hacked-up code” • Anything that makes the compiler less capable of finding bugs is met with skepticism and resistance • The feature was completely redesigned based on early feedback
  15. Error reporting psychology FAIL
  16. Error reporting psychology • Dealing with correct code is literally the smallest problem • “Roslyn” does syntactic analysis of broken code in the time between keystrokes; semantic analysis takes a little longer • Error messages need to be understandable, accurate, polite and diagnostic rather than prescriptive • Let’s take a look at some examples
  17. Error reporting psychology
  18. Error reporting psychology: “A params parameter must be the last parameter in a formal parameter list” Is this saying: • If there is a params parameter, it must be the last one? or • The last parameter and only the last parameter must always be a params parameter? or • The last parameter must be a params parameter; if others are as well, that’s fine too? The error is only clear if the feature is already understood
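For readers who do not already know the feature, the first reading is the intended one: a params parameter is optional, but if one is present it must come last. A short sketch with hypothetical method names:

```csharp
using System;

public static class ParamsDemo
{
    // Accepted: the params parameter is the last one in the list.
    public static int Sum(string label, params int[] values)
    {
        int total = 0;
        foreach (int value in values)
        {
            total += value;
        }
        return total;
    }

    // Accepted: a method needs no params parameter at all.
    public static int Twice(int x)
    {
        return 2 * x;
    }

    // Rejected if uncommented, because params is not the last parameter:
    // public static int Sum(params int[] values, string label) { return 0; }

    public static void Main()
    {
        Console.WriteLine(Sum("three values", 1, 2, 3));
        Console.WriteLine(Sum("no values"));
        Console.WriteLine(Twice(21));
    }
}
```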
  19. Error reporting psychology Error messages must read the mind of a developer who wrote broken code and figure out what they meant. class C { public virtual static void M(){} }
  20. Error reporting psychology
  21. Error reporting psychology Complex operator + (Complex x, Complex y) { ... User-defined operator must be declared static and public • This is an example of a prescriptive error done right • The user absolutely positively has to do this to overload an operator • Odds that they were not trying to overload an operator are low
  22. Warnings are harder than errors
  23. Warnings are harder than errors • Must infer the developer’s erroneous thoughts • Compiler must be fast • This creates an opportunity for third-party tools • Must be plausibly wrong • A warning for code that no one would reasonably type is unhelpful • Must be able to eliminate the warning • And ideally the warning should tell you how • Must have a low false positive rate • Encouraging developers to change correct code is harmful • We will return to this point later
  24. What do C# developers want? Rigidly defined areas of doubt and uncertainty • Static type checking, type safety, memory safety… • … that can be disabled if necessary. • A compiler that infers developer intent… • … with predictable behavior and understandable rules • Actionable errors when inference fails… • …rather than muddling on through and getting it wrong
  25. It hurts because it’s true
  26. C# was originally called SafeC. C# throws developers into the “Pit of Success”: • Eliminate unimportant dangerous features entirely • switch fall through • Restrict dangerous features to clearly-marked unsafe code regions • Eliminate implementation-defined behaviours • x = ++x + x++; is well-defined in C# … • …but still a bad idea. • Define common undefined behaviours • Accessing an array out of bounds causes an exception • Mandate compiler warnings. There are numerous defects that the Coverity C/C++ analysis checkers detect which are impossible, unlikely, or already warnings in C#. Let’s look at a few dozen. Quickly. These are all defects found by Coverity in C/C++ that are not worth checking in C#…
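Two of the points above can be checked directly. The sketch below, with illustrative names, evaluates the slide’s expression for a starting value of 5 and shows that an out-of-bounds array access raises an exception rather than reading arbitrary memory.

```csharp
using System;

public static class DefinedBehaviourDemo
{
    public static int SideEffectOrder()
    {
        int x = 5;
        // Operands are evaluated left to right:
        //   ++x sets x to 6 and yields 6,
        //   x++ yields 6 and sets x to 7,
        //   then the sum 12 is assigned to x.
        x = ++x + x++;
        return x;
    }

    public static bool OutOfBoundsThrows()
    {
        int[] items = new int[3];
        int index = 3; // one past the last valid index
        try
        {
            items[index] = 1;
            return false;
        }
        catch (IndexOutOfRangeException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(SideEffectOrder());   // prints 12
        Console.WriteLine(OutOfBoundsThrows()); // prints True
    }
}
```

Well-defined does not mean readable; as the slide says, the expression is still a bad idea.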
  27. C/C++ defects inapplicable to C#: • Local read before assignment • C# rejects programs that use uninitialized locals • Uninitialized fields / arrays • Fields and arrays are automatically zeroed out • Treating a pointer to a variable as a pointer to an array • Rare, must be marked as unsafe • Buffer length arithmetic errors • Strings and arrays know their lengths; checked at runtime • Pointer/integer/char/bool/enum type errors • Not inter-assignable in C# without explicit cast operators
  28. C/C++ defects inapplicable to C#: • Failure to consistently check error return codes • C# uses exceptions • Accidental sign extension • Either error or warning • Implementation-defined side effect order • Side effect order is well-defined • Statement with no effect • Is actually a parse-time error in C# • Accidental use of ambiguous names • C# requires that a simple name have a unique meaning in a block
  29. C/C++ defects inapplicable to C#: • sizeof mistakes • C#’s sizeof operator only takes types • Unintentional switch fall-through • Is an error • Unreachable code • Is a warning • Accidental assignment or comparison of variable to itself • Yep, that’s a warning too • Field never written or never read • Man that’s a lot of warnings • Missing return statement • Is illegal • malloc without free / free without malloc / allocator – deallocator mismatch / use after free • Not needed in a garbage-collected language • Dereferencing an address that lived longer than the storage it refers to • References to variables may not be stored in long-term storage • Accidental use of function pointer • Method group expressions can only be used in strictly limited locations • Overriding errors • The language was designed to mitigate brittle base class failures by default
  30. Of course the compiler is not perfect…
  31. Defects common to C/C++ and C# • Copy paste mistakes • Expression contains variables but always has the same result • You checked for null here, you dereferenced without checking there. • Some infinite loops • Dangling else and other indentation issues • Array index out of bounds • Integer overflow • checked arithmetic is off by default • Non-memory resource leaks • Such as forgetting to close a file • Stray semicolons • Swapped arguments • Unused return value • Uncaught exception • Missing or misordered critical sections • Including non-atomic operations inconsistently inside critical sections • And many more! And these are just a few that are common to C and C#; there are a whole host of defects specific to C# programs that we could find statically. Let’s consider the psychological aspects of static analysis tools beyond the compiler.
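As one illustration of a defect class the compiler leaves open, here is a sketch of the integer overflow item. It uses explicit `unchecked` and `checked` contexts so the behaviour does not depend on project settings; the class and method names are hypothetical.

```csharp
using System;

public static class OverflowDemo
{
    // In an unchecked context, which is the usual default for
    // non-constant expressions, overflow silently wraps around.
    public static int WrapAround(int value)
    {
        return unchecked(value + 1);
    }

    // In a checked context the same addition throws instead.
    public static bool ThrowsWhenChecked(int value)
    {
        try
        {
            checked
            {
                value += 1;
            }
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(WrapAround(int.MaxValue) == int.MinValue); // prints True
        Console.WriteLine(ThrowsWhenChecked(int.MaxValue));          // prints True
    }
}
```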
  32. Day one training at Coverity
  33. Developer Adoption is Key • Soundness is explicitly a non-goal • We don’t want to find all defects or even most defects • We want every defect reported to be a customer-affecting bug • Developers won’t adopt a product that they perceive as making their jobs harder for no customer benefit • Our business model requires adoption to drive renewals • How do developers – who, remember, are using C# because they like a statically-typed language – react to static analysis tools?
  34. Developer psychology WRT analysis tools
  35. Developer psychology WRT analysis tools • Egotistical • I don’t need this tool for my code • But my coworkers on the other hand… • Clever management uses this trait to advantage
  36. Developer psychology WRT analysis tools
  37. Developer psychology WRT analysis tools • Skeptical, conservative, dismissive • Resistant to change • Quick to criticize “stupid” false positives • The first five defects they see had better be true positives
  38. Developer psychology WRT analysis tools
  39. Developer psychology WRT analysis tools • “Busy” with, you know, “real work” • Code annotations are unacceptable • Analysis tool must adapt to customer’s build process • Overnight analysis runs are acceptable – barely
  40. Developer psychology WRT analysis tools
  41. Developer psychology WRT analysis tools • Any change in what defects are reported on the same code over time – a.k.a. “churn” – is the enemy • Randomized analysis is right out, unfortunately • Any improvement to our analysis heuristics can cause unwanted churn • We try to keep churn below 5% on every release
  42. Developer psychology WRT analysis tools
  43. Developer psychology WRT analysis tools • Responds well to perverse incentives • Hard-to-understand defect reports are easy to ignore • No downside to incorrectly triaging true positives as false positives • Finding defects is hard; presenting evidence that prevents incorrect classification as a false positive is harder • Deep analysis with theorem provers can be worse than shallow analysis with cheap heuristics. • Presenting the result is insufficient; the developer must understand the proof to fix the defect.
  44. Displaying good defect messages
  45. Displaying good defect messages public void GetThing(Type type, bool includeFrobs) { bool isFrob = (type != null) && typeof(IFrob).IsAssignableFrom(type); object instance = this.objects[]; if (instance is IFrob && includeFrobs) { [...] } else if (type.IsAssignableFrom(instance.GetType())) { [...] } }
  46. Displaying good defect messages public void GetThing(Type type, bool includeFrobs) { /* Assuming type is null. type != null evaluated to false. */ bool isFrob = (type != null) && typeof(IFrob).IsAssignableFrom(type); object instance = this.objects[]; /* instance is IFrob evaluated to true. includeFrobs evaluated to false. */ if (instance is IFrob && includeFrobs) { [...] } /* Dereference after null check: dereferencing type while it is null. */ else if (type.IsAssignableFrom(instance.GetType())) { [...] } }
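The defect pattern in the annotated slide can be reduced to a runnable sketch. This is a simplified, hypothetical reduction and not the code from the slide: `instance` is passed in as a parameter, the elided bodies are replaced by return values, and `Guarded` shows only one possible fix.

```csharp
using System;

public interface IFrob { }

public static class NullCheckDemo
{
    // Defective: comparing 'type' with null signals that null is a
    // possible value, yet the final line dereferences it unguarded.
    public static bool Defective(Type type, object instance, bool includeFrobs)
    {
        bool isFrob = (type != null) && typeof(IFrob).IsAssignableFrom(type);
        if (instance is IFrob && includeFrobs)
        {
            return isFrob;
        }
        // Throws NullReferenceException when type is null.
        return type.IsAssignableFrom(instance.GetType());
    }

    // One possible fix: guard the dereference on this path as well.
    public static bool Guarded(Type type, object instance, bool includeFrobs)
    {
        bool isFrob = (type != null) && typeof(IFrob).IsAssignableFrom(type);
        if (instance is IFrob && includeFrobs)
        {
            return isFrob;
        }
        return type != null && type.IsAssignableFrom(instance.GetType());
    }

    public static void Main()
    {
        Console.WriteLine(Guarded(null, new object(), false));
        try
        {
            Defective(null, new object(), false);
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("Defective dereferenced a null type");
        }
    }
}
```

The value of the annotated report is that it spells out this exact path: which branch was taken, under what assumption, and where the dereference happens.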
  47. Management psychology
  48. Management psychology • The first time static analysis runs there may be thousands of errors; typical rate is one defect per thousand LOC • Academic answer: rank heuristics • Pragmatic answer: ignore them all • Simply ignore all defects in existing code • Triage and fix defects in new code • “Someday” get around to fixing defects in old code • Why is this so popular? • Old code is in the field. It works well enough. Risk is low. • New code is unproven. It might work, or it might not. Risk is high.
  49. Management psychology
  50. Management psychology • Management actually pays for the developer tools • And typically has no idea how to use them effectively • Middle management has perverse incentives too • Time, cost and complexity are easily measured; quality is not • “Never upgrade the static analysis tool before release” • Worse tools are better; better tools are worse
  51. Worse is better; better is worse [Graph: known defects over time] No tool improvements == Management gets bonus
  52. Worse is better; better is worse [Graph: known defects over time] No tool improvements == Management gets bonus • Tool upgrades find more defects == Management gets no bonus • The fix rate is the same in these two graphs, but if the tool improves faster than the fix rate, no bonus.
  53. Good news: If you have a well-engineered product that: • makes good use of theoretical and pragmatic approaches, • finds real-world, user-affecting defects, and • takes developer and management psychology into account, then you can make a positive difference
  54. Conclusion
  55. Special thanks to Scott at
  56. Conclusion
  57. Conclusion • Theoretical static analysis techniques are awesome; we can and do use them in industry… • … but doing all that math is actually only one small part of shipping a static analysis product • Understanding developer and management psychology is necessary to ensure adoption of any developer tools • C# was carefully designed to match a target developer mindset • Coverity thinks about developer and manager psychology at every stage in the analysis and overall product design • Research into better ways to present defects would be awesome
  58. More information • Learn about Coverity at • Read “A Few Billion Lines Of Code Later” • Find me on Twitter at @ericlippert • Or read my C# blog at • Or ask me about C# at
  59. Copyright 2013 Coverity, Inc.