
Splint the C code static checker

A presentation about splint, the C code static checker.

Published in: Technology

  1. Splint the C code static checker
      Pedro Pereira, Ulisses Costa
      Formal Methods in Software Engineering, May 28, 2009
  2. Outline
      1 Introduction
      2 Unused variables
      3 Types
      4 Memory management
      5 Control Flow
      6 Buffer sizes
      7 The Ultimate Test: wu-ftpd
      8 Pros and Cons
      9 Conclusions
  3. Lint for detecting anomalies in C programs
      Statically checking C programs for:
      - Unused declarations
      - Type inconsistencies
      - Use before definition
      - Unreachable code
      - Ignored return values
      - Execution paths with no return
      - Infinite loops
  4. Splint
      Specification Lint and Secure Programming Lint.
      Annotations can be attached to:
      - Functions
      - Variables
      - Parameters
      - Types
  6. Unused variables
      Splint detects instances where the value of a location is used before it is defined. Annotations can be used to describe what storage must be defined and what storage may be undefined at interface points. Without annotations, Splint assumes that all storage reachable from any of the following is defined before and after a function call:
      - a global variable
      - a parameter to a function
      - a function return value
  7. Undefined Parameters
      Sometimes, function parameters or return values are expected to reference undefined or partially defined storage.
      - The out annotation denotes a pointer to storage that may be undefined.
      - The in annotation can be used to denote a parameter that must be completely defined.

      usedef.c:

          extern void setVal (/*@out@*/ int *x);
          extern int getVal (/*@in@*/ int *x);
          extern int mysteryVal (int *x);

          int dumbfunc (/*@out@*/ int *x, int i) {
            if (i > 3)
              return *x;
            else if (i > 1)
              return getVal (x);
            else if (i == 0)
              return mysteryVal (x);
            else {
              setVal (x);
              return *x;
            }
          }

      Output:

          > splint usedef.c
          usedef.c:7: Value *x used before definition
          usedef.c:9: Passed storage x not completely defined (*x is undefined): getVal (x)
          usedef.c:11: Passed storage x not completely defined (*x is undefined): mysteryVal (x)
          Finished checking --- 3 code warnings
  9. Types
      Strong type checking often reveals programming errors. Splint can check primitive C types more strictly and flexibly than typical compilers.
      Built-in C types: Splint supports stricter checking of built-in C types. The char and enum types can be checked as distinct types, and the different numeric types can be type-checked strictly.
      Characters: The primitive char type can be type-checked as a distinct type. If char is used as a distinct type, common errors involving assigning ints to chars are detected. If the charint flag is on (+charint), char types are indistinguishable from ints.
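As a rough illustration of the char/int distinction described above, the sketch below keeps the narrowing from int back to char explicit. The function name to_upper_ascii is hypothetical and not taken from the slides, and the example assumes an ASCII character set; it only shows the style of code that strict char checking encourages, not what Splint would report for it.

```c
/* Hypothetical example, not from the slides: converts an ASCII lowercase
   letter to uppercase and leaves every other character unchanged. */
char to_upper_ascii(char c)
{
    if (c >= 'a' && c <= 'z') {
        /* The arithmetic is carried out in int. The explicit cast
           documents the narrowing back to char, which is the kind of
           int-to-char assignment a strict checker looks at. */
        return (char) (c - 'a' + 'A');
    }
    return c;
}
```

With char treated as a distinct type, an implicit version of the same assignment (without the cast) is the sort of statement one would expect to be questioned.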
  10. Types - Enums
      An error is reported if:
      - a value that is not an enumerator member is assigned to the enum type;
      - an enum type is used as an operand to an arithmetic operator.
      If the enumint flag is on, enum and int types may be used interchangeably.
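The two enum rules above can be respected by never doing arithmetic on an enum value and never assigning a raw int to it. The following is a minimal sketch under that assumption; the color type and next_color function are hypothetical names chosen for illustration.

```c
/* Hypothetical enum used only for this sketch. */
typedef enum { RED, GREEN, BLUE } color;

/* Advances to the next color without arithmetic on the enum value and
   without assigning an int to it, so every value returned is an
   enumerator member. */
color next_color(color c)
{
    switch (c) {
    case RED:   return GREEN;
    case GREEN: return BLUE;
    case BLUE:  return RED;
    }
    return RED; /* not reached for valid enumerators */
}
```

The shorter `(c + 1) % 3` would compile with an ordinary C compiler, but it uses an enum as an arithmetic operand and assigns an int result to an enum, which are the two cases listed on the slide.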
  12. Memory management
      About half the bugs in typical C programs can be attributed to memory management problems.
      - Some only appear sporadically.
      - Some may only become apparent when the program is compiled on a different platform.
      Splint detects many memory management errors at compile time:
      - Using storage that may have been deallocated
      - Memory leaks
      - Returning a pointer to stack-allocated storage
  13. Memory management - Memory Model
      - An object is a typed region of storage.
      - Some objects use a fixed amount of storage (that is allocated and deallocated by the compiler).
      - Other objects use dynamic memory storage that must be managed by the program.
      - Storage is undefined if it has not been assigned a value, and defined after it has been assigned a value.
      - An object is completely defined if all storage that may be reached from it is defined.
  14. Memory management - Memory Model (cont.)
      What storage is reachable from an object depends on the type and value of the object.
      Example: if p is a pointer to a structure, p is completely defined if the value of p is NULL, or if every field of the structure p points to is completely defined.
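The "completely defined" rule for a structure pointer can be pictured with a small sketch. The record type and both functions are hypothetical; the out annotation is used as described on the Undefined Parameters slide, and an ordinary compiler treats it as a comment.

```c
#include <stddef.h>

/* Hypothetical structure used only for this sketch. */
typedef struct {
    int id;
    int count;
} record;

/* *r may be undefined on entry (out). Every field is assigned, so the
   structure r points to is completely defined on return. */
void record_init(/*@out@*/ record *r, int id)
{
    r->id = id;
    r->count = 0;
}

/* Accepts either of the two completely defined cases from the slide:
   a NULL pointer, or a pointer to a structure whose fields are all set. */
int record_total(const record *r)
{
    if (r == NULL)
        return 0;
    return r->id + r->count;
}
```

If record_init assigned only `r->id`, the structure would be partially defined, and reading `r->count` in record_total would be a use of undefined storage.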
  15. Memory management - Memory Model (cont.)
      Left side of an assignment:
      - When an expression is used as the left side of an assignment we say it is an lvalue.
      - Its location in memory is used, but not its value.
      - Undefined storage may be used as an lvalue since only its location is needed.
      Right side of an assignment:
      - When storage is used in any other way (on the right side of an assignment, as an operand to a primitive operator, or as a function parameter) we say it is used as an rvalue.
      - It is an anomaly to use undefined storage as an rvalue.
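The lvalue/rvalue distinction above can be seen in a few lines. This is a minimal sketch; the function name sum_first_two is hypothetical.

```c
/* Hypothetical example: shows where storage is used as an lvalue and
   where it is used as an rvalue. */
int sum_first_two(void)
{
    int a[2];          /* storage exists but is still undefined */

    a[0] = 1;          /* a[0] as an lvalue: only its location is needed */
    a[1] = a[0] + 1;   /* a[0] as an rvalue: allowed, it is defined by now */

    return a[0] + a[1];
}
```

Swapping the two assignments would read `a[0]` as an rvalue before it has been assigned, which is the anomaly the slide describes.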
  16. Memory management - Deallocation Errors
      Two kinds of deallocation error:
      - Deallocating storage when there are other live references to the same storage
      - Failing to deallocate storage before the last reference to it is lost
      Solution:
      - An obligation to release storage
      - This obligation is attached to the reference to which the storage is assigned
      The only annotation is used to indicate that a reference is the only pointer to the object it points to:

          /*@only@*/ /*@null@*/ void *malloc (size_t size);
  17. Memory management - Memory Leaks
      only.c:

          extern /*@only@*/ int *glob;

          /*@only@*/ int *f (/*@only@*/ int *x, int *y, int *z) {
            int *m = (int *) malloc (sizeof (int));
            glob = y;    // Memory leak
            free (x);
            *m = *x;     // Use after free
            return z;    // Memory leak detected
          }

      Output:

          > splint only.c
          only.c:4: Only storage glob (type int *) not released before assignment: glob = y
             only.c:1: Storage glob becomes only
          only.c:4: Implicitly temp storage y assigned to only: glob = y
          only.c:6: Dereference of possibly null pointer m: *m
             only.c:8: Storage m may become null
          only.c:6: Variable x used after being released
             only.c:5: Storage x released
          only.c:7: Implicitly temp storage z returned as only: z
          only.c:7: Fresh storage m not released before return
             only.c:3: Fresh storage m allocated
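For contrast with the faulty function above, here is a minimal sketch of how the release obligation carried by only might be passed around without leaking. The functions make_counter and consume_counter are hypothetical, and the sketch does not claim to be warning-free under every Splint flag setting.

```c
#include <stdlib.h>

/* Hypothetical allocator: the result is the only reference to fresh
   storage (or NULL on failure), so the caller inherits the obligation
   to release it. */
/*@only@*/ /*@null@*/ int *make_counter(int start)
{
    int *p = (int *) malloc(sizeof *p);

    if (p == NULL)
        return NULL;
    *p = start;
    return p;
}

/* Hypothetical consumer: the only parameter takes over the obligation,
   reads the value first and then releases the storage exactly once. */
int consume_counter(/*@only@*/ int *p)
{
    int value = *p;   /* read before the storage is released */

    free(p);
    return value;
}
```

The value is copied out before `free`, which avoids the use-after-release reported for `*m = *x` in only.c.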
  18. Memory management - Stack References
      - A memory error occurs if a pointer into the stack is live after the function returns.
      - Splint detects errors involving stack references exported from a function through return values or assignments to references reachable from global variables or actual parameters.
      - No annotations are needed to detect stack reference errors. It is clear from declarations if storage is allocated on the function stack.

      stack.c:

          int *glob;

          int *f (int **x) {
            int sa[2] = { 0, 1 };
            int loc = 3;

            glob = &loc;
            *x = &sa[0];
            return &loc;
          }

      Output:

          > splint stack.c
          stack.c:9: Stack-allocated storage &loc reachable from return value: &loc
          stack.c:9: Stack-allocated storage *x reachable from parameter x
             stack.c:8: Storage *x becomes stack
          stack.c:9: Stack-allocated storage glob reachable from global glob
             stack.c:7: Storage glob becomes stack
  20. Control Flow - Execution
      - Many of these checks are possible because of the extra information that is known in annotations.
      - To avoid spurious errors it is important to know something about the behaviour of called functions.
      - Without additional information, Splint assumes that all functions return and execution continues normally.
  21. Control Flow - Execution (cont.)
      The noreturn annotation is used to denote a function that never returns:

          extern /*@noreturn@*/ void fatalerror (char *s);

      Problem! There are also maynotreturn and alwaysreturns annotations, but Splint must assume that a function returns normally when checking the code, and it does not verify whether a function really returns.
  22. Control Flow - Execution (cont.)
      To describe non-returning functions more precisely, the noreturnwhentrue and noreturnwhenfalse annotations mean that a function never returns if its first argument is true or false, respectively:

          /*@noreturnwhenfalse@*/ void assert (/*@sef@*/ bool /*@alt int@*/ pred);

      - The sef annotation denotes a parameter as side-effect free.
      - The alt int annotation indicates that the parameter may be either a Boolean or an integer.
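To make the effect of noreturn concrete, the sketch below calls a fatal-error routine on a path that must not continue. The fatalerror here is a stand-in defined for this example (it takes a const char * rather than the char * of the slide's declaration), and checked_div is a hypothetical caller; an ordinary compiler sees the annotation as a comment.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the slide's fatalerror: prints a message and exits, so
   control never comes back to the caller. */
static /*@noreturn@*/ void fatalerror(const char *s)
{
    fprintf(stderr, "fatal: %s\n", s);
    exit(EXIT_FAILURE);
}

/* Hypothetical caller: with the annotation, a checker is told that the
   b == 0 path stops at fatalerror, so the division below is only
   reached when b is non-zero. */
int checked_div(int a, int b)
{
    if (b == 0) {
        fatalerror("division by zero");
    }
    return a / b;
}
```

As the slide notes, the annotation is trusted rather than verified: nothing here checks that fatalerror really never returns.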
  23. Control Flow - Undefined Behavior
      The order in which side effects take place in C is not entirely defined by the code. It is only fixed at sequence points, which include:
      - a function call (after the arguments have been evaluated);
      - the end of the controlling expression of an if, while, for or do statement;
      - the &&, || and ? operators (after the first operand has been evaluated).
  24. Control Flow - Undefined Behavior (cont.)
      order.c:

          extern int glob;
          extern int mystery (void);
          extern int modglob (void) /*@globals glob@*/ /*@modifies glob@*/;
          int f (int x, int y[]) {
            int i = x++ * x;
            y[i] = i++;
            i += modglob () * glob;
            i += mystery () * glob;
            return i;
          }

      Output:

          > splint order.c +evalorderuncon
          order.c:5: Expression has undefined behavior (value of right operand modified by left operand): x++ * x
          order.c:6: Expression has undefined behavior (left operand uses i, modified by right operand): y[i] = i++
          order.c:7: Expression has undefined behavior (value of right operand modified by left operand): modglob() * glob
          order.c:8: Expression has undefined behavior (unconstrained function mystery used in left operand may set global variable glob used in right operand): mystery() * glob
  25. Control Flow - Likely Infinite Loops
      Splint reports an error if it detects a loop that appears to be infinite. An error is reported for a loop that does not modify any value used in its condition test, either inside the body of the loop or in the condition test itself.

      loop.c:

          extern int glob1, glob2;
          extern int f (void) /*@globals glob1@*/ /*@modifies nothing@*/;
          extern void g (void) /*@modifies glob2@*/;
          extern void h (void);

          void upto (int x) {
            while (x > f ()) g ();
            while (f () < 3) h ();
          }

      Output:

          > splint loop.c +infloopsuncon
          loop.c:7: Suspected infinite loop. No value used in loop test (x, glob1) is modified by test or loop body.
          loop.c:8: Suspected infinite loop. No condition values modified. Modification possible through unconstrained calls: h
  26. Control Flow - Switches
      - Splint detects case statements with code that may fall through to the next case.
      - The casebreak flag controls reporting of fall-through cases.
      - The fallthrough annotation explicitly indicates that execution falls through to this case.

      switch.c:

          typedef enum {
            YES, NO, DEFINITELY,
            PROBABLY, MAYBE } ynm;

          void decide (ynm y) {
            switch (y) {
              case PROBABLY:
              case NO: printf ("No!");
              case MAYBE: printf ("Maybe");
              /*@fallthrough@*/
              case YES: printf ("Yes!");
            }
          }

      Output:

          > splint switch.c
          switch.c:9: Fall through case (no preceding break)
          switch.c:12: Missing case in switch: DEFINITELY
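One way to address both warnings above is to end every non-empty case explicitly and to list the missing enumerator. The sketch below is a hypothetical rewrite, not the authors' fix: decide_text returns the message instead of printing it, and grouping DEFINITELY with YES is an assumption made only for illustration.

```c
typedef enum { YES, NO, DEFINITELY, PROBABLY, MAYBE } ynm;

/* Hypothetical variant of decide: each non-empty case ends in a return,
   so nothing falls through, and every enumerator has a case. */
const char *decide_text(ynm y)
{
    switch (y) {
    case PROBABLY:      /* empty case: deliberately shares the NO branch */
    case NO:
        return "No!";
    case MAYBE:
        return "Maybe";
    case DEFINITELY:    /* the case that was missing in switch.c */
    case YES:
        return "Yes!";
    }
    return "";          /* not reached for valid enumerators */
}
```

Empty cases that share a branch, such as PROBABLY here, are the situation the original code already relies on at its line 7 without a reported warning.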
  27. Control Flow - Conclusion
      Splint has more control-flow checks, including:
      - Deep breaks
      - Complete logic
  29. Buffer sizes
      - Buffer overflow errors are a particularly dangerous type of bug in C.
      - They are responsible for about half of all security attacks.
      - C does not perform runtime bounds checking (for performance reasons).
      - Attackers can exploit program bugs to gain full access to a machine.
  30. Buffer sizes - Checking access
      Splint models blocks of memory using two properties:
      - maxSet: maxSet(b) denotes the highest index beyond b that can be safely used as an lvalue. For instance, for char buffer[MAXSIZE] we have maxSet(buffer) = MAXSIZE − 1.
      - maxRead: maxRead(b) denotes the highest index of a buffer that can be safely used as an rvalue.
      When a buffer is accessed as an lvalue, Splint generates a precondition constraint involving the maxSet property. When a buffer is accessed as an rvalue, Splint generates a precondition constraint involving the maxRead property.
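The maxSet constraint above is checked statically by Splint. As a rough runtime analogue, the sketch below performs the same comparison before a store. MAXSIZE, max_set_index and checked_store are hypothetical names introduced for this example only.

```c
#include <stddef.h>

#define MAXSIZE 8   /* hypothetical buffer size for this sketch */

/* Highest index that may be written in a buffer of `size` elements,
   mirroring maxSet(buffer) = MAXSIZE - 1 from the slide.
   The caller must pass size > 0. */
size_t max_set_index(size_t size)
{
    return size - 1;
}

/* Stores value at index i only when i <= maxSet(buf).
   Returns 1 if the store was performed and 0 if it was rejected. */
int checked_store(char *buf, size_t size, size_t i, char value)
{
    if (size == 0 || i > max_set_index(size))
        return 0;
    buf[i] = value;
    return 1;
}
```

For a `char b[MAXSIZE]`, a store at index MAXSIZE - 1 is accepted and a store at index MAXSIZE is rejected, which is the same boundary the generated precondition expresses.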
  31. Buffer sizes - Annotating Buffer Sizes
      - Function declarations may include requires and ensures clauses to specify assumptions about buffer sizes as function preconditions and postconditions.
      - When a function with a requires clause is called, the call site must be checked to satisfy the constraints implied by requires.
      - If the +checkpost flag is set, Splint warns if it cannot verify that a function implementation satisfies its declared postconditions.
  32. Buffer sizes - Annotating Buffer Sizes (cont.)
      Annotated declaration of strcpy:

          void /*@alt char *@*/ strcpy
            (/*@unique@*/ /*@out@*/ /*@returned@*/ char *s1, char *s2)
            /*@modifies *s1@*/
            /*@requires maxSet(s1) >= maxRead(s2) @*/
            /*@ensures maxRead(s1) == maxRead(s2) @*/;
  33. Buffer sizes - Annotating Buffer Sizes (cont.)
      Annotated declaration of strncpy:

          void /*@alt char *@*/ strncpy
            (/*@unique@*/ /*@out@*/ /*@returned@*/ char *s1, char *s2,
             size_t n)
            /*@modifies *s1@*/
            /*@requires maxSet(s1) >= (n - 1); @*/
            /*@ensures maxRead(s2) >= maxRead(s1) /\ maxRead(s1) <= n; @*/;
  34. Buffer sizes - Warnings
      - Bounds checking is more complex than the other checks done by Splint.
      - So, memory bounds warnings contain extensive information about the unresolved constraint.

      Excerpt from setChar.c:

          int buf[10];
          buf[10] = 3;

      Output:

          setChar.c:5:4: Likely out-of-bounds store: buf[10]
             Unable to resolve constraint: requires 9 >= 10
             needed to satisfy precondition: requires maxSet(buf @ setChar.c:5:4) >= 10
  35. Buffer sizes - Warnings (cont.)
      bounds.c:

          void updateEnv (char *str) {
            char *tmp;
            tmp = getenv ("MYENV");
            if (tmp != NULL)
              strcpy (str, tmp);
          }

      Output:

          > splint bounds.c +bounds +showconstraintlocation
          bounds.c:5: Possible out-of-bounds store: strcpy(str, tmp)
             Unable to resolve constraint: requires maxSet(str @ bounds.c:5) >= maxRead(getenv("MYENV") @ bounds.c:3)
             needed to satisfy precondition: requires maxSet(str @ bounds.c:5) >= maxRead(tmp @ bounds.c:5)
             derived from strcpy precondition: requires maxSet(<parameter 1>) >= maxRead(<parameter 2>)
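The unresolved constraint above says the destination must be able to hold everything readable from the source. A minimal sketch of enforcing that at run time follows; safe_copy is a hypothetical helper, and the sketch assumes that maxRead of a NUL-terminated string is the index of its terminator, i.e. strlen. It does not claim that Splint would accept this code without further annotations.

```c
#include <string.h>

/* Hypothetical helper: copies src into dst only when the strcpy
   precondition from the slides holds, maxSet(dst) >= maxRead(src),
   taken here as dst_size - 1 >= strlen(src).
   Returns 1 on success and 0 if the copy would overflow dst. */
int safe_copy(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0 || dst_size - 1 < strlen(src))
        return 0;
    strcpy(dst, src);   /* safe: the bound was checked just above */
    return 1;
}
```

In updateEnv, passing the size of str alongside the pointer and calling such a helper would give the caller a way to detect an environment value that is too long instead of overflowing the buffer.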
  37. The Ultimate Test: wu-ftpd
      - wu-ftpd version 2.5.0
      - 20,000 lines of code
      - It took less than four seconds to check all of wu-ftpd on a 1.2-GHz Athlon machine.
      - Splint detected the known flaws as well as finding some previously unknown flaws (!)
  38. The Ultimate Test: wu-ftpd (cont.)
      - Running Splint on wu-ftpd without adding annotations produced 166 warnings for potential out-of-bounds writes.
      - After adding 66 annotations, it produced 101 warnings: 25 of these indicated real problems and 76 were false positives.
  40. Pros and Cons
      Pros:
      - Lightweight static analysis detects software vulnerabilities.
      - Splint definitely improves code quality.
      - It is suitable for real programs...
      Cons:
      - ...although it produces many warning messages, which can lead to confusion.
      - It won’t eliminate all security risks.
      - It has not been developed since 2007; the project needs new volunteers.
  42. Conclusions
      - No tool will eliminate all security risks.
      - Lightweight static analysis tools (such as Splint) play an important role in identifying security vulnerabilities.
  43. Questions?
