Static Analysis for PHP, from PHPDay Italy 2012
 


Techniques for static analysis of PHP, including Facebook's HipHop for PHP (HpHp) and ClamAV. First presented at PHPDay 2012, Verona, Italy, May 19, 2012. Companion source code is available at https://github.com/client9/hphp-tools

Comments

  • interesting approach, thanks for sharing
  • And another handy link for trying to build HpHp:
    http://php.webtutor.pl/en/2011/04/22/howto-install-hiphop-for-php-on-centos-5-x/
  • @nickgsuperstar or enforce all .php files ending with '?>', with a single trailing \n allowed... it leads to a small number of files ending with ?>, but at least it's very clear this was intentional

    also: great talk, thanks for all the useful info! :)
  • Someone asked how you can prevent an ending '?>' in PHP source code. These can be a problem since the last '?>' might have trailing whitespace or junk after it, which could cause problems if the file is included by another. To detect this, you'll have to use something like CodeSniffer or another tokenizer and see if anything comes after the last ?>. Regexp or grep might not work.
  • Also https://github.com/client9/hphp-tools I just uploaded a sample stubs file https://github.com/client9/hphp-tools/blob/master/hphp_stubs.php.txt

Presentation Transcript

  • Static Analysis for PHP
    PHPDay Verona, Italy 2012
    Nick Galbreath @ngalbreath nickg@etsy.com
  • http://slidesha.re/KzTfLy
    github.com/client9/hphp-tools
    Nick Galbreath @ngalbreath PHPDay Verona, Italy 2012
  • Static Analysis
    • Typically analyzes source code “at rest” for bugs, security problems, leaks, threading problems.
    • We’ll cover simple checks and HpHp.
    • Some commercial tools exist too. Veracode runs off of PHP byte code: http://www.veracode.com/products/static
  • Dynamic Analysis
    Analysis of code while running.
    • valgrind http://valgrind.org/
    • xdebug http://xdebug.org/
    • xhprof http://pecl.php.net/package/xhprof
    Great tools, but not for this talk.
  • Simple Static Analysis
  • The Littlest Static Analysis: php -l
    • Syntax errors should never be committed.
    • Syntax errors should never go to prod!
    • Make sure dev and prod versions of PHP are identical.
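
The `php -l` check can be wrapped in a small helper for use from a commit hook. The following is a minimal sketch, not the tooling used in the talk: `lint_file()` is a hypothetical name, and it assumes the script runs under the CLI so that `PHP_BINARY` points at the interpreter you want to lint with (ideally the same version as production).

```php
<?php
/**
 * Hypothetical helper: run `php -l` on one file.
 * Returns true when the file parses cleanly, false otherwise.
 */
function lint_file($path)
{
    // PHP_BINARY is the interpreter running this script (assumption: CLI).
    $cmd = escapeshellarg(PHP_BINARY) . ' -l ' . escapeshellarg($path) . ' 2>&1';
    exec($cmd, $output, $exitCode);

    // `php -l` exits with 0 when no syntax errors are found.
    return $exitCode === 0;
}
```

A hook could call `lint_file()` for every staged `.php` file and reject the commit as soon as one call returns false.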
  • PHP Leading Whitespace
    Pre-commit check that every file starts with either #! or <?php exactly.
  • PHP Trailing Whitespace
    Check that the file ends exactly with ?> or make sure it doesn’t have a closing tag.
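
Both whitespace checks can be approximated with plain string functions. This is a rough sketch under stated assumptions: the function names are made up for illustration, a single newline after the closing tag is tolerated (PHP itself swallows one newline there), and the check is only meant for code-only files. It looks at raw text, so a closing tag inside a string literal or a template file can fool it; a tokenizer-based check is more robust.

```php
<?php
/** Hypothetical check: the source must begin exactly with "#!" or "<?php". */
function has_clean_open($source)
{
    return strncmp($source, '#!', 2) === 0
        || strncmp($source, '<?php', 5) === 0;
}

/**
 * Hypothetical check: either there is no closing tag at all, or nothing
 * except at most one newline follows the last closing tag.
 */
function has_clean_close($source)
{
    $pos = strrpos($source, '?>');
    if ($pos === false) {
        return true; // no closing tag, so nothing can trail it
    }
    // Cast to string because older PHP versions return false for an empty tail.
    $tail = (string) substr($source, $pos + 2);

    return $tail === '' || $tail === "\n";
}
```

In a hook, both functions would be applied to `file_get_contents($path)` for each changed file.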
  • Anti-Virus On Source Code
    • It’s static analysis too!
    • Not so concerned with PHP, but do you have JavaScript, Flash, Word, PowerPoint, PDFs, ZIPs in your source tree?
  • ClamAV
    • http://www.clamav.net/
    • Free anti-virus.
    • Available on every OS.
  • ClamAV Performance
    1G of source code / minute. Why not do it?
  • Advanced Static Analysis
  • Why not use... AST?
    http://docs.php.net/manual/en/function.token-get-all.php
    • token_get_all($source) takes PHP source code as a string and returns a flat array of tokens, not a true Abstract Syntax Tree.
    • Orders of magnitude slower -- can’t use for pre-commit check on large code bases.
    • Too low level -- need to turn it into an intermediate representation.
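
To make the "too low level" point concrete, here is a small sketch of what `token_get_all()` hands back: a flat list in which each entry is either a one-character string or an `[id, text, line]` array. The two helper names are hypothetical. The second helper uses the token stream to spot output after the final closing tag, which shows how even a simple check needs its own bookkeeping on top of the tokens.

```php
<?php
/** Hypothetical helper: list the token names for a piece of PHP source. */
function token_names($source)
{
    $names = array();
    foreach (token_get_all($source) as $token) {
        // Array tokens carry a numeric id; single characters are plain strings.
        $names[] = is_array($token) ? token_name($token[0]) : $token;
    }
    return $names;
}

/**
 * Hypothetical helper: return any text emitted after the final closing tag,
 * or null when the file ends inside PHP code or right at the closing tag.
 */
function trailing_output_after_close_tag($source)
{
    $tokens = token_get_all($source);
    $last = end($tokens);

    // Text outside PHP tags is reported as T_INLINE_HTML.
    if (count($tokens) > 1 && is_array($last) && $last[0] === T_INLINE_HTML) {
        return $last[1];
    }
    return null;
}
```

Note that the tokenizer folds a single newline directly after the closing tag into the closing-tag token itself, so only additional characters show up as trailing output.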
  • Why not use... CodeSniffer?
    http://pear.php.net/package/PHP_CodeSniffer
    • Excellent tool, but...
    • Based on token_get_all
    • SAX-style API
    • Too slow for pre/post commit hooks
  • Why not use...
    • php-SAT: Orphaned 2009
    • php-AST: Orphaned 2008
    • phc: active but doesn’t support... OBJECTS
    • Every other PHP to Java translator or converter is orphaned or has other problems
  • Facebook’s HpHp
    • A full re-implementation of Apache+PHP
    • Compiles PHP to its own byte code format and executes in its own runtime.
    • May also translate to C++ for other compilation or use JIT
    • Does type-inference for speed-ups
    • Also includes an HTTP web server
  • Bad News #1: No action since 2011-12-06
    Facebook appears to use “code drops” instead of a true “streaming” open source model. BOO.
  • Bad News #2: Missing Many Common Modules
    • Has: apc, array, bcmath, bzip2, ctype, curl, iconv, gd, imap, ipc, json, ldap, math, mb, mcrypt, memcache, mysql, network, openssl, pdo, posix, preg, process, session, simplexml, soap, socket, sqlite3, stream, string, thread, thrift, url, xml*, zlib.
    • That’s it! (No filter_var, no ftp, no ..)
  • Bad News #3: Doesn’t Track PHP 5.3
    PHP 5.4? No way!
    • Some function signatures aren’t quite right, e.g. debug_print_backtrace
        • HpHp: 2 arguments
        • PHP 5.3.6: 3 arguments
        • PHP 5.4: 4 arguments
    • (End up needing to whitelist this to ignore false positives)
  • Bad News #4: Seriously #*$%&!# annoying to build
    • My crappy CentOS build script: https://github.com/client9/hphp-tools
    • Ubuntu users are slightly better off (see HpHp wiki)
    • Takes hours.
  • Bad News #5: Won’t help with Dynamic Evaluation
        $fn = "foo";
        $fn(1,2,3); // function not found
        eval("foo(1,2,3);"); // no
    • This is more for runtime dynamic analysis.
    • Try to avoid this anyways.
  • Conclusion
    • You aren’t going to run your application under HpHp (at least not as is).
    • But it has a great static analyzer that works and finds real bugs really fast.
    • Scans thousands of files in a few seconds.
  • Using HpHp
  • Step 1: Make a constants file
    • HpHp doesn’t know about hardwired constants
    • Nifty script generates the constants
    • May need to hand edit
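
The slides do not show the script that generates the constants file. One way such a generator could look, purely as an illustrative sketch: collect the constants the running interpreter defines with `get_defined_constants()` and write them back out as `define()` calls. Both function names are hypothetical, and the output will probably still need hand editing, as the slide warns.

```php
<?php
/** Hypothetical helper: constants defined by PHP and its extensions, not by scripts. */
function runtime_constants()
{
    $merged = array();
    foreach (get_defined_constants(true) as $group => $constants) {
        if ($group === 'user') {
            continue; // constants created by define() in user code
        }
        $merged += $constants;
    }
    return $merged;
}

/** Hypothetical helper: render a name => value map as a PHP file of define() calls. */
function render_constants_file(array $constants)
{
    $lines = array('<?php');
    foreach ($constants as $name => $value) {
        // Skip resources and other values that cannot be written as literals.
        if (!is_scalar($value) && $value !== null) {
            continue;
        }
        $lines[] = sprintf(
            'define(%s, %s);',
            var_export($name, true),
            var_export($value, true)
        );
    }
    return implode("\n", $lines) . "\n";
}
```

Writing `render_constants_file(runtime_constants())` to a file such as `constants.php` (an assumed name) gives a starting point that can be trimmed by hand.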
  • Step 2: Make a stubs file
    • HpHp doesn’t have many binary extensions.
    • But... the analyzer doesn’t care. Just make a stub function.
        // http://php.net/manual/en/function.filter-var.php
        function filter_var($var, $filter=0, $options=NULL) { return $var; }
    You can make stub classes as well.
  • Step 3: Create the file list
    • Create a list of all PHP files to be analyzed and include your constants and stubs file.
    • Ignore phpunit and other tests.
    • HpHp implements much of PHP base functionality as PHP code (e.g. the Exception class is written in PHP). You need to add these system PHP files as well.
  • correction:
        grep -v helper.idl.php | grep -v constants.php >> $JUNK
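
The slides build the file list with shell tools. The same idea can be sketched in PHP, assuming that tests live in directories named `test`, `tests` or `phpunit` (an assumption about the project layout) and that the constants and stubs files are passed in as extra entries. `build_file_list()` is a hypothetical name, and the HpHp system files mentioned above would still have to be added separately.

```php
<?php
/**
 * Hypothetical helper: collect every .php file under $root, skipping test
 * directories, then append extra files such as the constants and stubs files.
 */
function build_file_list($root, array $extraFiles = array())
{
    $files = array();
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->getExtension() !== 'php') {
            continue;
        }
        $path = $file->getPathname();
        // Match only on the part below $root, with forward slashes.
        $relative = str_replace('\\', '/', substr($path, strlen($root)));
        if (preg_match('#/(tests?|phpunit)/#i', $relative)) {
            continue; // assumed test directory names
        }
        $files[] = $path;
    }
    sort($files);

    return array_merge($files, $extraFiles);
}
```

The result can be written one path per line with `implode("\n", ...)` and handed to the analyzer.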
  • Step 4: Do it
    Include paths are a bit mysterious. You’ll have to play around to get it right.
  • Step 5: Analyze it
    • /tmp/hphp/Stats.js contains some... statistics in JSON format.
    • /tmp/hphp/CodeError.js is where the good stuff is.
    • JSON format, includes: error type, file, line number, code snippet
  • UseUndeclaredVariable
    • #1 bug.
    • Typically typos, scoping or cut-n-paste errors.
    • Found frequently in error handling cases:
        if (!$ok) {
            error_log("$user_id has a problem");
        }
  • TooManyArgument / TooFewArgument
    Too Many Arguments typically indicates the caller is confused and has logic errors (bug). Too Few Arguments is frequently a serious bug as PHP silently fails and defaults to null.
        hash_hmac('sha1', 'foo'); // oops, no key
  • BadDefine / UsesEvaluation
        define($k, $v);
        eval("1+1");
    • “Bad” since HpHp can’t compile it, but likely legal PHP.
    • Avoid using dynamic constant generation. Use a configuration file instead.
  • UseUndeclaredGlobalVariable
    • HpHp only defines certain globals.
    • Used only by Smarty?
    • $GLOBALS[HTTP_SERVER_VARS]
    • $GLOBALS[HTTP_SESSION_VARS]
    • $GLOBALS[HTTP_COOKIE_VARS]
  • UseVoidReturn
    Some function returns “nothing” but the value is used.
        function foo() {
            if (time() % 60 == 0) {
                return true;
            }
            // oops void
        }
        $now = foo(); // error
  • RequiredAfterOptionalParam
        function foo($first, $second=2, $third) {
    • IMHO should be a PHP syntax error
    • Confusing
    • (Oops, I haven’t investigated behavior)
  • DeclaredConstantTwice
    • Probably not invalid PHP, but HpHp analyzes all files at once.
    • Best to have one file that defines constants or just not use them.
  • UnknownFunction / UnknownObjectMethod / UnknownClass / UnknownBaseClass
    • Is your file list complete?
    • Do you need to make stubs?
  • BadPHPIncludeFile
    Likely a PHP file trying to include/require itself, an invalid file name, or an ambiguous autoloader.
  • PHPIncludeFileNotFound
    • Really common
    • Probably unique to your autoloader.
    • Not sure I quite understand how HpHp computes file names and loads includes, requires, require_once... yet
  • HpHp at Etsy
  • Every Commit
    • Every commit gets checked in real-time
    • “try-server” also allows developers to test before committing.
    • Finds and prevents bugs before they go live every day.
    • Almost no false positives (!!)
    • Developers love it (especially the Java groups)
  • Analysis
    • CodeError.js is processed through a custom script.
    • Has a large blacklist of checks or files we don’t care about (3rd party, known bad, etc).
    • File and line info pass through to git blame to find author and date/time.
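
Etsy's custom script is not shown in the slides. The blacklist and `git blame` steps could be sketched roughly as follows, assuming the findings from CodeError.js have already been normalised into arrays with `type`, `file` and `line` keys. That shape, like both function names, is an assumption made for illustration; the real CodeError.js layout is not reproduced here.

```php
<?php
/**
 * Hypothetical helper: drop findings whose check type is blacklisted or
 * whose file lives under an ignored path prefix (3rd party code, etc).
 */
function filter_findings(array $findings, array $ignoredTypes, array $ignoredPathPrefixes)
{
    $kept = array();
    foreach ($findings as $finding) {
        if (in_array($finding['type'], $ignoredTypes, true)) {
            continue;
        }
        foreach ($ignoredPathPrefixes as $prefix) {
            if (strncmp($finding['file'], $prefix, strlen($prefix)) === 0) {
                continue 2; // skip this finding entirely
            }
        }
        $kept[] = $finding;
    }
    return $kept;
}

/** Hypothetical helper: build the git blame command for one finding. */
function blame_command(array $finding)
{
    $line = (int) $finding['line'];

    // -L limits blame to the reported line; --porcelain is machine readable.
    return sprintf(
        'git blame -L %d,%d --porcelain -- %s',
        $line,
        $line,
        escapeshellarg($finding['file'])
    );
}
```

Running the returned command with `exec()` inside the repository yields author and timestamp fields for the offending line.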
  • hphp-try runs in Jenkins
    oops
    Console Output gives details
  • Work in Progress
    • It took a lot of work to get the code base in shape so we could add a pre-commit hook.
    • Over 200 real problems first identified.
    • We have still blacklisted some checks since we are still cleaning up legacy code (and figuring out how HpHp works)
  • Can We Do Better?
  • Checks aren’t that complicated
    • HpHp’s runtime type-inference isn’t used for static analysis (good since type-inference is hard)
    • All checks are fairly simple book-keeping.
    • All could be done in CodeSniffer/AST but too slow
  • Slice off HpHp?
    • The HpHp Runtime is nice, but really complicated and a moving target.
    • Can we slice out the analysis part of HpHp?
    • Much simpler to build, easier to hack on.
  • Or Build New?
    • Can this run off “byte code” or hook into the parsing step of PHP?
    • Exec a snippet of PHP for loading the script files?
    • Seems feasible
  • Acknowledgments and References
  • Thanks
    • The Facebook Team!
    • Sebastian Bergmann, who first blogged about using HpHp for static analysis
    • Rasmus, who first hacked up a version of HpHp in house at Etsy
    • The QA and DevTools teams at Etsy
    • All the Etsy developers who had some painful weeks getting the code in shape!
  • Facebook References
    • https://github.com/facebook/hiphop-php Main source repo + wiki
    • http://developers.facebook.com/blog/post/2010/02/02/hiphop-for-php--move-fast/ Main announcement, 2010-02-02
    • https://www.facebook.com/note.php?note_id=416880943919 Update 2012-08-13
  • Notes from Sebastian Bergmann
    • http://sebastian-bergmann.de/archives/894-Using-HipHop-for-Static-Analysis.html Static Analysis Intro, 2010-07-27
    • http://sebastian-bergmann.de/archives/918-Static-Analysis-with-HipHop-for-PHP.html Tool to help process output, 2012-01-27
  • Misc References
    • http://arstechnica.com/business/2011/12/facebook-looks-to-fix-php-performance-with-hiphop-virtual-machine/ ArsTechnica overview, 2011-12-13
    • http://www.serversidemagazine.com/news/10-questions-with-facebook-research-engineer-andrei-alexandrescu/ Lots of good stuff in here, 2012-01-29
  • This Talk
    • These slides are posted at http://slidesha.re/KzTfLy
    • Tools for building on CentOS: https://github.com/client9/hphp-tools
    • More about Nick Galbreath: http://client9.com/
  • Nick Galbreath nickg@etsy.com @ngalbreath
    PHPDay Verona, Italy, May 19, 2012 http://2012.phpday.it/