HipHop Virtual Machine


Presentation about the HipHop Virtual Machine given at Pentalog's headquarters in Cluj-Napoca on the 20th February 2014.

Published in: Technology
  • @popra: HHVM actually supports the entire PHP 5.4 specification. But, for the sake of completeness, take a look at this file: . The inconsistencies mentioned there are fairly rare; you won't usually run into them. The graph you saw in the presentation is the latest one; the reasons the parity is not 100% on those frameworks are unknown to me. I searched for days on blogs, websites etc. and couldn't find anything about this. Parity fixes are done on each release (every 8 weeks), so if it makes you feel safer, you could wait another couple of months before using HHVM in production.
  • Do you know what's the parity for Symfony now? And if the missing stuff is critical?
  • Alternatives: simpletest, behat
  • HipHop Virtual Machine

    1. Agenda
       • Introduction
         • What is HipHop VM?
         • History and why it exists
       • Architecture and Features
         • General Architecture
         • Code cache
         • JIT
         • Garbage Collector
         • AdminServer
         • FastCGI
         • Extensions
       • HHVM-friendly PHP code
       • Parity
    2. What is HipHop VM?
       • High-level, stack-based virtual machine that executes PHP code
       • Created by Facebook in a (successful) attempt to reduce load on their servers
       • New versions are released every 8 weeks, on a Thursday. 10 days before a release, the branch is cut and heavily tested.
    3. History of HHVM (I)
       • Summer 2007: Facebook started developing HPHPc, a PHP-to-C++ translator.
       • It worked by:
         • Building an AST based on the PHP code
         • Generating equivalent C++ code from that AST
         • Compiling the C++ code to a binary using g++
         • Uploading the binary to the webservers, where it was executed
       • This resulted in significant performance improvements, up to 500% in some cases compared to PHP 5.2
    4. History of HHVM (II)
       • The success of HPHPc was so great that the engineers decided to give it a developer-friendly brother: HPHPi
       • HPHPi was just like HPHPc, but it ran in interpreted mode only (i.e. much slower)
       • However, it provided a lot of utilities for developers:
         • Debugger (known as HPHPd)
         • Setting watches, breakpoints
         • Static code analysis
         • Performance profiling
       • It also didn't require the compilation step to run the code
       • HPHPc ran over 90% of FB production code by the end of 2009
       • HPHPc was open-sourced in February 2010
    5. History of HHVM (III)
       • But good performance came at a cost:
         • Static compilation was very cumbersome
         • The binary was 1 GB in size, which was a problem since production code had to be pushed to the servers DAILY
         • Maintaining compatibility between HPHPc and HPHPi was getting more and more difficult (they used different formats for their ASTs)
       • So, at the beginning of 2010, FB started developing HHVM, which was a better, longer-term solution
       • At first, HHVM replaced only HPHPi, while HPHPc remained in production
       • But now, all of FB's production servers run on HHVM
       • FB claims a 3x to 10x speed boost and a 0.5x to 5x memory reduction compared to PHP + APC. This, of course, is on their own code; most applications will see a more modest improvement
    6. General Architecture (I)
       • The general architecture is made up of:
         • 2 webservers
         • A translator
         • A JIT compiler
         • A Garbage Collector
       • HHVM doesn't run on every OS:
         • It supports most flavours of Linux
         • It has some support for Mac OS X (only runs with JIT turned off)
         • There is no Windows support
       • The OS must have a 64-bit architecture in order for HHVM to work
    7. General Architecture (II)
       • HHVM goes through the following steps to execute a PHP script:
         • Based on the PHP code, build an AST (the implementation for this was reused from HPHPc)
         • Based on the AST, build HipHop Bytecode (HHBC), similar to Java's or the CLR's bytecode
         • Cache the HHBC
         • At runtime, pass the HHBC through the JIT compiler (if enabled), which transforms it to machine code
         • Execute the machine code or, if JIT is disabled, execute the HHBC in interpreted mode (not as fast, but still faster than Zend PHP)
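The steps on this slide (source, AST, stack bytecode, execution) can be illustrated with a toy model. This is only a sketch in Python, not HHVM code: the instruction names `PUSH`, `ADD`, `SUB`, `MUL` and the functions `compile_to_bytecode` and `interpret` are invented for the example, and it handles arithmetic expressions only.

```python
import ast

# Hypothetical opcode names for this toy VM (not real HHBC instructions).
BINARY_OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}


def compile_to_bytecode(source):
    """Parse an arithmetic expression into an AST, then emit stack bytecode."""
    code = []

    def emit(node):
        if isinstance(node, ast.Constant):
            code.append(("PUSH", node.value))
        elif isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            emit(node.left)    # operands first ...
            emit(node.right)
            code.append((BINARY_OPS[type(node.op)],))  # ... then the operator
        else:
            raise ValueError("unsupported syntax: " + ast.dump(node))

    emit(ast.parse(source, mode="eval").body)
    return code


def interpret(code):
    """Execute the bytecode on an operand stack (the 'interpreted mode')."""
    stack = []
    for instruction in code:
        opcode = instruction[0]
        if opcode == "PUSH":
            stack.append(instruction[1])
            continue
        right = stack.pop()
        left = stack.pop()
        if opcode == "ADD":
            stack.append(left + right)
        elif opcode == "SUB":
            stack.append(left - right)
        elif opcode == "MUL":
            stack.append(left * right)
    return stack.pop()


bytecode = compile_to_bytecode("2 + 3 * 4")
result = interpret(bytecode)
```

A JIT would take the same `bytecode` list and translate it to machine code instead of looping over it; that step is left out here.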
    8. Code Cache (I)
       • When a request comes in, HHVM determines which file to serve, then checks whether the file's HHBC is in the SQLite-based cache
         • If yes, it's executed
         • If no, HHVM compiles it, optimizes it and stores it in the cache
       • This is very similar to APC
       • There's a warm-up period when a new server is created, because the cache is empty
       • However, HHVM's cache lives on disk, so it survives server restarts and there will be no more warm-up periods for that file
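The lookup-or-compile flow described on this slide could look roughly like the sketch below. It is a simplified Python model, assuming a single table keyed by filename plus a hash of the source; the class `BytecodeCache`, the table layout and the fake `_compile` step are all invented for illustration and do not reflect HHVM's actual repo schema.

```python
import hashlib
import sqlite3


class BytecodeCache:
    """Toy SQLite-backed bytecode cache (hypothetical schema, not HHVM's)."""

    def __init__(self, path=":memory:"):
        # Pass a file path instead of ":memory:" to survive restarts.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS units ("
            "filename TEXT PRIMARY KEY, source_hash TEXT, bytecode BLOB)"
        )
        self.compilations = 0

    def _compile(self, source):
        # Stand-in for the real parse/optimize/emit pipeline.
        self.compilations += 1
        return source.upper().encode("utf-8")

    def load(self, filename, source):
        digest = hashlib.sha1(source.encode("utf-8")).hexdigest()
        row = self.db.execute(
            "SELECT source_hash, bytecode FROM units WHERE filename = ?",
            (filename,),
        ).fetchone()
        if row is not None and row[0] == digest:
            return row[1]                      # cache hit: no compilation
        bytecode = self._compile(source)       # cache miss or stale entry
        self.db.execute(
            "INSERT OR REPLACE INTO units VALUES (?, ?, ?)",
            (filename, digest, bytecode),
        )
        self.db.commit()
        return bytecode


cache = BytecodeCache()
first = cache.load("index.php", "<?php echo 1;")
second = cache.load("index.php", "<?php echo 1;")  # served from the cache
```

The first `load` pays the compilation cost (the warm-up); the second one does not. With an on-disk database file, a fresh process would also start with a warm cache.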
    9. Code Cache (II)
       • But the warm-up period can be bypassed by doing pre-analysis
       • Pre-analysis means the cache can be generated before HHVM starts up
       • The pre-analyser will actually work a little harder and will do a better job at optimizing code
    10. Code Cache (III)
       • There is a mode called RepoAuthoritative mode
       • Normally, HHVM checks on each request whether the PHP file has changed, in order to know if the cache must be updated
       • RepoAuthoritative mode means this check is no longer performed
       • But be careful: if the file is not in the cache, you'll get an HTTP 404 error, even though the PHP file is right there
       • RepoAuthoritative is recommended for production because it avoids a lot of disk I/O and files change rarely anyway
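The difference between the two modes can be modelled in a few lines. This is a sketch under assumptions: the `Repo` class, the use of a modification time to detect changes, and the `disk` dictionary standing in for the filesystem are all hypothetical and chosen only to make the behaviour visible.

```python
def compile_unit(source):
    # Stand-in for real compilation to bytecode.
    return "BYTECODE(" + source + ")"


class Repo:
    """Toy bytecode repo with an optional 'authoritative' mode."""

    def __init__(self, authoritative):
        self.authoritative = authoritative
        self.units = {}        # filename -> (mtime, bytecode)
        self.disk_checks = 0   # how often the filesystem was consulted

    def precompile(self, filename, source, mtime=0):
        # Pre-analysis: fill the repo before any request arrives.
        self.units[filename] = (mtime, compile_unit(source))

    def serve(self, filename, disk):
        """`disk` maps filename -> (mtime, source) and plays the filesystem."""
        if self.authoritative:
            # The repo is the only source of truth: the disk is never touched.
            unit = self.units.get(filename)
            return (404, None) if unit is None else (200, unit[1])

        self.disk_checks += 1
        if filename not in disk:
            return (404, None)
        mtime, source = disk[filename]
        cached = self.units.get(filename)
        if cached is None or cached[0] != mtime:
            cached = (mtime, compile_unit(source))   # recompile stale entry
            self.units[filename] = cached
        return (200, cached[1])


disk = {"a.php": (1, "echo 1;")}

authoritative = Repo(authoritative=True)
missing = authoritative.serve("a.php", disk)   # file exists, repo is empty
authoritative.precompile("a.php", "echo 1;")
found = authoritative.serve("a.php", disk)
```

Here `missing` is a 404 even though `a.php` is present on the simulated disk, which is exactly the pitfall the slide warns about.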
    11. JIT Compiler
       • Just-in-Time compilation is done during execution, not before
       • It translates an intermediate form of code (in this case HHBC) to machine code
       • A JIT compiler will constantly check which code paths are executed more frequently and try to optimize those as well as possible
       • Since a JIT compiler compiles to machine code at runtime, the resulting machine code is optimized for that platform or CPU, which will sometimes make it faster than even static compilation
    12. JIT Compiler (II)
       • HHVM uses so-called tracelets as the basic unit of JIT compilation
       • A tracelet is usually a loop, because most programs spend most of their time in some "hot loops" and subsequent iterations of those loops take similar paths
       • A tracelet has 3 parts:
         • Type guard(s): prevent execution for incompatible types
         • Body
         • Link to subsequent tracelet(s)
       • Each tracelet has great freedom, but it is required to restore the VM to a consistent state any time execution escapes
       • Tracelets have only ONE execution path, which means no control flow, which makes them easy to optimize
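The three parts of a tracelet (guards, body, link) can be pictured with a small model. This Python sketch is purely illustrative: the `Tracelet` class, the `run` driver and the `generic_add` fallback are invented names, and a real tracelet is machine code rather than a Python function.

```python
class Tracelet:
    """Toy tracelet: type guards, a straight-line body, and a link."""

    def __init__(self, guard_types, body, next_tracelet=None):
        self.guard_types = guard_types      # variable name -> expected type
        self.body = body                    # function(env), no branching
        self.next_tracelet = next_tracelet  # link to the following tracelet

    def guards_pass(self, env):
        return all(
            type(env.get(name)) is expected
            for name, expected in self.guard_types.items()
        )


def run(tracelet, env, fallback):
    """Follow the chain of tracelets; bail out when a guard fails."""
    while tracelet is not None:
        if not tracelet.guards_pass(env):
            # Execution escapes: env is still consistent because the
            # body has not run, so a slower generic path can take over.
            return fallback(env)
        tracelet.body(env)
        tracelet = tracelet.next_tracelet
    return env


def add_ints(env):
    # Specialized body: valid only because the guards proved both are ints.
    env["c"] = env["a"] + env["b"]


def generic_add(env):
    # Slow path standing in for the interpreter: coerce, then add.
    env["c"] = int(env["a"]) + int(env["b"])
    env["slow_path"] = True
    return env


int_add = Tracelet({"a": int, "b": int}, add_ints)

fast = run(int_add, {"a": 1, "b": 2}, generic_add)
slow = run(int_add, {"a": "1", "b": 2}, generic_add)
```

Both calls compute the same sum, but only the first one stays on the specialized path; the second fails the type guard on `a` and falls back.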
    13. Garbage Collector
       • Most modern languages have automatic memory management
       • In the case of VMs, this is done by a Garbage Collector (GC)
       • There are 2 major types of GCs:
         • Refcounting: for each object, there is a count that constantly keeps track of how many references point to it
         • Tracing: periodically, during execution, the GC scans each object and determines if it's reachable. If not, it deletes it
       • Tracing is easier to implement and more efficient, but PHP requires refcounting, so HHVM uses refcounting
       • FB engineers want to move to a tracing approach and they might get it done someday
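The two strategies can be compared on a tiny object graph. The following Python sketch is a teaching model, not how HHVM manages memory: `Obj`, `add_ref`, `release` and `trace` are hypothetical names, and "freeing" an object just means recording its name.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []      # outgoing references to other objects
        self.refcount = 0   # number of incoming references


def add_ref(src, dst):
    src.refs.append(dst)
    dst.refcount += 1


def release(obj, freed):
    """Refcounting: drop one reference, free as soon as the count hits 0."""
    obj.refcount -= 1
    if obj.refcount == 0:
        freed.append(obj.name)
        for child in obj.refs:
            release(child, freed)


def trace(roots, heap):
    """Tracing: walk from the roots, report every object never reached."""
    reachable = set()
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if id(obj) in reachable:
            continue
        reachable.add(id(obj))
        pending.extend(obj.refs)
    return [obj.name for obj in heap if id(obj) not in reachable]


# A chain a -> b, with one outside reference held on a.
a, b = Obj("a"), Obj("b")
a.refcount = 1
add_ref(a, b)
freed = []
release(a, freed)          # dropping the last reference frees a, then b

# A cycle c <-> d with no outside references at all.
c, d = Obj("c"), Obj("d")
add_ref(c, d)
add_ref(d, c)
garbage = trace([], [c, d])
```

In the cycle, both counts stay at 1 and never reach zero, so pure refcounting would not reclaim `c` and `d`, while the tracing pass reports both as unreachable. Refcounting, on the other hand, frees `a` and `b` immediately when the last reference goes away.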
    14. AdminServer
       • HHVM will actually start 2 webservers:
         • The regular one on port 80
         • The AdminServer on the port you specify
       • It can be accessed at a URI like http://localhost:9191/check-health?auth=mypasshaha
       • The AdminServer can turn JIT on/off and show statistics about traffic, queries, memcache, CPU load, number of active threads and many more
    15. FastCGI
       • HHVM supports FastCGI starting with version 2.3.0 (released in December 2013)
       • FastCGI is a communication protocol used by webservers to communicate with other applications
       • The support for FastCGI means we don't have to use HHVM's poor webserver, but can instead use something like Apache or nginx and let HHVM do what it does best: execute PHP code at lightning speed
       • Supporting FastCGI will make HHVM enter even more production systems and increase its popularity
    16. Extensions
       • HHVM supports extensions just like PHP does
       • They can be written in PHP, C++ or a combination of the 2
       • Extensions are loaded for each request; you don't have to keep loading an extension all over your application
       • To use a custom extension, you add it to the extensions and then recompile HHVM. The resulting binary will contain your extension and you can then use it
       • By default, HHVM already contains the most popular extensions, like MySQL, PDO, DOM, cURL, PHAR, SimpleXML, JSON, memcache and many others
       • However, it doesn't include MySQLi at this time
    17. HHVM-friendly Code (I)
       • Write code that HHVM can understand without running it, code that contains as much static detail as possible
       • Avoid things like:
         • Dynamic function calls: $function_name()
         • Dynamic variable names: $a = $$x + 1;
         • Functions like compact(), get_defined_vars(), extract() etc.
       • Don't access dynamic properties of an object. If you want to access a property, declare it. Accessing dynamic properties must use hashtable lookups, which are much slower.
       • Where possible, provide:
         • Type hinting in function parameters
         • Return type of functions should be as obvious as possible:
    18. HHVM-friendly Code (II)
       • Code that runs in global scope is never JIT-ed.
       • Any code anywhere can mutate the variables in the global scope. So, since PHP is weakly typed, it is impossible for the JIT compiler to predict the type of a global variable
       • Example:

           class B {
               public function __toString() {
                   // Side effect: silently changes the type of the global $a
                   $GLOBALS['a'] = 'Hello, world !';
                   return '';  // __toString() must return a string
               }
           }
           $a = 5;       // $a is an integer here
           $b = new B;
           echo $b;      // after this line, $a is a string
    19. Parity (I)
       • All this is great, but can HHVM actually run real-world code? Well, in December 2013, it looked like this (taken from the HHVM blog):
       • [Graph from the HHVM blog: framework parity as of December 2013]
    20. Parity (II)
       • The HHVM engineers' main goal is to be able to run all PHP frameworks by Q4 2014 or Q1 2015.
    21. Q&A