Php extensions workshop

Published in: Software, Technology

  1. 1. PHP5 Extensions Workshop
  2. 2. Good morning  Julien PAULI  PHP programmer for many years  PHP internals source hacker  5.5 and 5.6 Release Manager  Writes tech articles and books  http://www.phpinternalsbook.com  http://jpauli.github.io  Working at SensioLabs in Paris  Mainly doing cool C stuff on PHP / Symfony2  @julienpauli - github.com/jpauli - jpauli@php.net
  3. 3. The road  Compile PHP and use debug mode  PHP extensions details and lifetime  PHP extensions globals management  Memory management  PHP variables : zvals  PHP INI settings  PHP functions, objects and classes  Overwriting existing behaviors  Changing deep Zend Engine behaviors
  4. 4. What you should bring  A laptop running Linux/Unix  Good C knowledge  Linux knowledge (your-OS knowledge)  Any C dev environment  These slides assume a Debian-based Linux  We won't bother with thread safety  You can just ignore the TSRM* things you'll meet
  5. 5. Compiling PHP, using debug  Grab the PHP source code from php.net or git  Install a C compiling environment  You'll probably need some libs to compile PHP :$> apt-get install build-essential autoconf :$> apt-get install libxml2-dev
  6. 6. Compiling PHP, using debug  Compile a debug PHP and install it  Do not forget debug flag  Extensions compiled against debug PHP won't load on "normal" PHP => need recompile  Always develop extension under debug mode :$> ./configure --enable-debug --prefix=my/install/dir :$> make && make install :$> my/install/dir/bin/php -v PHP 5.5.6-dev (cli) (built: Nov 5 2013 17:33:45) (DEBUG) Copyright (c) 1997-2013 The PHP Group Zend Engine v2.5.0, Copyright (c) 1998-2013 Zend Technologies
  7. 7. Create your first extension  There exists a skeleton generator, let's use it :$> cd phpsrc/ext :$phpsrc/ext> ./ext_skel --extname=extworkshop Creating directory extworkshop Creating basic files: config.m4 config.w32 .svnignore extworkshop.c php_extworkshop.h CREDITS EXPERIMENTAL tests/001.phpt extworkshop.php [done]. :~> cd extworkshop && tree . |-- config.m4 |-- config.w32 |-- CREDITS |-- EXPERIMENTAL |-- extworkshop.c |-- extworkshop.php |-- php_extworkshop.h `-- tests `-- 001.phpt
  8. 8. Activate your first extension  config.m4 tells the build tools about your ext  Uncomment --enable if your extension is standalone  Uncomment --with if your extension depends on other libraries :~/extworkshop> vim config.m4 PHP_ARG_ENABLE(extworkshop, whether to enable extworkshop support, [ --enable-extworkshop Enable extworkshop support]) PHP_NEW_EXTENSION(extworkshop, extworkshop.c, $ext_shared)
  9. 9. Compile and install your ext  phpize tool is under `php-install-dir`/bin  It's a shell script importing PHP sources into your ext dir for it to get ready to compile  It performs some checks  It imports the configure script  This will make you compile a shared object  For static compilation, rebuild main configure using buildconf script  Run phpize --clean to clean the env when finished :~/extworkshop> phpize && ./configure --with-php-config=/path/to/php-config && make install
  10. 10. API numbers  PHP Api Version is the version number of the internal API  ZendModule API is the API of the extension system  ZendExtension API is the API of the zend_extension system  ZEND_DEBUG and ZTS are about debug mode activation and thread safety layer activation  These 5 criteria need to match your extension's when you load it  Different PHP versions have different API numbers  Extensions may not work across PHP versions Configuring for: PHP Api Version: 20121113 Zend Module Api No: 20121212 Zend Extension Api No: 220121212
  11. 11. Check your extension install :~> path/to/php -dextension=extworkshop.so -m [PHP Modules] bcmath bz2 ... extworkshop ... [Zend Modules] :~> path/to/php -dextension=extworkshop.so --re extworkshop Extension [ <persistent> extension #51 extworkshop version 0.1.0 ] { - Functions { Function [ <internal:extworkshop> function confirm_extworkshop_compiled ] { } } }
  12. 12. What extensions can do  Extensions can :  Add new functions, classes, interfaces  Add and manage php.ini settings and phpinfo() output  Add new global variables or constants  Add new stream wrappers/filters, new resource types  Overwrite what other extensions defined  Hook by overwriting global function pointers  Extensions cannot :  Modify PHP syntax  Zend extensions :  Are able to hook into OPArrays (very advanced usage)
  13. 13. Why create an extension ?  Bundle an external library code into PHP  redis, curl, gd, zip ... so many of them  Optimize performance by adding features  C is way faster than PHP  C is used everywhere in Unix/Linux, including Kernel  Create your own C structures and manage them by providing PHP functions  Create your own resource intensive algorithms  Example : https://github.com/phadej/igbinary
  14. 14. C vs PHP  Don't try to turn the world to C  Why you should use PHP over C :  Developing in C is way more difficult than in PHP  C is less maintainable  C can be really tricky to debug  C is platform dependent. Cross-platform support can turn into a PITA  Cross-PHP-Version is a pain  Why you should use C over PHP :  Bundle an external lib into PHP (can't be done in PHP)  Looking for very high speed and fast/efficient algos  Changing PHP behavior deeply, make it do what you want
  15. 15. PHP and extensions lifetime
  16. 16. Extensions lifetime
  17. 17. Extension lifetime through code zend_module_entry extworkshop_module_entry = { STANDARD_MODULE_HEADER, "extworkshop", extworkshop_functions, PHP_MINIT(extworkshop), PHP_MSHUTDOWN(extworkshop), PHP_RINIT(extworkshop), PHP_RSHUTDOWN(extworkshop), PHP_MINFO(extworkshop), PHP_EXTWORKSHOP_VERSION, STANDARD_MODULE_PROPERTIES };
  18. 18. Memory Management
  19. 19. Reminders  Memory alloc main pools :  Stack  Heap  Memory alloc classes :  auto  static  dynamic  Memory check tools :  Zend Memory Manager  Valgrind / electric fence
  20. 20. Zend Memory Manager API  Request-lifetime heap memory should be allocated and freed using the ZMM API  Memory that outlives the request can be allocated using the ZMM "persistent" API, or direct libc calls #define emalloc(size) #define safe_emalloc(nmemb, size, offset) #define efree(ptr) #define ecalloc(nmemb, size) #define erealloc(ptr, size) #define safe_erealloc(ptr, nmemb, size, offset) #define erealloc_recoverable(ptr, size) #define estrdup(s) #define estrndup(s, length) #define zend_mem_block_size(ptr)
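
The `safe_` variants in the list above take `nmemb, size, offset` separately so that the computation `nmemb * size + offset` can be checked for overflow before anything is allocated. The following is a minimal standalone sketch of that idea in plain C, not the Zend implementation: `checked_size()` and `checked_alloc()` are hypothetical names, libc `malloc()` stands in for the Zend allocator, and returning NULL on overflow is a simplification chosen for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: computes nmemb * size + offset.
 * Returns 1 and stores the result in *out, or returns 0 on overflow. */
int checked_size(size_t nmemb, size_t size, size_t offset, size_t *out)
{
    assert(out != NULL);
    if (size != 0 && nmemb > (SIZE_MAX - offset) / size) {
        return 0; /* nmemb * size + offset would not fit in a size_t */
    }
    *out = nmemb * size + offset;
    return 1;
}

/* Hypothetical helper: allocates nmemb * size + offset bytes with malloc(),
 * or returns NULL when the size computation overflows. */
void *checked_alloc(size_t nmemb, size_t size, size_t offset)
{
    size_t total;

    if (!checked_size(nmemb, size, offset, &total)) {
        return NULL;
    }
    return malloc(total);
}
```

A caller would write `checked_alloc(count, sizeof(item), header_size)` instead of multiplying by hand, which is the same calling shape as `safe_emalloc(nmemb, size, offset)`.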
  21. 21. ZMM help  ZMM alloc functions track leaks for you  They help finding leaks and overwrites  If PHP is built with --enable-debug  If report_memleaks is On in php.ini (default)  Always use ZMM alloc functions  Don't hesitate to use valgrind to debug memory  USE_ZEND_ALLOC=0 env var disables ZendMM
  22. 22. PHP Variables : zval
  23. 23. Zval intro  Zval is the basis of every extension typedef union _zvalue_value { long lval; /* long value */ double dval; /* double value */ struct { char *val; /* string value (binary safe) */ int len; /* string length */ } str; HashTable *ht; /* hash table value */ zend_object_value obj; /* object value */ } zvalue_value; struct _zval_struct { zvalue_value value; /* value */ zend_uint refcount__gc; /* refcount */ zend_uchar type; /* active type */ zend_uchar is_ref__gc; /* is_ref flag */ }; typedef struct _zval_struct zval;
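
The zval above is a tagged union: `value` holds one of several representations and `type` says which one is currently valid. Here is a small standalone sketch of the same pattern in plain C. Everything in it (`tagged_val`, `TAG_*`, `tagged_format()`) is a hypothetical toy for illustration and does not use the engine headers.

```c
#include <assert.h>
#include <stdio.h>

/* Toy type tags, standing in for the engine's type constants. */
enum tagged_type { TAG_LONG, TAG_DOUBLE, TAG_STRING };

/* Toy value: one storage area shared by all types, plus a tag. */
typedef struct tagged_val {
    union {
        long lval;
        double dval;
        struct { const char *val; int len; } str;
    } value;
    enum tagged_type type; /* says which union member is valid */
} tagged_val;

/* Formats the active member into buf and returns what snprintf() returns.
 * Reading any member other than the one named by the tag is a bug. */
int tagged_format(const tagged_val *v, char *buf, size_t size)
{
    assert(v != NULL && buf != NULL);
    switch (v->type) {
    case TAG_LONG:
        return snprintf(buf, size, "%ld", v->value.lval);
    case TAG_DOUBLE:
        return snprintf(buf, size, "%g", v->value.dval);
    case TAG_STRING:
        return snprintf(buf, size, "%.*s", v->value.str.len, v->value.str.val);
    }
    return -1; /* unknown tag */
}
```

The switch on the tag is what the engine's accessor macros hide from extension code: you are expected to check the type first, then read the matching member.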
  24. 24. Zval steps  There exists tons of macros helping you :  Allocate memory for a zval  Change zval type  Change zval real value  Deal with is_ref and refcount  Free zval  Copy zval  Return zval from PHP functions  ...
  25. 25. Zval main macros  Memory management : ALLOC_ZVAL() INIT_ZVAL() ALLOC_INIT_ZVAL() MAKE_STD_ZVAL() zval_ptr_dtor() zval_dtor()  Play with refcount/is_ref : Z_REFCOUNT() Z_ADDREF() Z_DELREF() Z_ISREF() Z_SET_ISREF() Z_UNSET_ISREF() Z_SET_REFCOUNT()  Copy / separate : ZVAL_COPY_VALUE() ZVAL_ZVAL() INIT_PZVAL_COPY() zval_copy_ctor() MAKE_COPY_ZVAL() SEPARATE_ZVAL() SEPARATE_ZVAL_IF_NOT_REF() SEPARATE_ZVAL_TO_MAKE_IS_REF()  Type juggling : Z_TYPE() ZVAL_BOOL() ZVAL_STRING() ZVAL_STRINGL() ZVAL_EMPTY_STRING() ZVAL_TRUE() ZVAL_FALSE() ZVAL_LONG() ZVAL_DOUBLE() ZVAL_RESOURCE()
  26. 26. Zval and pointers  You'll manipulate, basically :  zval : use MACRO()  zval* : use MACRO_P()  zval ** : use MACRO_PP()  Read macros expansions  Use your IDE Play with pointers Z_ADDREF(myzval) Z_ADDREF_P(myzval *) Z_ADDREF_PP(myzval **)
  27. 27. Eclipse macro debugger
  28. 28. Zval Types  long = 4 or 8 bytes (depends on OS and arch)  Use SIZEOF_LONG macro to know  double = 8 bytes IEEE754  strings are binary safe  They are NUL terminated, but may contain embedded NULs  They therefore store their length as an int  length = number of bytes, not counting the terminating NUL  Many macros to take care of them  Bools are stored as a long (1 or 0)  Resources are stored as a long (ResourceId)  Arrays = HashTable type (more later)  Objects = lots of things involved (more later)
  29. 29. Zval Types macros  When you want to read or write a Zval, you once again use dedicated macros : zval *myval; ALLOC_INIT_ZVAL(myval); ZVAL_DOUBLE(myval, 16.3); ZVAL_BOOL(myval, 1); ZVAL_STRINGL(myval, "foo", sizeof("foo")-1, 1); ZVAL_EMPTY_STRING(myval); printf("%.*s", Z_STRLEN_P(myval), Z_STRVAL_P(myval)); printf("%ld", Z_LVAL_P(myval)); ...
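
To make "binary safe" concrete: a string that carries its own length can contain NUL bytes, while libc functions such as `strlen()` and `strcmp()` stop at the first one. The sketch below is standalone plain C. `counted_str`, `COUNTED_STR_LITERAL` and `counted_str_equals()` are hypothetical names that only mirror the val/len pair of a string zval.

```c
#include <assert.h>
#include <string.h>

/* Toy counted string, mirroring the val/len pair of a string zval. */
typedef struct counted_str {
    const char *val; /* bytes, NUL terminated for convenience */
    int len;         /* number of bytes, terminating NUL excluded */
} counted_str;

/* Builds a counted string from a literal, so embedded NULs are kept.
 * sizeof() counts the terminating NUL, hence the "- 1". */
#define COUNTED_STR_LITERAL(s) { (s), (int)(sizeof(s) - 1) }

/* Compares two counted strings byte by byte, embedded NULs included. */
int counted_str_equals(const counted_str *a, const counted_str *b)
{
    assert(a != NULL && b != NULL);
    return a->len == b->len && memcmp(a->val, b->val, (size_t)a->len) == 0;
}
```

With `"foo\0bar"` and `"foo\0baz"`, `strcmp()` reports equality because it only sees `"foo"`, while the length-aware comparison sees the difference in the last byte.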
  30. 30. Zval type switching  You can ask PHP to switch from a type to another, using its known internal rules  Those functions change the zval*, returning void  Use the _ex() API if separation is needed convert_to_array() convert_to_object() convert_to_string() convert_to_boolean() convert_to_null() convert_to_long()
  31. 31. Zval gc info  is_ref  Boolean value 0 or 1  1 means zval has been used as a reference  Engine behavior changed when is_ref = 1  Copy or not copy ? The answer is changed if is_ref=1  Refcount  Represents the number of different places where this zval is used in PHP (in the entire process)  1 = you are the only one using it (hopefully)  >1 = the zval is used elsewhere  Take care not to modify it if not wanted (separate it before)  Bad refcount management leads to crashes or leaks
  32. 32. Creating and destroying zvals  Creation = allocation + gc initialization  Can be done at once (+ sets value to IS_NULL)  Destruction = decrement refcount and if refcount reaches 0 : free  Tip : Don't try to use malloc() manually for zvals  Tip : Don't try to free your zval manually zval *myval; ALLOC_ZVAL(myval); INIT_PZVAL(myval); zval *myval; ALLOC_INIT_ZVAL(myval); zval_ptr_dtor(&myval);
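
The create / add a reference / destroy cycle described above can be modelled in a few lines of plain C. This is a toy model under stated assumptions, not engine code: `rc_val`, `rc_new()`, `rc_addref()`, `rc_ptr_dtor()` and `rc_live_count` are hypothetical names, and unlike the engine's destructor this sketch also sets the caller's pointer to NULL to make misuse visible.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy refcounted value: NOT the engine's zval, just the same idea. */
typedef struct rc_val {
    long value;
    unsigned int refcount;
} rc_val;

/* Number of toy values currently alive, so leaks can be observed. */
int rc_live_count = 0;

/* Allocation + gc initialization in one step (refcount starts at 1). */
rc_val *rc_new(long value)
{
    rc_val *v = malloc(sizeof(*v));
    if (v == NULL) {
        return NULL;
    }
    v->value = value;
    v->refcount = 1;
    rc_live_count++;
    return v;
}

/* A new owner starts using the value. */
void rc_addref(rc_val *v)
{
    assert(v != NULL);
    v->refcount++;
}

/* An owner is done: decrement, free when the count reaches zero, and
 * clear the caller's pointer so it cannot be reused by accident. */
void rc_ptr_dtor(rc_val **pv)
{
    assert(pv != NULL && *pv != NULL && (*pv)->refcount > 0);
    if (--(*pv)->refcount == 0) {
        free(*pv);
        rc_live_count--;
    }
    *pv = NULL;
}
```

Forgetting one `rc_ptr_dtor()` leaves `rc_live_count` above zero (a leak), and calling it once too often touches freed memory, which is the crash scenario the slides warn about.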
  33. 33. PHP functions
  34. 34. Welcome PHP function typedef struct _zend_function_entry { const char *fname; void (*handler)(INTERNAL_FUNCTION_PARAMETERS); const struct _zend_arg_info *arg_info; zend_uint num_args; zend_uint flags; } zend_function_entry;
  35. 35. PHP functions  Each extension may register a zend_function_entry array zend_module_entry extworkshop_module_entry = { STANDARD_MODULE_HEADER, "extworkshop", extworkshop_functions, PHP_MINIT(extworkshop), PHP_MSHUTDOWN(extworkshop), PHP_RINIT(extworkshop), PHP_RSHUTDOWN(extworkshop), PHP_MINFO(extworkshop), ... ... static zend_function_entry extworkshop_functions[] = { PHP_FE(my_function, NULL) PHP_FE_END };
  36. 36. PHP functions declaration  Two macros :  PHP_FE() (php function entry)  PHP_FUNCTION() static zend_function_entry extworkshop_functions[] = { PHP_FE(my_function, NULL) PHP_FE_END }; PHP_FUNCTION(my_function) { /* do something */ }  Macro form : PHP_FE(function_name, function_arginfo)  PHP_FUNCTION(my_function) expands to void zif_my_function(int ht, zval *return_value, zval **return_value_ptr, zval *this_ptr, int return_value_used, void ***tsrm_ls)  PHP_FE_END expands to { ((void *)0), ((void *)0), ((void *)0), 0, 0 }
  37. 37. Function exercise  Declare two new functions  celsius_to_fahrenheit  fahrenheit_to_celsius  They should just be empty for the moment  Confirm all works
  38. 38. Functions deeper  Recall the signature of a function :  ht: number of arguments passed to the function  return_value: zval to feed with your data  Allocation is already done by the engine  return_value_ptr: used for return-by-ref functions  this_ptr: pointer to $this for OO context  return_value_used: set to 0 if the return value is not used (e.g. a call like foo();)  tsrm_ls: thread local storage slot PHP_FE(function_name, function_arginfo) void zif_my_function(int ht, zval *return_value, zval **return_value_ptr, zval *this_ptr, int return_value_used, void ***tsrm_ls)
  39. 39. Functions: accepting arguments  A very nice API exists  Have a look at phpsrc/README.PARAMETER_PARSING_API  zend_parse_parameters() converts arguments to the type you ask  Follows PHP rules  zend_parse_parameters() short : "zpp" zend_parse_parameters(int num_args_to_parse, char* arg_types, (va_arg args...))
  40. 40. Playing with zpp  zpp returns FAILURE or SUCCESS  On failure, you usually return, the engine takes care of the PHP error message  You always use pointers to data in zpp PHP_FUNCTION(foo) { long mylong; if (zend_parse_parameters(ZEND_NUM_ARGS(), "l", &mylong) == FAILURE) { return; } RETVAL_LONG(mylong); }
  41. 41. zpp formats  a - array (zval*)  A - array or object (zval*)  b - boolean (zend_bool)  C - class (zend_class_entry*)  d - double (double)  f - function or array containing php method call info (returned as zend_fcall_info and zend_fcall_info_cache)  h - array (returned as HashTable*)  H - array or HASH_OF(object) (returned as HashTable*)  l - long (long)  L - long, limits out-of-range numbers to LONG_MAX/LONG_MIN (long)  o - object of any type (zval*)  O - object of specific type given by class entry (zval*, zend_class_entry)  p - valid path (string without null bytes in the middle) and its length (char*, int)  r - resource (zval*)  s - string (with possible null bytes) and its length (char*, int)  z - the actual zval (zval*)  Z - the actual zval (zval**)  * - variable arguments list (0 or more)  + - variable arguments list (1 or more)
  42. 42. zpp special formats  | - indicates that the remaining parameters are optional, they should be initialized to default values by the extension since they will not be touched by the parsing function if they are not passed to it.  / - use SEPARATE_ZVAL_IF_NOT_REF() on the parameter it follows  ! - the parameter it follows can be of specified type or NULL. If NULL is passed and the output for such type is a pointer, then the output pointer is set to a native NULL pointer. For 'b', 'l' and 'd', an extra argument of type zend_bool* must be passed after the corresponding bool*, long* or double* arguments, respectively. A non-zero value will be written to the zend_bool if and only if a PHP NULL is passed.
  43. 43. zpp examples char *name, *value = NULL, *path = NULL, *domain = NULL; long expires = 0; zend_bool secure = 0, httponly = 0; int name_len, value_len = 0, path_len = 0, domain_len = 0; if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s|slssbb", &name, &name_len, &value, &value_len, &expires, &path, &path_len, &domain, &domain_len, &secure, &httponly) == FAILURE) { return; } /* Gets an object or null, and an array. If null is passed for object, obj will be set to NULL. */ zval *obj; zval *arr; if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "o!a", &obj, &arr) == FAILURE) { return; }
  44. 44. Practice zpp  Make our temperature functions accept an argument and return a real result  Parse the argument  Check the RETVAL_*() macros, they'll help  °C x 9/5 + 32 = °F  (°F - 32) x 5/9 = °C
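
The two formulas above, written as plain C helpers that the PHP functions could call once the argument has been parsed. This is a sketch rather than the workshop's solution: the helpers are standalone C with no engine dependency, and treating temperatures below absolute zero as a programming error is an assumption made for this example.

```c
#include <assert.h>

/* Converts degrees Celsius to degrees Fahrenheit: C x 9/5 + 32. */
double celsius_to_fahrenheit(double celsius)
{
    assert(celsius >= -273.15); /* assumption: reject values below 0 K */
    return celsius * 9.0 / 5.0 + 32.0;
}

/* Converts degrees Fahrenheit to degrees Celsius: (F - 32) x 5/9. */
double fahrenheit_to_celsius(double fahrenheit)
{
    assert(fahrenheit >= -459.67); /* assumption: reject values below 0 K */
    return (fahrenheit - 32.0) * 5.0 / 9.0;
}
```

Using floating point literals (`9.0 / 5.0`) matters here: with integer operands, `9 / 5` would evaluate to 1 and silently give wrong results.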
  45. 45. Writing a test  PHP's got a framework for testing itself and its extensions  Welcome "PHPT"  Learn more about it at http://qa.php.net/write-test.php
  46. 46. make test
  47. 47. Practice  Write some tests for our temperature functions
  48. 48. Generating errors  Two kinds :  Errors  Exceptions  For errors :  php_error_docref() : Sends an error with a docref  php_error() / zend_error() : Sends an error  For exceptions :  zend_throw_exception() : throws an exception php_error(E_WARNING, "The number %lu is too big", myulong); zend_throw_exception_ex(zend_exception_get_default(), 0, "%lu too big", myulong);
  49. 49. Practice errors  Create a function temperature_converter($value, $convert_type)  convert_type can only be 1 or 2  1 = F° to C°  2 = C° to F°  It should output an error if $convert_type is wrong  The function should return a string describing the scenario run  Have a look at php_printf() function to help echo temperature_converter(20, 2); "20 degrees celsius give 68 degrees fahrenheit" echo temperature_converter(20, 8); Warning: convert_type not recognized
  50. 50. A quick word on string formats  Know your libc's printf() formats  http://www.cplusplus.com/reference/cstdio/printf/  Always use the right formats with well sized buffers  Lots of PHP functions use an "extra", internal implementation of libc's printf() called spprintf()  Error messages for example  Read spprintf.c and snprintf.h to know more about PHP specific formats, such as "%Z"  Lots of nice comments in those sources
  51. 51. Function argument declaration  zpp is clever enough to compute needed args  zpp uses ZEND_NUM_ARGS()  it may return FAILURE if number is incorrect PHP_FUNCTION(foo) { long mylong; if (zend_parse_parameters(ZEND_NUM_ARGS(), "l", &mylong) == FAILURE) { return; } } <?php foo(); Warning: foo() expects exactly 1 parameter, 0 given in /tmp/myext.php on line 3
  52. 52. Function args declaration  Try to use Reflection on your temperature functions $> php -dextension=extworkshop.so --rf temperature_converter
  53. 53. Function args declaration  You can help Reflection by telling it about the accepted parameters  For this, you need to declare them all to the engine  The engine can't compute them by itself  Welcome "arginfos"
  54. 54. zend_arg_info typedef struct _zend_arg_info { const char *name; zend_uint name_len; const char *class_name; zend_uint class_name_len; zend_uchar type_hint; zend_uchar pass_by_reference; zend_bool allow_null; zend_bool is_variadic; } zend_arg_info; typedef struct _zend_function_entry { const char *fname; void (*handler)(INTERNAL_FUNCTION_PARAMETERS); const struct _zend_arg_info *arg_info; zend_uint num_args; zend_uint flags; } zend_function_entry;
  55. 55. Function args declaration ZEND_BEGIN_ARG_INFO_EX(arginfo_foo, 0, 0, 1) ZEND_ARG_INFO(0, mylong) ZEND_END_ARG_INFO() static zend_function_entry extworkshop_functions[] = { PHP_FE(foo, arginfo_foo) PHP_FE_END };  Have a look at arginfo macros in zend_API.h
  56. 56. Function args declaration  Add some arginfo to our functions  Try to use Reflection again
  57. 57. HashTables (PHP arrays)
  58. 58. HashTable quickly  A well-known C structure  Lots of ways to implement them in C  Most operations are O(1), with worst case O(n)  http://lxr.linux.no/linux+v3.12.5/include/linux/list.h#L560  Used everywhere, in every serious program  Implementation of PHP arrays  Keys can be numeric type or string type  Values can be any type
  59. 59. HashTables in a picture
  60. 60. Zend HashTables  zend_hash.c / zend_hash.h  HashTable struct  big API  Doubled : one variant for numeric keys, one for string keys  Can store any data, but we'll mainly use them to store zvals  A specific API exists to ease zval storing  HashTables are widely used in PHP  Not only PHP arrays, they are used internally everywhere  gdb functions in .gdbinit to help debugging them
  61. 61. Zend HashTables basics  1 - Allocate  2 - Initialize  3 - Fill-in values  4 - Operate: Retrieve / Compute / count  5 - Unset values (if needed)  6 - Destroy  7 - Free
  62. 62. Zend HashTable API  Hash size is rounded up to the next power of two  If size is exceeded, HashTable will automatically be resized, but at a (low) CPU cost  pHashFunction is not used anymore, give NULL  pDestructor is the destructor function  Will be called on each element when you remove it from the Hash  This is used to manage memory : usually free it  If you use zvals in your Hash, use ZVAL_PTR_DTOR as a destructor int zend_hash_init(HashTable *ht, uint nSize, hash_func_t pHashFunction, dtor_func_t pDestructor, zend_bool persistent);
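
The "rounded up to the next power of two" rule can be sketched in standalone C as follows. `round_up_pow2()` is a hypothetical helper written for this example, not the engine's code. It only models the rounding and the upper clamp; any minimum table size the engine may enforce is not modelled.

```c
#include <assert.h>

/* Returns the smallest power of two that is >= n (and 1 for n == 0).
 * Requests of 2^31 or more are clamped to 2^31 so the shift cannot
 * overflow a 32-bit unsigned int. */
unsigned int round_up_pow2(unsigned int n)
{
    unsigned int size = 1;

    if (n >= 0x80000000u) {
        return 0x80000000u;
    }
    while (size < n) {
        size <<= 1;
    }
    assert((size & (size - 1)) == 0); /* exactly one bit set */
    return size;
}
```

Under this rule, asking for 10 slots yields a table of 16. A power-of-two size lets a hash be reduced to a slot index with a bit mask (`hash & (size - 1)`) instead of a modulo.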
  63. 63. Zend HashTable API example HashTable myht = {0}; zend_hash_init(&myht, 10, NULL, ZVAL_PTR_DTOR, 0); zval *myval = NULL; ALLOC_INIT_ZVAL(myval); ZVAL_STRINGL(myval, "Hello World", sizeof("Hello World") - 1, 1); if (zend_hash_add(&myht, "myvalue", sizeof("myvalue"), &myval, sizeof(zval *), NULL) == FAILURE) { php_error(E_WARNING, "Could not add value to Hash"); } else { php_printf("The hashTable contains %d elements", zend_hash_num_elements(&myht)); }
  64. 64. Zend HT common mistakes  A HashTable is not a zval  PHP_FUNCTION()s mainly manipulate zvals (return_value)  Use, e.g., array_init() to create a zval containing a HT  Access the HT inside a zval using the Z_ARRVAL() macro family  Lots of HashTable functions return the SUCCESS or FAILURE macros, which are 0 and -1, not 1 and 0  A very common mistake is writing if (zend_hash_add(...)) { }  HashTables manipulate pointers to your data  If your data is a pointer (zval *), it will then store a zval **, and give you back a zval **  You usually add to it a pointer address : &my_zval_pointer  A very common error is to forget one '*' or one '&'  If care is not taken, leaks may appear or worse : invalid memory access and dangling pointers  String key length includes the terminating NUL, no "-1" needed
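
Two of the mistakes above are easy to reproduce outside the engine: testing a SUCCESS/FAILURE result as if it were a boolean, and getting the key length wrong. The sketch below is standalone C. `toy_slot`, `toy_slot_add()`, `added_wrong()` and `added_right()` are hypothetical, and the one-slot "table" exists only to have something that can succeed once and then fail.

```c
#include <assert.h>
#include <string.h>

/* Same numeric values as the engine's SUCCESS and FAILURE. */
enum { TOY_SUCCESS = 0, TOY_FAILURE = -1 };

/* Toy one-slot "table": stores a key only if the slot is free. */
typedef struct toy_slot {
    char key[32];
    unsigned int key_len; /* includes the terminating NUL, PHP 5 style */
    int used;
} toy_slot;

int toy_slot_add(toy_slot *slot, const char *key, unsigned int key_len)
{
    assert(slot != NULL && key != NULL);
    if (slot->used || key_len == 0 || key_len > sizeof(slot->key)) {
        return TOY_FAILURE;
    }
    memcpy(slot->key, key, key_len);
    slot->key_len = key_len;
    slot->used = 1;
    return TOY_SUCCESS;
}

/* WRONG: TOY_SUCCESS is 0, so this says "added" only when the add failed. */
int added_wrong(int result) { return result ? 1 : 0; }

/* RIGHT: compare against the named constant. */
int added_right(int result) { return result == TOY_SUCCESS; }
```

Passing `sizeof("myvalue")` gives 8, the seven characters plus the terminating NUL, which is the length convention the last bullet describes for string keys.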
  65. 65. HashTable retrieve API PHP_FUNCTION(foo) { HashTable *myht; zval **data = NULL; if (zend_parse_parameters(ZEND_NUM_ARGS(), "h", &myht) == FAILURE) { return; } if (zend_hash_find(myht, "foo", sizeof("foo"), (void **)&data) == FAILURE) { php_error(E_NOTICE, "Key 'foo' does not exist"); return; } RETVAL_ZVAL(*data, 1, 0); }
  66. 66. HashTable exercise  Create a function that accepts any number of temperature values in an array and converts them to C or F --TEST-- Test temperature converter array --FILE-- <?php $temps = array(68, 77, 78.8); var_dump(multiple_fahrenheit_to_celsius($temps)); ?> --EXPECTF-- array(3) { [0]=> float(20) [1]=> float(25) [2]=> float(26) }
  67. 67. References exercise  Turn multiple_fahrenheit_to_celsius() into an accept-by-reference function --TEST-- Test temperature converter array by-ref --FILE-- <?php $temps = array(68, 77, 78.8); multiple_fahrenheit_to_celsius($temps); var_dump($temps); ?> --EXPECTF-- array(3) { [0]=> float(20) [1]=> float(25) [2]=> float(26) }
  68. 68. References declaration  Any mismatch between arginfo and the argument passed leads to separation (which is bad as it often duplicates memory)
      zval passed to function | arginfo decl. | zval received in function | separated by engine?
      is_ref=0 refcount = 1 | pass_by_ref=0 | is_ref=0 refcount = 2 | NO
      is_ref=1 refcount > 1 | pass_by_ref=0 | is_ref=1 refcount = 1 | YES
      is_ref=0 refcount > 1 | pass_by_ref=0 | is_ref=0 refcount > 1 ++ | NO
      is_ref=0 refcount = 1 | pass_by_ref=1 | is_ref=1 refcount = 2 | YES
      is_ref=1 refcount > 1 | pass_by_ref=1 | is_ref=1 refcount > 1 ++ | NO
      is_ref=0 refcount > 1 | pass_by_ref=1 | is_ref=1 refcount = 2 | YES
  69. 69. Constants  Constants are really easy to use in the engine  You usually register yours in MINIT() phase, use CONST_PERSISTENT (if not, const will be cleared at RSHUTDOWN)  You can read any constant with an easy API PHP_MINIT_FUNCTION(extworkshop) { REGISTER_STRING_CONSTANT("fooconst", "foovalue", CONST_CS | CONST_PERSISTENT); return SUCCESS; } zend_module_entry extworkshop_module_entry = { STANDARD_MODULE_HEADER, "extworkshop", extworkshop_functions, /* Function entries */ PHP_MINIT(extworkshop), /* Module init */ NULL, /* Module shutdown */ ...
  70. 70. Reading constants  zend_get_constant() already duplicates the value, no need to do it manually PHP_FUNCTION(foo) { zval result_const, *result_const_ptr = &result_const; if (zend_get_constant("fooconst", sizeof("fooconst") - 1, &result_const)) { RETURN_ZVAL(result_const_ptr, 0, 0); } php_error(E_NOTICE, "Could not find 'fooconst' constant"); }
  71. 71. Practice constants  Create two constants for temperature_converter() $convert_type argument  TEMP_CONVERTER_TO_CELSIUS  TEMP_CONVERTER_TO_FAHRENHEIT
  72. 72. Customizing phpinfo()  Extensions may provide information to the phpinfo functionality  PHP_MINFO() used  php_info_*() functions  Beware HTML and non-HTML SAPI zend_module_entry extworkshop_module_entry = { /* ... */ PHP_MINFO(extworkshop), /* ... */ }; PHP_MINFO_FUNCTION(extworkshop) { php_info_print_table_start(); php_info_print_table_header(2, "myext support", "enabled"); php_info_print_table_end(); }
  73. 73. Module info practice  Customize your extension phpinfo() output
  74. 74. Playing with INI settings
  75. 75. INI general concepts  Each extension may register as many INI settings as it wants  Remember INI entries may change during request lifetime  They store both their original value and their modified (if any) value  They store an access level to declare how their value can be altered (PHP_INI_USER, PHP_INI_SYSTEM, etc...)  PHP's ini_set() modifies the entry value at runtime  PHP's ini_restore() restores the original value as current value  INI entries may be displayed (mainly using phpinfo()), they embed a "displayer" function pointer  INI entries are attached to an extension
  76. 76. An INI entry in PHP struct _zend_ini_entry { int module_number; int modifiable; char *name; uint name_length; ZEND_INI_MH((*on_modify)); void *mh_arg1; void *mh_arg2; void *mh_arg3; char *value; uint value_length; char *orig_value; uint orig_value_length; int orig_modifiable; int modified; void (*displayer)(zend_ini_entry *ini_entry, int type); };
  77. 77. INI entries main  Register at MINIT  Unregister at MSHUTDOWN  Display in phpinfo() (usually)  Many MACROS (once more)  Read the original or the modified value (your choice) in your extension  Create your own modifier/displayer (if needed)
  78. 78. My first INI entry  This declares a zend_ini_entry vector  Register / Unregister it  Display it in phpinfo PHP_INI_BEGIN() PHP_INI_ENTRY("logger.default_file", LOGGER_DEFAULT_LOG_FILE, PHP_INI_ALL, NULL) PHP_INI_END() #define PHP_INI_ENTRY(name, default_value, modifiable, on_modify) PHP_MINIT_FUNCTION(myext) { REGISTER_INI_ENTRIES(); ... ... } PHP_MSHUTDOWN_FUNCTION(myext) { UNREGISTER_INI_ENTRIES(); ... ... } PHP_MINFO_FUNCTION(myext) { DISPLAY_INI_ENTRIES(); ... ... }
  79. 79. Using an INI entry  To read your entry, use one of the MACROs  Same way to read the original value : INI_STR(entry); INI_FLT(entry); INI_INT(entry); INI_BOOL(entry); INI_ORIG_STR(entry); INI_ORIG_FLT(entry); INI_ORIG_INT(entry); INI_ORIG_BOOL(entry);
  80. 80. Modifying an INI entry  INI entries may have a "modifier" attached  A function pointer used to check the new value and to validate it  For example, for bools, users may only provide 1 or 0, nothing else  Many modifiers/validators already exist : OnUpdateBool OnUpdateLong OnUpdateLongGEZero OnUpdateReal OnUpdateString OnUpdateStringUnempty  You may create your own modifier/validator
  81. 81. Using a modifier  The modifier should return FAILURE or SUCCESS  The engine takes care of everything  Access control, error message, writing to the entry... PHP_INI_BEGIN() PHP_INI_ENTRY("logger.default_file", LOGGER_DEFAULT_LOG_FILE, PHP_INI_ALL, OnUpdateStringUnempty) PHP_INI_END() ZEND_API ZEND_INI_MH(OnUpdateStringUnempty) { char **p; char *base = (char *) mh_arg2; if (new_value && !new_value[0]) { return FAILURE; } p = (char **) (base+(size_t) mh_arg1); *p = new_value; return SUCCESS; }
  82. 82. Linking INI entry to a global  If you use your entry often by accessing it, you will trigger a hash lookup every time  This is not nice for performance  Why not have a global of yours change when the INI entry is changed (by the PHP user likely)?  Please, welcome "modifiers linkers"
  83. 83. Linking INI entry to a global  Declare a global struct, and tell the engine which field it must update when your INI entry gets updated typedef struct myglobals { char *my_path; void *some_foo; void *some_bar; } myglobals; static myglobals my_globals; /* should be thread protected */ PHP_INI_BEGIN() STD_PHP_INI_ENTRY("logger.default_file", LOGGER_DEFAULT_LOG_FILE, PHP_INI_ALL, OnUpdateStringUnempty, my_path, myglobals, my_globals) PHP_INI_END()
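
The linking mechanism boils down to "base address of the globals struct + offset of one field", which is also what the OnUpdateStringUnempty code shown earlier does with mh_arg1 and mh_arg2. Below is a standalone sketch of that technique using `offsetof()`. All `toy_*` names are hypothetical and the real INI structures carry much more than this.

```c
#include <assert.h>
#include <stddef.h>

/* Toy globals struct, shaped like the slide's myglobals. */
typedef struct toy_globals {
    const char *my_path;
    void *some_foo;
} toy_globals;

/* Toy INI entry: remembers where its value must be mirrored. */
typedef struct toy_ini_entry {
    const char *value;
    size_t field_offset; /* offset of the target field in the struct */
    void *base;          /* address of the globals struct */
} toy_ini_entry;

/* Toy modifier: rejects empty strings, then writes the new value both
 * into the entry and into the linked struct field (base + offset). */
int toy_on_update_string_unempty(toy_ini_entry *entry, const char *new_value)
{
    const char **field;

    assert(entry != NULL && entry->base != NULL);
    if (new_value != NULL && new_value[0] == '\0') {
        return -1; /* FAILURE: the previous value is kept */
    }
    field = (const char **)((char *)entry->base + entry->field_offset);
    *field = new_value;
    entry->value = new_value;
    return 0; /* SUCCESS */
}
```

After a successful update, code that needs the setting reads `globals.my_path` directly, a plain memory access, instead of looking the entry up by name.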
  84. 84. Classes and objects
  85. 85. Classes and objects  More complex than functions as more structures are involved  zend_class_entry  Represents a class  zend_object  Represents an object  zend_object_handle (int)  Represents an object unique identifier to fetch the zend_object back from the object store  zend_object_handlers  Function pointers to specific object actions (lots of them)  zend_object_store  Big global single object repository storing every known object
  86. 86. All starts with a class  A very big structure : zend_class_entry  Lots of macros to help managing classes  Internal classes need to be registered at MINIT()  Internal classes are not destroyed at the end of the request (user classes are)  An interface is a (special) class  A trait is a (special) class  Once a class is registered into the engine, you may create as many objects as you want with low memory footprint
  87. 87. Registering a new class  zend_register_internal_class()  Takes a zend_class_entry* as model  Initialize internal class members  Registers the class into the engine  Returns a new pointer to this freshly added class zend_class_entry *ce_Logger; PHP_MINIT_FUNCTION(myext) { zend_class_entry ce; INIT_CLASS_ENTRY(ce, "Logger", NULL); ce_Logger = zend_register_internal_class(&ce TSRMLS_CC); return SUCCESS; }
  88. 88. Registering a new class  Usually the class pointer is shared into a global variable  This one should be exported in a header file  This allows other extensions to use/redefine our class zend_class_entry *ce_Logger; PHP_MINIT_FUNCTION(myext) { zend_class_entry ce; INIT_CLASS_ENTRY(ce, "Logger", NULL); ce_Logger = zend_register_internal_class(&ce TSRMLS_CC); return SUCCESS; }
  89. 89. Other class noticeable items  zend_class_entry also manages  Static attributes (zvals)  Constants (zvals)  Functions (object methods and class static methods)  Interfaces (zend_class_entry as well)  Class inheritance tree  Used traits (zend_class_entry again)  Other stuff such as handlers  We'll see how to take care of such items later on
  90. 90. An example logger class  We will design something like that :  Now : register the Logger class <?php try { $log = new Logger('/tmp/mylog.log'); } catch (LoggerException $e) { printf("Woops, could not create object : %s", $e->getMessage()); } $log->log(Logger::DEBUG, "My debug message");
  91. 91. class constants  zend_declare_class_constant_<type>()  Will use ce->constants_table HashTable #define LOG_INFO 2 PHP_MINIT_FUNCTION(myext) { zend_class_entry ce; INIT_CLASS_ENTRY(ce, "Logger", NULL); ce_Logger = zend_register_internal_class(&ce); zend_declare_class_constant_long(ce_Logger, ZEND_STRL("INFO"), LOG_INFO); return SUCCESS; }
  92. 92. class/object attributes  zend_declare_property_<type>()  Can declare both static and non static attr.  Can declare any visibility, only type matters PHP_MINIT_FUNCTION(myext) { zend_class_entry ce; INIT_CLASS_ENTRY(ce, "Logger", NULL); ce_Logger = zend_register_internal_class(&ce); zend_declare_property_string(ce_Logger, ZEND_STRL("file"), "", ZEND_ACC_PROTECTED); return SUCCESS; }
  93. 93. Practice creating a class  Create the logger class  With 3 constants : INFO, DEBUG, ERROR  With 2 properties  handle : private , null  file : protected , string  You may declare a namespaced class  Use INIT_NS_CLASS_ENTRY for this
  94. 94. Adding methods  Methods are just functions attached to a class  Very similar to PHP_FUNCTION ZEND_BEGIN_ARG_INFO(arginfo_logger___construct, 0) ZEND_ARG_INFO(0, value) ZEND_END_ARG_INFO() static zend_function_entry logger_class_functions[] = { PHP_ME( Logger, __construct, arginfo_logger___construct, ZEND_ACC_PUBLIC|ZEND_ACC_CTOR ) PHP_FE_END }; PHP_METHOD( Logger, __construct ) { /* some code here */ } PHP_MINIT_FUNCTION(myext) { zend_class_entry ce; INIT_CLASS_ENTRY(ce, "Logger", logger_class_functions); /* ... */ }
  95. 95. Visibility modifier  One may use  ZEND_ACC_PROTECTED  ZEND_ACC_PUBLIC  ZEND_ACC_PRIVATE  ZEND_ACC_FINAL  ZEND_ACC_ABSTRACT  ZEND_ACC_STATIC  Usually, the other flags (like ZEND_ACC_INTERFACE) are set by the engine when you call proper functions  ZEND_ACC_CTOR/DTOR/CLONE are used by reflection only
  96. 96. Exercise methods  Add  __construct(string $file)  log(long $level, string $message)
  97. 97. Designing and using interfaces  An interface is a zend_class_entry with special flags and abstract methods only  zend_register_internal_interface() is used  It simply sets ZEND_ACC_INTERFACE on the zend_class_entry structure  zend_class_implements() is then used to implement the interface
  98. 98. Practice : add an interface  Detach the log() method into an interface and implement it
  99. 99. Exceptions  Use zend_throw_exception() to throw an Exception  Passing NULL as class_entry will use default Exception  You may want to register and use your own Exceptions  Just create your exception class  Make it extend a base Exception class  Use zend_register_internal_class_ex() for that zend_throw_exception(my_exception_ce, "An error occurred", 0); INIT_CLASS_ENTRY(ce_exception, "LoggerException", NULL); ce_Logger_ex = zend_register_internal_class_ex(&ce_exception, zend_exception_get_default(), NULL);
  100. 100. Practice  Write real code for our Logger class  You may need to update properties  zend_update_property_<type>()  You may need to access $this in your methods  use getThis() macro or this_ptr from func args  You may need to use php streams  If so, try getting used to their API by yourself
  101. 101. Playing with globals
  102. 102. Globals ?  They are sometimes needed  Every program needs some kind of global state  Try however to prevent their usage when possible  Use reentrancy instead
  103. 103. PHP globals problem  If the environment is threaded, many (every?) global access should be protected  Basically using some kind of locking technology  PHP can be compiled with Zend Thread Safety (ZTS) or not (NZTS)  Globals access in ZTS mode needs to be mutexed  Globals access in NZTS mode doesn't need such protection  So accessing globals will differ according to ZTS or not  Thus we'll use macros for such tasks
  104. 104. Declaring globals  Declare the structure (usually in .h)  Declare a variable holding the structure ZEND_BEGIN_MODULE_GLOBALS(extworkshop) char *my_string; ZEND_END_MODULE_GLOBALS(extworkshop) ZEND_DECLARE_MODULE_GLOBALS(extworkshop)
  105. 105. Initializing globals  Most likely, your globals will need to be initialized (most likely, to 0).  There exist two hooks for that  Right before MINIT : GINIT  Right after MSHUTDOWN : GSHUTDOWN  Remember that globals are global (...)  Nothing to do with request lifetime
  106. 106. Globals hook zend_module_entry extworkshop_module_entry = { STANDARD_MODULE_HEADER, "extworkshop", extworkshop_functions, PHP_MINIT(extworkshop), PHP_MSHUTDOWN(extworkshop), PHP_RINIT(extworkshop), PHP_RSHUTDOWN(extworkshop), PHP_MINFO(extworkshop), "0.1", PHP_MODULE_GLOBALS(extworkshop), PHP_GINIT(extworkshop), PHP_GSHUTDOWN(extworkshop), NULL, STANDARD_MODULE_PROPERTIES_EX };
  107. 107. GINIT() PHP_GINIT_FUNCTION(extworkshop) { memset(extworkshop_globals, 0, sizeof(*extworkshop_globals)); } /* PHP_GINIT_FUNCTION(extworkshop) expands to this signature: */ void zm_globals_ctor_extworkshop(zend_extworkshop_globals *extworkshop_globals )
  108. 108. Accessing globals  A macro is present in your .h for that  YOUR-EXTENSION-NAME_G(global_value_to_fetch) #ifdef ZTS #define EXTWORKSHOP_G(v) TSRMG(extworkshop_globals_id, zend_extworkshop_globals *, v) #else #define EXTWORKSHOP_G(v) (extworkshop_globals.v) #endif if (EXTWORKSHOP_G(my_string)) { ... }
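The accessor macro idea can be tried outside PHP. What follows is a minimal standalone sketch, not engine code: `g_globals`, `G_DEMO()`, `G_DEMO_THREADED` and the helper functions are hypothetical names, and the threaded branch uses a plain per-thread pointer as a stand-in for what TSRM does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical globals structure, standing in for zend_extworkshop_globals. */
typedef struct {
    const char *my_string;
    long        counter;
} g_globals;

#ifdef G_DEMO_THREADED
/* Threaded build (assumption): each thread reaches its own copy through a
 * pointer that must be set up before the first access. */
static _Thread_local g_globals *g_globals_ptr;
#define G_DEMO(v) (g_globals_ptr->v)
#else
/* Non-threaded build: one plain static structure, accessed directly. */
static g_globals g_globals_storage;
#define G_DEMO(v) (g_globals_storage.v)
#endif

/* Plays the role of GINIT: zero the whole structure once. */
static void g_globals_init(g_globals *g)
{
    memset(g, 0, sizeof(*g));
}

/* Calling code only ever goes through G_DEMO(), whatever the build mode. */
static long g_bump_counter(void)
{
    return ++G_DEMO(counter);
}
```

The point of the design is that `g_bump_counter()` never changes: only the macro definition knows how the structure is reached.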
  109. 109. Globals : practice  Move our resource_id to a protected global  Our extension is now Thread Safe !
  110. 110. Changing the engine behavior
  111. 111. Danger Zone  Here come weird crashes
  112. 112. Changing the engine behavior  There are many ways to change what exists  As soon as elements are pointers  They are usually fetchable from a global table  Many global tables in PHP  EG() and CG() are the most commonly used  CG() compiler_globals and EG() executor globals :  global function table  global class table  global/local variables table  ...
  113. 113. A quick tour on the engine
  114. 114. Zend Virtual Machine  [Diagram: PHP script (bytes) → Lexer → Tokens → Parser → Nodes → Compiler → OPCodes → Zend VM Executor → Result (bytes). Lexer, Parser and Compiler form the Zend VM Compiler]
  115. 115. The BIG steps  The engine compiles a PHP file  This gives an OPArray as result  The engine launches the OPArray into the executor  The executor executes each instruction  If a user function is called  The engine executes an OPCode to launch the func  The user func has been compiled as an OPArray  The engine creates a new stack frame with this function's OPArray  The engine now executes the new frame  When finished, the engine switches back to the previous frame
  116. 116. Zend Virtual Machine  The Zend VM Compiler is not hookable  You can't change PHP syntax from an extension  The Zend VM Executor is hookable  You can alter the instructions being run  You can alter the path the executor runs through  Zend ext have more power than PHP ext here  We will only talk about Zend VM executor in the future
  117. 117. OPCodes  Introducing OPCodes :  In computer science, an opcode (operation code) is the portion of a machine language instruction that specifies the operation to be performed. (Wikipedia)  Opcodes can also be found in so called byte codes and other representations intended for a software interpreter rather than a hardware device. (Wikipedia)
  118. 118. zend_op structure
  119. 119. OPCode example <?php const DEF = "default"; if (isset($argv[1])) { $b = (array)$argv[1]; } else { $b = DEF; } var_dump($b);
  121. 121. OPCodes An OPCode is one VM instruction designed to do just one little thing  It can use up to two operands  It defines an operation to do with them  It produces one result (which can be NULL)  It may make use of an additionnal "extended_value"  Think about a calculator  Two operands, one operation, one result, one carry
  122. 122. Zend VM OPCodes  Open Zend/zend_vm_opcodes.h  Number of OPCodes :  PHP 5.3 : 134  PHP 5.4 : 139  PHP 5.5 : 144  PHP 5.6 : 148 #define ZEND_NOP 0 #define ZEND_ADD 1 #define ZEND_SUB 2 #define ZEND_MUL 3 #define ZEND_DIV 4 #define ZEND_MOD 5 #define ZEND_SL 6 #define ZEND_SR 7 ...
  123. 123. OPArray  An OPArray is a structure holding a series of OPCodes to be run  (and many more things)  Each PHP file is parsed and results in an OPArray  Each PHP user function is parsed, and results in an OPArray  When a user function is called, the engine pushes its OPArray onto the executor for it to be run
  124. 124. Zend VM Executor loop  The executor is a giant loop that is given an OPArray, extracts its OPCodes, and runs them one after the other, infinitely  Fortunately, the compiler always generates a ZEND_RETURN OPCode at the end, instructing the executor loop to return
  125. 125. Zend VM Executor loop ... execute_data = i_create_execute_data_from_op_array(EG(active_op_array), 1 TSRMLS_CC); while (1) { int ret; if ((ret = execute_data->opline->handler(execute_data TSRMLS_CC)) > 0) { switch (ret) { case 1: EG(in_execution) = original_in_execution; return; case 2: goto zend_vm_enter; break; case 3: execute_data = EG(current_execute_data); break; default: break; } } } zend_error_noreturn(E_ERROR, "Arrived at end of main loop which shouldn't happen");
  126. 126. Zend VM Executor loop  Every OPCode is responsible for stepping forward the OPLine  OPLine is a pointer to the OPCode currently being run in the OPArray the VM is executing. static int ZEND_FASTCALL ZEND_ADD_HANDLER(ZEND_OPCODE_HANDLER_ARGS) { zend_op *opline = EX(opline); fast_add_function(&EX_T(opline->result.var).tmp_var, opline->op1.zv, opline->op2.zv TSRMLS_CC); CHECK_EXCEPTION(); ZEND_VM_NEXT_OPCODE(); } #define ZEND_VM_NEXT_OPCODE() CHECK_SYMBOL_TABLES() ZEND_VM_INC_OPCODE(); ZEND_VM_CONTINUE() #define ZEND_VM_INC_OPCODE() execute_data->opline++ #define ZEND_VM_CONTINUE() return 0
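The loop and the "each handler advances the opline" rule can be reproduced in a few lines of plain C. This is a deliberately reduced model, assuming a single accumulator instead of real operands; `vm_op`, `vm_frame`, `vm_execute()` and the handlers are hypothetical names, not engine symbols.

```c
#include <stddef.h>

struct vm_frame;

/* A handler returns 0 to continue with the current opline,
 * or a positive value to make the loop return. */
typedef int (*vm_handler)(struct vm_frame *frame);

typedef struct {
    vm_handler handler;
    long       operand;
} vm_op;

typedef struct vm_frame {
    const vm_op *opline;   /* instruction currently being run */
    long         acc;      /* single accumulator acting as the result */
} vm_frame;

static int vm_add_handler(vm_frame *frame)
{
    frame->acc += frame->opline->operand;
    frame->opline++;       /* the handler itself steps forward */
    return 0;
}

static int vm_return_handler(vm_frame *frame)
{
    (void)frame;
    return 1;              /* tells the loop to stop */
}

/* The loop never ends by itself: the program must end with a return op. */
static long vm_execute(const vm_op *ops)
{
    vm_frame frame = { ops, 0 };

    while (1) {
        if (frame.opline->handler(&frame) > 0) {
            return frame.acc;
        }
    }
}
```

A program made of two additions followed by the return op, for example `{ vm_add_handler, 2 }, { vm_add_handler, 40 }, { vm_return_handler, 0 }`, leaves the loop through the last handler only.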
  127. 127. Zend VM Executor loop  Some OPCodes however can alter the execution flow, for example ZEND_JMPZ, generated for conditional jumps (if, while, try, switch, etc.) static int ZEND_FASTCALL ZEND_JMPZ_HANDLER(ZEND_OPCODE_HANDLER_ARGS) { zend_op *opline = EX(opline); zval *val = opline->op1.zv; int ret; ret = i_zend_is_true(val); if (UNEXPECTED(EG(exception) != NULL)) { HANDLE_EXCEPTION(); } if (!ret) { ZEND_VM_SET_OPCODE(opline->op2.jmp_addr); ZEND_VM_CONTINUE(); } ZEND_VM_NEXT_OPCODE(); }
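A jump handler differs from the others only in how it moves the opline. The sketch below models an `if / else` as five toy instructions; it is an illustration under simplifying assumptions (jump targets are array indexes, the condition is a plain int), and every `jmp_*` name is hypothetical.

```c
typedef struct jmp_frame jmp_frame;
typedef int (*jmp_handler)(jmp_frame *frame);

typedef struct {
    jmp_handler handler;
    int         arg;       /* jump target index, or value to assign */
} jmp_op;

struct jmp_frame {
    const jmp_op *ops;     /* start of the program */
    const jmp_op *opline;  /* current instruction */
    int           cond;    /* value tested by the conditional jump */
    int           result;
};

/* Jump if the condition is false, otherwise fall through. */
static int jmp_jmpz_handler(jmp_frame *f)
{
    if (!f->cond) {
        f->opline = f->ops + f->opline->arg;
        return 0;
    }
    f->opline++;
    return 0;
}

/* Unconditional jump. */
static int jmp_jmp_handler(jmp_frame *f)
{
    f->opline = f->ops + f->opline->arg;
    return 0;
}

static int jmp_assign_handler(jmp_frame *f)
{
    f->result = f->opline->arg;
    f->opline++;
    return 0;
}

static int jmp_return_handler(jmp_frame *f)
{
    (void)f;
    return 1;
}

/* Models: if (cond) { result = 1; } else { result = 2; } */
static int jmp_run(int cond)
{
    static const jmp_op program[] = {
        { jmp_jmpz_handler,   3 },  /* 0: if (!cond) goto 3 */
        { jmp_assign_handler, 1 },  /* 1: result = 1        */
        { jmp_jmp_handler,    4 },  /* 2: goto 4            */
        { jmp_assign_handler, 2 },  /* 3: result = 2        */
        { jmp_return_handler, 0 },  /* 4: return            */
    };
    jmp_frame frame = { program, program, cond, 0 };

    while (frame.opline->handler(&frame) == 0) {
        /* each handler has already moved the opline */
    }
    return frame.result;
}
```

Note that the "if" branch needs its own unconditional jump to skip the "else" branch, which is why one conditional statement produces more than one jump instruction.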
  128. 128. Zend VM Executor example  Let's play a demo together, stepping into the executor  Let's use a very simple code <?php const SEPARATOR = "_"; if (isset($argv[1])) { echo $argv[1] . SEPARATOR . "bar"; } else { echo "I'm stuck :p"; }
  129. 129. Changing the engine behavior  You may overwrite the main executor routines  zend_execute() / zend_execute_internal() static void (*old_zend_execute)(zend_op_array *op_array TSRMLS_DC); static void (*old_zend_execute_internal)(zend_execute_data *execute_data_ptr, int return_value_used TSRMLS_DC); static void fooext_zend_execute(zend_op_array *op_array TSRMLS_DC) { do_some_stuff(op_array); old_zend_execute(op_array TSRMLS_CC); do_some_more_stuff(op_array); } /* fooext_zend_execute_internal() is written the same way */ PHP_RINIT_FUNCTION(fooext) { old_zend_execute = zend_execute; old_zend_execute_internal = zend_execute_internal; zend_execute = fooext_zend_execute; zend_execute_internal = fooext_zend_execute_internal; return SUCCESS; }
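The save / replace / call-through pattern used above works for any function pointer. Here is a standalone sketch of it, assuming an int stands in for the op_array; `hook_execute`, `hook_install()` and the other `hook_*` names are hypothetical and only mimic the shape of the engine's pointers.

```c
#include <stddef.h>

/* Stand-in for the engine's default executor routine. */
static int hook_default_execute(int op_array)
{
    return op_array * 2;
}

/* The replaceable entry point, like the engine's function pointer. */
static int (*hook_execute)(int op_array) = hook_default_execute;

/* Where the extension remembers the previous routine. */
static int (*hook_old_execute)(int op_array) = NULL;
static int hook_calls = 0;

/* Wrapper: do something, then delegate to whatever was installed before. */
static int hook_my_execute(int op_array)
{
    hook_calls++;
    return hook_old_execute(op_array);
}

static void hook_install(void)
{
    hook_old_execute = hook_execute;
    hook_execute     = hook_my_execute;
}

/* Put the previous routine back so nothing points at our code afterwards. */
static void hook_remove(void)
{
    if (hook_old_execute != NULL) {
        hook_execute     = hook_old_execute;
        hook_old_execute = NULL;
    }
}
```

Saving whatever pointer is currently installed, rather than assuming the default one, is what lets several extensions chain their hooks.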
  130. 130. Changing an opcode handler  Since 5.1, PHP extensions may change opcode handlers individually  With no/little performance penalty  You'll need to understand the original handler before overwriting it  Changing opcode handlers changes PHP's scripting engine behavior (executor)  Warning : many changes in here through PHP versions (especially 5.3 - 5.4 - 5.5)
  131. 131. Logging each object creation  Let's change ZEND_NEW handler  Let's log each object creation  Let's use one instance of our "Logger" for that  Strategy :  Declare our handler  Warning : handlers should only be declared BEFORE compilation pass, aka, in RINIT  Execute our handler  Execute original zend executor's handler  Restore the original handler  Warning: handlers should be restored when the executor is shut down, this can only be achieved in post_deactivate() handler
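The strategy above (run our handler, then the original one, then restore) can be modelled with two small tables. This is a simplified standalone sketch, not the engine's implementation: all `tbl_*` names are hypothetical. In the engine itself, the comparable entry points are zend_set_user_opcode_handler() and the ZEND_USER_OPCODE_DISPATCH return value.

```c
#include <stddef.h>

enum { TBL_OP_NEW = 0, TBL_OP_COUNT = 1 };

/* Returned by a user handler to mean "now run the original handler". */
#define TBL_DISPATCH 2

typedef int (*tbl_handler)(const char *class_name);

static int tbl_objects_created = 0;
static int tbl_objects_logged  = 0;

/* Stand-in for the original "new" handler. */
static int tbl_original_new(const char *class_name)
{
    (void)class_name;
    tbl_objects_created++;
    return 0;
}

static tbl_handler tbl_original[TBL_OP_COUNT] = { tbl_original_new };
static tbl_handler tbl_user[TBL_OP_COUNT]     = { NULL };

/* Our handler: record the creation, then let the original one run. */
static int tbl_logging_new(const char *class_name)
{
    (void)class_name;   /* a real handler would write this name to the log */
    tbl_objects_logged++;
    return TBL_DISPATCH;
}

/* Dispatch: the user handler goes first when one is installed. */
static int tbl_run(int opcode, const char *class_name)
{
    if (tbl_user[opcode] != NULL) {
        int ret = tbl_user[opcode](class_name);
        if (ret != TBL_DISPATCH) {
            return ret;
        }
    }
    return tbl_original[opcode](class_name);
}
```

Installing is `tbl_user[TBL_OP_NEW] = tbl_logging_new;` and restoring is setting the slot back to NULL, after which dispatch goes straight to the original handler again.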
  132. 132. Need help ?  Read each other's source code / extensions  http://pecl.php.net  http://github.com  http://lxr.php.net  Find this workshop's source code  https://github.com/jpauli/PHP_Extension_Workshop  Read http://www.phpinternalsbook.com  Ask for help on irc #php.pecl  Open the source and study it by yourself  Attend conferences about internals
  133. 133. Thank you
