Hello everybody
Julien PAULI
Programming with PHP since early 2000s
Programming in C
PHP Internals programmer/reviewer
PHP 5.5 & 5.6 Release Manager
@julienpauli
Tech blog at jpauli.github.io
jpauli@php.net
What about you?
Do you have some Unix/Linux background?
Have you already programmed in C?
Have you already programmed in PHP?
What we'll cover together
Memory: what is it?
bytes, stack, heap, etc.
Measuring a process's memory consumption
memory image map analysis
Understanding PHP memory consumption
Zend Memory Manager coming
Measuring PHP memory consumption
from PHP land or from system land
Memory in a user process
The Virtual Memory image is divided in segments
text
data
heap
stack
Memory usage can grow
The stack grows as functions get called
and shrinks as the calls return
The heap grows when the programmer decides so,
using dynamic allocation functions (malloc, mmap)
The programmer has to free that memory by hand
If not: memory leak
Linux memory monitoring
'top' or /proc FS
> cat /proc/28754/status
VmPeak: 20452 kB
VmSize: 20324 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 316 kB
VmRSS: 316 kB
VmData: 16440 kB
VmStk: 136 kB
VmExe: 4 kB
VmLib: 1664 kB
VmPTE: 28 kB
VmSwap: 0 kB
VmSize: size of the VM map (out of total mem)
VmRSS: Resident Set Size, the size actually in physical memory
VmData: size of the data segment in VM
VmStk: size of the stack segment in VM
VmExe: size of the text segment in VM
28754: the pid
Going even deeper
Let's show the detailed process memory map :
> cat /proc/28754/smaps
(smaps output, annotated on the slide: shared segment, private mem, shared mem)
PHP, just a process
PHP is just a process like any other
You can then monitor its memory usage by asking
the OS
<?php
passthru(sprintf('cat /proc/%d/status', getmypid()));
<?php
function heap() {
return shell_exec(sprintf('grep "VmRSS:" /proc/%d/status', getmypid()));
}
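The same information can be read without spawning a shell. A minimal sketch, assuming a Linux /proc filesystem: parse_vm_fields() is a hypothetical helper (not a PHP built-in) that turns the Vm* lines shown earlier into an array of kilobyte values.

```php
<?php
/**
 * Parse the "Vm*" lines of a /proc/<pid>/status dump into [name => kilobytes].
 * Hypothetical helper for illustration, not part of PHP.
 */
function parse_vm_fields(string $status): array
{
    $fields = [];
    foreach (explode("\n", $status) as $line) {
        // Lines look like "VmRSS:      316 kB"
        if (preg_match('/^(Vm\w+):\s+(\d+)\s+kB$/', trim($line), $m)) {
            $fields[$m[1]] = (int) $m[2];
        }
    }
    return $fields;
}

// On Linux, /proc/self/status describes the current process.
$path = '/proc/self/status';
if (is_readable($path)) {
    $vm = parse_vm_fields(file_get_contents($path));
    printf("RSS: %d kB, virtual size: %d kB\n", $vm['VmRSS'] ?? 0, $vm['VmSize'] ?? 0);
}
```

Reading the file directly avoids the cost of the extra grep process, which would otherwise show up in the measurement.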
The PHP model
The same process may handle many requests
If the process leaks memory, you'll suffer from that
Request n+1 must know nothing about request n
Need to "flush" the request-allocated memory
Need to track request-bound memory claims
ZendMM is the layer that does the job
Share-nothing architecture : by design.
Why a custom layer ?
ZendMM
Allows monitoring of request-bound heap usage by
basically counting malloc/free calls
Allows the PHP user to limit the heap mem usage
Allows caching of allocated blocks to prevent memory
fragmentation and syscalls
Allows preallocating blocks of well-known sizes so that
PHP internal structures fit in an aligned way
Eases memory leak debugging in core and extensions
ZendMM guards and leak tracker
Only works in a debug build of PHP
report_memleaks=1 in php.ini
Does NOT replace valgrind
Zend MM test script
<?php
ini_set('memory_limit', -1); /* unlimited ZendMM heap */
function heap() {
return shell_exec(sprintf('grep "VmRSS:" /proc/%d/status', getmypid()));
}
echo heap();
$a = range(1, 1024*1024); /* Stress memory by allocating */
echo heap();
unset($a); /* Stress memory by freeing */
echo heap();
Zend MM launch test
> time USE_ZEND_ALLOC=1 php zendmm.php
VmRSS: 9640 kB
VmRSS: 159296 kB
VmRSS: 10068 kB
real 0m0.237s
user 0m0.148s
sys 0m0.080s
> time USE_ZEND_ALLOC=0 php zendmm.php
VmRSS: 9608 kB
VmRSS: 148988 kB
VmRSS: 140804 kB
real 0m0.288s
user 0m0.176s
sys 0m0.108s
Valgrind tests for Symfony2 command
app/console runs lots of PHP code under SF2
>USE_ZEND_ALLOC=1 valgrind php app/console debug:container
total heap usage: 84,000 allocs, 84,000 frees, 25,966,154 bytes allocated
>USE_ZEND_ALLOC=0 valgrind php app/console debug:container
total heap usage: 813,570 allocs, 813,570 frees, 74,579,732 bytes allocated
ZendMM benefits
Better memory management
More memory efficient
Far fewer malloc/free calls
Fewer context switches, less kernel stress
Less CPU usage
Less heap fragmentation / compaction
PHP is ~10% faster with ZendMM enabled
Really depends on use-case
Memory manager internals
Layer on top of the heap
Allocates memory from the heap in chunks of
customizable size (segments)
Uses a customizable low-level heap (malloc /
mmap)
A quick word on ZendMM internals
ZEND_MM_SEG_SIZE env variable to customize
segment size
Must be power-of-two aligned
Default value is 256 KB
ZendMM in PHP user land
memory_limit (INI setting)
memory_get_usage(true)
Returns the size of all the allocated segments
memory_get_usage()
Returns the occupied size in all the allocated
segments
memory_get_peak_usage([real])
Returns the maximum memory that has been
allocated/used. It may have been freed in the meantime
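A minimal sketch comparing these counters around one big allocation. The 4 MB payload and the variable names are arbitrary choices for illustration; exact numbers vary with the PHP version.

```php
<?php
// Compare the userland counters around one big allocation.
$size   = 4 * 1024 * 1024;          // 4 MB payload, arbitrary
$before = memory_get_usage();       // occupied bytes inside the segments
$data   = str_repeat('x', $size);
$used   = memory_get_usage();
$real   = memory_get_usage(true);   // size of all the allocated segments
unset($data);                       // last symbol removed, block is released
$after  = memory_get_usage();
$peak   = memory_get_peak_usage();  // highest occupied size reached so far

printf("used: +%d, real: %d, after unset: +%d, peak: %d\n",
    $used - $before, $real, $after - $before, $peak);
```

The peak stays high after unset(), which is what "could have been freed meantime" refers to.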
memory_get_usage() ?
memory_get_usage() tells you how much your
allocated blocks consume
They usually don't fill their segment entirely
Thus memory_get_usage(true) shows more
This doesn't count the stack
This only counts request-bound memory
This doesn't count linked libraries present in the
process memory map
This doesn't show non-request-bound memory
Recommendations / statements
Understand memory_get_usage()
It only shows request-bound allocations, not
persistent allocations (that reside through requests)
PHP extensions may allocate persistent memory
Do NOT activate extensions you will not use
Libraries used by PHP may also allocate persistent
memory
Use your OS to monitor your process memory
accurately
Master your PHP mem usage
In PHP land ...
all variable types consume memory
every script asked for compilation will eat memory
This memory will be allocated using ZendMM
The memory for a parsed script is freed when the
request ends
The memory for a user variable is freed when the
data is no longer used
And here comes the challenge
When is the data no longer needed?
Compilation eats memory
Compiling a script eats request-bound memory
Compiling a class eats a lot of memory
So make sure you actually use that class at runtime
Use an autoloader to be sure
<?php
$mem = memory_get_usage();
require __DIR__ . '/../vendor/autoload.php';
require __DIR__ . '/../src/Symfony/Component/DependencyInjection/ContainerBuilder.php';
echo memory_get_usage() - $mem . "\n";
php app/bar.php
246,692
PHP Variables
What eats memory is what is stored into the zval,
not really the zval itself :
A huge string
Tip : a file_get_contents('huge-file') is a huge string
A complex array or object
Resources don't really consume mem in zval
<?php
/* This consumes sizeof(zval) + 1024 bytes */
$a = str_repeat('a', 1024);
PHP Variables
What you want to avoid is having PHP duplicate the
zval
But PHP is kind about that
What you want is for PHP to free the
memory ASAP
Knowing when PHP duplicates or frees a zval
is what matters most!
zvals and refcount
PHP simply counts how many symbols (PHP
vars) point to a zval
This is called the refcount
<?php
$a = "foo";
$b = $a;
zvals and refcount
PHP uses a CopyOnWrite (COW) system for
zvals
Memory is saved
Memory gets allocated only on changes
<?php
$a = "foo";
$b = $a;
$a = 17;
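Copy-on-write can be observed in numbers. A minimal sketch, assuming a 4 MB string so the effect stands out: cow_demo() is a hypothetical name, and the write here goes through a string offset rather than a plain reassignment.

```php
<?php
function cow_demo(): array
{
    $a  = str_repeat('a', 4 * 1024 * 1024);
    $m0 = memory_get_usage();
    $b  = $a;                 // both symbols share the same string
    $m1 = memory_get_usage();
    $b[0] = 'b';              // first write: the engine separates $b from $a
    $m2 = memory_get_usage();

    return [
        'assign_cost' => $m1 - $m0,   // expected to be close to zero
        'write_cost'  => $m2 - $m1,   // expected to be about the string size
        'a_first'     => $a[0],
        'b_first'     => $b[0],
    ];
}

print_r(cow_demo());
```

The assignment itself is almost free; the memory is only paid when one of the two variables is changed.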
zvals and refcount
PHP frees memory for a zval
when its refcount reaches 0
Yes, unset() just does refcount--,
that's all
<?php
$a = "foo";
$b = $a;
$c = $b;
$b = "bar";
unset($a);
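The effect of unset() on a shared value can be measured. A minimal sketch with a hypothetical unset_demo() function and an arbitrary 4 MB string: the first unset() only drops one symbol, the second one releases the memory.

```php
<?php
function unset_demo(): array
{
    $size = 4 * 1024 * 1024;
    $base = memory_get_usage();

    $a = str_repeat('a', $size);
    $b = $a;                               // two symbols, one string

    unset($a);                             // one symbol left, nothing freed
    $afterFirst = memory_get_usage() - $base;

    unset($b);                             // no symbol left, string released
    $afterSecond = memory_get_usage() - $base;

    return ['after_first_unset' => $afterFirst, 'after_second_unset' => $afterSecond];
}

print_r(unset_demo());
```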
No references needed
You see how smart PHP is with memory ?
It's been designed with that in mind
No references needed to hint PHP !
Don't try to hint PHP with references
References can lead to adverse effects
Force PHP to copy a zval
Prevents PHP from freeing memory of a zval
Tracking refcount
xdebug_debug_zval()
symfony_zval_info()
namespace Foo;
class A
{
public $var = 'varA';
}
$a = new A();
xdebug_debug_zval('a');
a: (refcount=1, is_ref=0)=class Foo\A { public $var =
(refcount=2, is_ref=0)='varA'; }
Tracking refcount
namespace Foo;
class C
{
public $b;
public function __construct(B $b) {
$this->b = $b;
}
}
$c = new C($b = new B);
xdebug_debug_zval('c');
c: (refcount=1, is_ref=0)=class Foo\C { public $b = (refcount=2, is_ref=0)=class Foo\B { } }
namespace Foo;
class B
{
}
unset($b);
xdebug_debug_zval('c');
c: (refcount=1, is_ref=0)=class Foo\C { public $b = (refcount=1, is_ref=0)=class Foo\B { } }
unset($c->b);
xdebug_debug_zval('c');
c: (refcount=1, is_ref=0)=class Foo\C { }
Garbage collector ?
As of PHP 5.3, a garbage collector exists
Used to free circular references
And that's all!
PHP already frees your variables by itself as soon as their
refcount reaches 0
And it's always been like that
Circular references
$a = new A;
$b = new B;
$a -> (object) 'A', refcount = 1
$b -> (object) 'B', refcount = 1
Circular references
Objects are still in memory but no PHP variable
points to them any more
We can call that a "PHP userland memory leak"
unset($a, $b);
$b->a -> (object) 'A', refcount = 1
$a->b -> (object) 'B', refcount = 1
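Such a cycle, and the collector reclaiming it, can be reproduced in a few lines. A minimal sketch: the Node class, the cycle_demo() function and the 2 MB payloads are made up for illustration; gc_enable() and gc_collect_cycles() are the standard PHP functions.

```php
<?php
final class Node
{
    public $peer = null;   // the other object of the pair
    public $payload = '';  // big string so the leak is visible
}

function cycle_demo(): array
{
    gc_enable();
    $size = 2 * 1024 * 1024;
    $base = memory_get_usage();

    $a = new Node;
    $b = new Node;
    $a->payload = str_repeat('a', $size);
    $b->payload = str_repeat('b', $size);
    $a->peer = $b;         // A points to B ...
    $b->peer = $a;         // ... and B points back to A

    unset($a, $b);         // no variable left, yet each object is still referenced
    $leaked = memory_get_usage() - $base;

    $collected = gc_collect_cycles();   // force a collection run
    $after = memory_get_usage() - $base;

    return ['leaked' => $leaked, 'collected' => $collected, 'after' => $after];
}

print_r(cycle_demo());
```

Without the explicit gc_collect_cycles() call, the collector only runs once its internal root buffer fills up, so small cycles may stay around for a while.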
PHP references main line
Using references (&) in PHP can really fool you
They usually force the engine to duplicate memory
containers
Which is bad for performance
Especially when the var container is huge
References mismatch
<?php
function foo($data)
{
echo "in function : " . memory_get_usage() . "\n";
}
echo "Initial memory : " . memory_get_usage() . "\n";
$r = range(1, 1024);
echo "Array created : " . memory_get_usage() . "\n";
foo($r);
echo "End of function " . memory_get_usage() . "\n";
Initial memory : 227.136
Array created : 374.912
in function : 374.944
End of function 374.944
References mismatch
<?php
function foo($data)
{
echo "in function : " . memory_get_usage() . "\n";
}
echo "Initial memory : " . memory_get_usage() . "\n";
$r = range(1, 1024);
$r2 = &$r;
echo "Array created : " . memory_get_usage() . "\n";
foo($r);
echo "End of function " . memory_get_usage() . "\n";
Initial memory : 227.208
Array created : 375.096
in function : 473.584
End of function 375.128
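The two runs above can be folded into one harness to check on your own PHP version. A minimal sketch: usage_inside() and call_cost() are hypothetical names, and the array size is arbitrary. The slide figures come from the PHP 5 engine; to my knowledge PHP 7 reworked zvals, so the reference-bound case may no longer cost a full copy there. Treat the output as something to verify, not as a given.

```php
<?php
/** Returns memory_get_usage() as seen from inside a by-value call. */
function usage_inside(array $data): int
{
    return memory_get_usage();
}

/** Memory delta caused by passing an array by value, with or without a reference bound to it. */
function call_cost(bool $withReference): int
{
    $r = range(1, 100000);
    if ($withReference) {
        $r2 = &$r;              // $r is now part of a reference set
    }
    $before = memory_get_usage();
    return usage_inside($r) - $before;
}

printf("PHP %s\n", PHP_VERSION);
printf("plain variable : +%d bytes\n", call_cost(false));
printf("reference-bound: +%d bytes\n", call_cost(true));
```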
When does the engine separate ?
In any mismatch case :
zval passed to function | arginfo decl. | zval received in function | separated by engine?
is_ref=0, refcount = 1  | pass_by_ref=0 | is_ref=0, refcount = 2    | NO
is_ref=1, refcount > 1  | pass_by_ref=0 | is_ref=1, refcount = 1    | YES
is_ref=0, refcount > 1  | pass_by_ref=0 | is_ref=0, refcount > 1 ++ | NO
is_ref=0, refcount = 1  | pass_by_ref=1 | is_ref=1, refcount = 2    | YES
is_ref=1, refcount > 1  | pass_by_ref=1 | is_ref=1, refcount > 1 ++ | NO
is_ref=0, refcount > 1  | pass_by_ref=1 | is_ref=1, refcount = 2    | YES
Symfony_debug to notice mismatches
https://github.com/symfony/debug
function bar($var) { }
$a = "foo";
$b = &$a;
bar($a);
Notice: Separating zval for call to function 'bar' in /tmp/memory.php on line 20
Foreach separation behavior
Sometimes foreach() duplicates your
variable for iteration
This hurts performance when big or
complex arrays are iterated
There is nothing special to say about objects
implementing Traversable
The behavior is then entirely up to you
foreach iterating array #1
If the variable has a refcount of 1, no duplication is
performed by foreach()
$a = range(1,1024);
echo memory_get_usage() . "\n" ;
foreach ($a as $v) {
if ($v == 1) {
echo memory_get_usage() . "\n" ;
}
}
echo memory_get_usage() . "\n" ;
373936
374056
374056
foreach iterating array #2
If the variable has a refcount >1, foreach() will
duplicate it fully for iteration
$a = range(1,1024);
$b = $a;
echo memory_get_usage() . "\n" ;
foreach ($a as $v) {
if ($v == 1) {
echo memory_get_usage() . "\n" ;
}
}
echo memory_get_usage() . "\n" ;
373936
472512
374056
foreach iterating array #3
If the variable is a reference, foreach() will work
on that array directly and won't perform duplication
$a = range(1,1024);
$b = &$a;
echo memory_get_usage() . "\n" ;
foreach ($a as $v) {
if ($v == 1) {
echo memory_get_usage() . "\n" ;
}
}
echo memory_get_usage() . "\n" ;
373936
374056
374056
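The three foreach cases can be compared side by side. A minimal sketch: foreach_delta() and its case names are made up for illustration. The figures above were measured on PHP 5; as far as I know, later engines changed how foreach handles the iterated array, so the refcount > 1 case may behave differently on your version.

```php
<?php
/**
 * Memory taken between "just before foreach" and "first iteration".
 * $case: 'plain' (refcount 1), 'copy' (refcount > 1) or 'ref' (reference).
 */
function foreach_delta(string $case): int
{
    $a = range(1, 100000);
    if ($case === 'copy') {
        $b = $a;                 // second symbol on the same array
    }
    if ($case === 'ref') {
        $b = &$a;                // reference to the array
    }

    $before = memory_get_usage();
    $inside = $before;
    foreach ($a as $v) {
        $inside = memory_get_usage();   // measured on the first iteration only
        break;
    }
    return $inside - $before;
}

foreach (['plain', 'copy', 'ref'] as $case) {
    printf("%-5s: +%d bytes\n", $case, foreach_delta($case));
}
```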
Monitoring memory usage
Not many tools exist (for PHP)
memory_get_usage()
OS' help (/proc , pmap , etc...)
Valgrind with massif tool
PHP Extensions
Xdebug
memprof
memtrack
Quick intro to memprof
https://github.com/arnaud-lb/php-memory-profiler
$b = range(1, 1024 * 1024); /* a lot of memory */
$b[] = foo();
loader('/Zend/Date.php'); /* a lot of PHP source code */
memprof_dump_callgrind(fopen('/tmp/trace.out', 'w'));
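A dump call can be wrapped so the script still runs where memprof is not loaded. A minimal sketch: dump_memprof() is a hypothetical helper, and memprof_enabled() is assumed to be provided by the installed version of the extension; check its README before relying on it.

```php
<?php
/**
 * Write a callgrind-format memory profile to $path if memprof is active.
 * Returns false when the extension is missing or profiling is not enabled.
 */
function dump_memprof(string $path): bool
{
    // Assumption: memprof_enabled() exists in the installed memprof version.
    if (!function_exists('memprof_enabled') || !function_exists('memprof_dump_callgrind')) {
        return false;
    }
    if (!memprof_enabled()) {
        return false;
    }
    $fh = fopen($path, 'w');
    if ($fh === false) {
        return false;
    }
    memprof_dump_callgrind($fh);
    fclose($fh);
    return true;
}

var_dump(dump_memprof(sys_get_temp_dir() . '/trace.out'));
```

The resulting file can then be opened with a callgrind viewer such as KCachegrind.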