PHP's Resources and garbage collection
Today I found a nice bug/feature/whatchamacallit in PHP. I was playing around with writing a daemon, and if you have any experience writing daemons (in any language), you know there are a few rules you have to live by. For instance: setting your effective uid and gid to a non-privileged user (in case you needed to do some privileged initialization, like opening a socket on a TCP port < 1024), making the process a session leader with posix_setsid(), and redirecting the stdio file descriptors. That last part is where something went wrong, and it took a while to find and fix.
So, what’s the case? I’ve written a simple proof of concept for a bit of code that I wanted to run as a daemon. There are multiple ways to do this, ranging from bad to very bad: placing it into crontab, just adding a & when starting the app, and many other strange and not very secure or effective ways. Because it’s a PoC and not very OO’ish, I decided just to create a daemonize() function that gets called before the “main loop”, so that the code runs as a nice daemon stowed away in the background. If I want to do some debugging, I only have to remove the daemonize() call and the system will run in the foreground. Easy-peasy.
The gist of the daemonize() function looked something like this:
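The listing itself is not reproduced here. As a rough sketch only, inferred from the system calls in the strace output further down and assuming the pcntl and posix extensions are loaded, the function may have looked roughly like this (the error handling is an addition for illustration):

```php
<?php
// Reconstruction for illustration: inferred from the strace output,
// not copied from the original listing. Needs ext-pcntl and ext-posix.
function daemonize(): void
{
    // Fork and let the parent exit, so the child is not a process group leader.
    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('fork failed');
    }
    if ($pid > 0) {
        exit(0);
    }

    // Start a new session, detaching from the controlling terminal.
    if (posix_setsid() === -1) {
        throw new RuntimeException('setsid failed');
    }

    // Close stdio and reopen it on /dev/null straight away, so that
    // descriptors 0, 1 and 2 are reused by the three fopen() calls.
    fclose(STDIN);
    fclose(STDOUT);
    fclose(STDERR);
    $stdin  = fopen('/dev/null', 'r');
    $stdout = fopen('/dev/null', 'w');
    $stderr = fopen('/dev/null', 'w');
    // The new streams are held in local variables.
}
```

Calling daemonize() once before the main loop is all that is needed; leaving the call out keeps the script in the foreground.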
But whatever I tried, when I ran the application it just exited without leaving a daemon running. After some Xdebug and comment-things-out debugging, I found that the issue had to do with the fclose() and fopen() lines.
Why open and close in the first place?
Suppose you have a daemon that reads something from stdin. It might wait for a keypress, or for something else. By redirecting stdin to /dev/null, your application automatically receives an EOF upon read, so it will not wait indefinitely for something to read. The same goes for stdout: you can still write to stdout without any errors, but the output doesn’t end up anywhere.
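A small sketch of that behaviour, assuming a Unix-like system where /dev/null exists. The variable names are only for illustration:

```php
<?php
// Reading from /dev/null: there is never anything to read.
$in    = fopen('/dev/null', 'r');
$data  = fread($in, 1024);   // returns an empty string immediately, no blocking
$atEof = feof($in);          // the empty read marks the stream as being at EOF

// Writing to /dev/null: the write succeeds, the bytes are discarded.
$out     = fopen('/dev/null', 'w');
$written = fwrite($out, "hello\n");

var_dump($data, $atEof, $written);

fclose($in);
fclose($out);
```

The read comes back empty straight away and the write reports all six bytes as written, which is exactly what a daemon wants from its stdio.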
Normally, this is done by redirecting the stdio handles (file descriptors 0, 1 and 2) to /dev/null, but such a mechanism is not present in PHP. The next best thing we can do is close the stdio descriptors and open new ones directly. When a file is opened, the OS hands out the lowest-numbered free descriptor, so if any of the first three descriptors is closed, the next fopen() call you make will reuse it. This means that if you close stdio, you MUST open it again straight away, otherwise the next file or socket you open ends up as your stdin, stdout or stderr and bad things will happen.
Debugging our script, Chuck Norris style
After debugging for a while, I tried to see what was happening internally by using strace. This tool lets you see what happens under the hood by showing which calls are being made to the operating system. If you know how to interpret its output, it can save you literally hours of debugging:
$ strace -ff php daemon.php
....SNIP....
[pid 20334] munmap(0x7fe51b865000, 2201520) = 0
[pid 20334] munmap(0x7fe51d7a7000, 2293680) = 0
[pid 20336] set_robust_list(0x7fe527c8eac0, 0x18) = 0
[pid 20336] setsid() = 20336
[pid 20336] close(0) = 0
[pid 20336] munmap(0x7fe527aeb000, 4096) = 0
[pid 20336] close(1) = 0
[pid 20336] munmap(0x7fe527aea000, 4096) = 0
[pid 20336] close(2) = 0
[pid 20336] lstat("/dev/null", {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0
[pid 20336] lstat("/dev", {st_mode=S_IFDIR|0755, st_size=3700, ...}) = 0
[pid 20336] open("/dev/null", O_RDONLY) = 0
[pid 20336] fstat(0, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0
[pid 20336] lseek(0, 0, SEEK_CUR) = 0
[pid 20336] open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 1
[pid 20336] fstat(1, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0
[pid 20336] lseek(1, 0, SEEK_CUR) = 0
[pid 20336] open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 2
[pid 20336] fstat(2, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0
[pid 20336] lseek(2, 0, SEEK_CUR) = 0
[pid 20336] close(0) = 0
[pid 20336] close(1) = 0
[pid 20336] close(2) = 0
[pid 20336] write(1, ".", 1) = -1 EBADF (Bad file descriptor)
[pid 20336] write(3, "173\0<?xml version=\"1.0\" encoding"..., 178) = -1 EPIPE (Broken pipe)
[pid 20336] --- SIGPIPE (Broken pipe) @ 0 (0) ---
[pid 20336] recvfrom(3, "", 128, 0, NULL, NULL) = 0
[pid 20336] close(3) = 0
[pid 20336] munmap(0x7fe527aa9000, 266240) = 0
[pid 20336] munmap(0x7fe517821000, 2190464) = 0
[pid 20336] munmap(0x7fe518302000, 2129368) = 0
This is strange. After the setsid() call, the next lines do a close(0), close(1) and close(2). Those close stdin, stdout and stderr respectively, so this is what actually gets called when you issue an fclose() in PHP. The next lines look pretty familiar as well. PHP does a couple of lstat() calls on /dev/null and then opens that file on the open("/dev/null", ...) lines. The number at the end of each line is the file descriptor that was handed out for that file, so you can see that it allocates file descriptors 0, 1 and 2 respectively. Everything seems to be working!
However, a little bit further on we see a close(0), close(1) and close(2) AGAIN. The line after that is just some debugging output (printing a single dot), but you can see that it results in -1 with EBADF, because it tries to write to STDOUT, which was just closed.
Our issue is thus with the extra close() calls that get made. Where did they come from? Does it have something to do with fork()? Some PHP magic? Something else?
PHP garbage collection
When you create a value in PHP, PHP internally keeps a special counter for it called a reference count. It’s a simple counter that keeps track of how many variables are using that value. For instance, if you instantiate a class and assign it to a variable $foo, the reference count of that object becomes 1. If you then do $bar = $foo, both $bar and $foo reference your object and its reference count becomes 2. When we do $foo = 1; afterwards, PHP sees that $foo no longer references your object and decreases the reference count again.
This makes it cheap to figure out whether PHP’s data, like variables or big objects, is still being referenced. If something no longer has any references, PHP can free its memory. This keeps memory usage as low as possible, so you can use lots of variables throughout your application without using tons of memory. The process of freeing memory when no references are left is called garbage collection (GC).
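The counting can be made visible with a destructor, which PHP runs at the moment the last reference to an object disappears. This is a minimal sketch assuming PHP 7.4 or later (for the typed static property). The Tracked class and its log are made up for the demonstration:

```php
<?php
// Hypothetical class that records when its instance is destroyed.
final class Tracked
{
    public static array $log = [];

    public function __destruct()
    {
        self::$log[] = 'destroyed';
    }
}

$foo = new Tracked();   // one reference to the object
$bar = $foo;            // two references to the same object
$foo = 1;               // back to one reference: the object stays alive

$stillAlive = (count(Tracked::$log) === 0);

$bar = null;            // last reference gone: the destructor runs right here

$destroyed = (Tracked::$log === ['destroyed']);

var_dump($stillAlive, $destroyed);
```

Both flags should come out true: overwriting $foo alone does not destroy the object, while dropping $bar, the last reference, does.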
But what has GC got to do with our bug?
The reason is that we do our fclose() and fopen() calls inside a function, daemonize(). We assign the stream resources returned by fopen() to our $stdin, $stdout and $stderr variables, but these variables are local to the function. As soon as the function ends, PHP detects that these variables aren’t used anymore and cleans them up, because there are no references left. For resources, this means that PHP automatically closes them. This is why we get the extra close() calls: it is just PHP cleaning up.
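The same effect can be observed without touching the real stdio descriptors. The sketch below assumes PHP 7.0 or later for get_resources() and uses php://memory streams as stand-ins for /dev/null. All function names are hypothetical:

```php
<?php
// Number of stream resources PHP currently considers open.
function openStreamCount(): int
{
    return count(get_resources('stream'));
}

// Opens a stream into a local variable only.
function openLocally(): void
{
    $handle = fopen('php://memory', 'r+');
    // $handle is the only reference. When the function returns,
    // the reference count drops to zero and PHP closes the stream.
}

// Opens a stream and keeps a reference in a global variable.
function openAndKeep(): void
{
    global $kept;
    $kept = fopen('php://memory', 'r+');
}

$before = openStreamCount();

openLocally();
$afterLocal = openStreamCount();

openAndKeep();
$afterKept = openStreamCount();

printf("before=%d afterLocal=%d afterKept=%d\n", $before, $afterLocal, $afterKept);
```

The count after openLocally() should equal the starting count, because the stream was closed on return. After openAndKeep() it should be one higher, because the global still references the stream.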
So, now we know the issue, and we can actually fix it. The fix comes down to one thing: making sure that we always keep a reference to these resources so they don’t get garbage collected. Because we are inside a function, we can use global variables, so the variables still exist after we exit the function. In our case, we can fix it like this:
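The fixed listing is not reproduced here either. A sketch of the idea, again assuming the pcntl and posix extensions, with $STDOUT and $STDERR named by analogy with the $STDIN variable mentioned in the text:

```php
<?php
// Illustrative sketch of the fix, not the original listing.
function daemonize(): void
{
    // Globals outlive the function call, so the new streams keep a reference.
    global $STDIN, $STDOUT, $STDERR;

    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('fork failed');
    }
    if ($pid > 0) {
        exit(0); // parent leaves, child carries on as the daemon
    }

    if (posix_setsid() === -1) {
        throw new RuntimeException('setsid failed');
    }

    // Close stdio and immediately reopen it on /dev/null.
    fclose(STDIN);
    fclose(STDOUT);
    fclose(STDERR);
    $STDIN  = fopen('/dev/null', 'r');
    $STDOUT = fopen('/dev/null', 'w');
    $STDERR = fopen('/dev/null', 'w');
}
```

The only change compared to a version with local variables is the global declaration: the three streams now stay referenced after daemonize() returns.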
So much for globals being evil, huh? Because we now use global variables, PHP will leave the resources alone and won’t close them. But make sure you don’t do something like $STDIN = "foo";, because that drops the reference count of the resource held by $STDIN back to 0, and the resource will be cleaned up and closed again.
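A short sketch of that pitfall, assuming PHP 7.0 or later for get_resources() and using a php://memory stream as a harmless stand-in for the reopened stdin:

```php
<?php
// Stand-in for the stream that daemonize() would store in the global.
$STDIN = fopen('php://memory', 'r');

$before = count(get_resources('stream'));

// Overwriting the variable removes the last reference to the stream,
// so PHP closes it on the spot.
$STDIN = "foo";

$after = count(get_resources('stream'));

var_dump($before - $after);
```

The difference should be exactly one stream: the one that was silently closed by the assignment.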
Obviously, you don’t have this issue when you fclose() and fopen() outside any function. This is because you are already in the global space, and there is nothing to exit from (apart from exiting the application).