Hi there,
I recently updated from VDR 1.4.6 to 1.5.2 (with no problems so far, by the way :)). But today I noticed something that looks to me like a small bug in VDR's shutdown routines. My shutdown script checks for various running processes (or connections) and does not quit VDR if at least one of the tests returns true. But since 1.5.2 (or probably since 1.5.0), the shutdown script remains as a zombie in such cases until VDR actually quits.
I haven't checked the sources yet, but I guess a call to close() (or pclose()) for the opened pipe(s) is missing somewhere.
Chris wrote:
The shutdown script has always been a fire-and-forget script that did not evaluate any return codes or output. The only change that landed together with the VDR shutdown rewrite in 1.5.1 is that the script is now called detached from VDR, so it can run in parallel and will not terminate together with VDR. Some shutdown scripts may need to be adapted.
If you want to investigate: thread.c:SystemExec() is called with the new parameter Detached=true. STDIN is redirected to /dev/null, STDOUT and STDERR are the same as VDR's, and all other FDs get closed as before.
Cheers,
Udo
Monday, May 28, 2007, 9:46:12 PM, Udo wrote:
You can reproduce the problem pretty easily:
If I use an 'empty' shutdown script that consists of just the "#!/bin/sh" line (or bash or whatever), the script itself becomes a zombie. And if I use a script with just an empty line, "sh" becomes the zombie. So either way, something is wrong and it cannot be the script's fault.
It would be helpful if someone could test this behavior.
My assumption is that the problem occurs because of the missing wait call (if SystemExec is called 'detached'). I know that if VDR waited in there, the script wouldn't run simultaneously. But if VDR never waits for the child's PID, the child's termination never gets handled, and imho that's why the script remains as a zombie.
Calling waitpid(-1, &dummy, WNOHANG) at some later point should do the trick. Or waitpid() explicitly for the child's PID, if we want to store the PID somewhere.
-- Chris
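The non-blocking reap suggested above can be sketched roughly as follows. This is only an illustration, not VDR code: ReapFinishedChildren() and DemoReap() are hypothetical names, and where VDR would call such a function (for example once per main-loop pass) is an assumption.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not part of VDR): reap every child that has already
// terminated, without blocking. Returns the number of children reaped.
int ReapFinishedChildren(void)
{
  int reaped = 0;
  int status = 0;
  // With WNOHANG, waitpid() returns > 0 for a reaped child, 0 if children
  // exist but none has exited yet, and -1 if there are no children at all.
  while (waitpid(-1, &status, WNOHANG) > 0)
        reaped++;
  return reaped;
}

// Hypothetical demo: start a child that exits at once (it would stay a
// zombie if nobody waited for it), then poll until it has been reaped.
int DemoReap(void)
{
  pid_t pid = fork();
  assert(pid >= 0); // sketch only: real code would log the error and return
  if (pid == 0)
     _exit(0); // child process: terminate immediately
  int reaped = 0;
  for (int i = 0; i < 200 && reaped == 0; i++) {
      reaped = ReapFinishedChildren();
      if (reaped == 0)
         usleep(10000); // child not finished yet, try again in 10 ms
      }
  return reaped;
}
```

The drawback of this approach is that the zombie stays around until the next poll, and it only goes away at all if something keeps calling the helper.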
Chris vdr@p-lost.franken.de wrote:
I once found this code somewhere and have been using it ever since:
int System(const string &cmd)
{
  // The parent process forks and then waits right there for the child
  // to terminate. The child process then forks again, giving us a child
  // and a grandchild. The child exits immediately (and hence the parent
  // waiting for it notices its death and continues to work). Now the
  // grandchild does whatever the child was originally supposed to do.
  // Since its parent died, it is inherited by init, which will do
  // whatever waiting is needed.

  switch (fork()) {
    case 0:
      if (!fork())
        system(cmd.c_str());
      _exit(0);
      break;
    case -1:
      break;
    default:
      wait(NULL);
  }

  return (0);
}
I didn't look at VDR's SystemExec, but maybe this code snippet comes in handy.
best regards ... clemens
Chris wrote:
I've checked this again, and you really do have to take care of child processes even when they run in their own process group. After some research, I think the double-fork trick is the best way to fix this: make the script a grandchild of VDR and let the intermediate child exit instantly, so that VDR can wait for it without blocking for long. This makes the script an orphan that no longer becomes a zombie.
A patch is attached.
If the script may run longer than VDR, then VDR cannot wait for it. That's the problem.
Cheers,
Udo
--- thread.c.bak 2007-05-29 21:07:27.000000000 +0200
+++ thread.c 2007-05-29 21:12:56.000000000 +0200
@@ -12,6 +12,7 @@
 #include <linux/unistd.h>
 #include <malloc.h>
 #include <stdarg.h>
+#include <stdlib.h>
 #include <sys/resource.h>
 #include <sys/syscall.h>
 #include <sys/time.h>
@@ -507,7 +508,7 @@
 
   if (pid > 0) { // parent process
      int status = 0;
-     if (!Detached && waitpid(pid, &status, 0) < 0) {
+     if (waitpid(pid, &status, 0) < 0) {
         LOG_ERROR;
         return -1;
         }
@@ -515,6 +516,9 @@
      }
   else { // child process
      if (Detached) {
+        // Fork again and let first child die
+        // Grandchild stays alive without parent
+        if (fork() > 0) exit(0);
         // Start a new session
         pid_t sid = setsid();
         if (sid < 0)
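
For readers without the VDR sources at hand, the idea behind the patch can be sketched in isolation roughly like this. It is a minimal sketch and not VDR's actual SystemExec(): RunDetachedSketch() is a hypothetical name, running the command through /bin/sh -c is an assumption, and the closing of all other file descriptors mentioned earlier in the thread is left out for brevity.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not VDR's SystemExec): run Command as a grandchild so
// that the caller never has to wait for the command itself.
// Returns 0 if the intermediate child was started and reaped, -1 on error.
int RunDetachedSketch(const char *Command)
{
  assert(Command != NULL);
  pid_t pid = fork();
  if (pid < 0)
     return -1;
  if (pid > 0) { // parent process
     // Only the intermediate child is waited for, and it exits at once,
     // so this does not block for the runtime of Command.
     int status = 0;
     if (waitpid(pid, &status, 0) < 0)
        return -1;
     return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
     }
  // intermediate child: fork again and die immediately
  pid_t grandchild = fork();
  if (grandchild != 0)
     _exit(grandchild > 0 ? 0 : 1); // status 1 tells the parent the fork failed
  // grandchild: now an orphan, it gets reaped by init when it terminates
  setsid(); // start a new session
  int devnull = open("/dev/null", O_RDONLY);
  if (devnull >= 0) {
     dup2(devnull, STDIN_FILENO); // STDIN comes from /dev/null
     if (devnull != STDIN_FILENO)
        close(devnull);
     }
  execl("/bin/sh", "sh", "-c", Command, (char *)NULL);
  _exit(127); // only reached if execl() failed
}
```

A call such as RunDetachedSketch("sleep 60") should return almost immediately and leave no child behind for the caller to reap, which is the property the shutdown script needs when it may outlive VDR.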