[PATCH] forstdin: Fix possible error/hanging w/ -p

From: Olivier Brunel <jjk_at_jjacky.com>
Date: Tue, 9 Jun 2015 20:17:00 +0200

Basically there's a race condition with signals, and it goes something
like this:
- we launches child1, and add pid1 to the array of pids
- we launches child2, when suddenly!
- SIGCHLD happens: in the handler, we wait() for pid1, removing it from
  pids, then wait for pid2, ignoring it since it's not (yet) in the array,
  wait again and be done
- back in the normal path, we now add pid2 to pids
- later on, we believe there's still a reason to wait (pid2) when in fact,
  there's not. Leading to e.g. ECHILD in waitn()

(Noting that this wasn't limited to a single extra pid in the end.)

Blocking SIGCHLD until we've added the pid to the array ensures this
doesn't happen. (Thus we bring back the call to waitn removed in 5053ea39,
which didn't really fix the issue.)

Signed-off-by: Olivier Brunel <jjk_at_jjacky.com>
Hey Laurent,
So I have been hit by a bug in forstdin for some time, and finally took some
(time that is) to look into it. Then saw you (tried to) address it recently (in
5053ea39) as I pulled to maybe rebase my fix... However, I'm not sure your fix
really does the trick, plus (as you've seen by now) I have an explanation for
the bug, so there.
This bug would manifest in two ways for me, either forstdin would exit 111 (w/
ECHILD in waitn()), or it would just hang indefinitely. I believe this was two
manifestations of the same issue, and due to some compilation options or
something I'd either get one or the other. I never really looked into it more,
honestly, and the one I faced working on it was the exit 111 (I got the other
from my package compilation, where things are compiled w/ different options
Anyhow, I'm not sure about your fix with sig_pause(), but I did a quick try and
got it to hang indefinitely I'm afraid. So here's a proper fix (I hope), also
restoring the waitn() call.
PS: If I may, little tip/advice re: git: this is why it's better to separate
things when you commit. That is, if you'd made a commit to address this bug
only, one for the bugfix for forbacktickx and another one to prepare for the
rc, it might be better/make things easier.
That is, one commit only means/relates to one thing, so when mentioning it it's
clearer what issue we're dealing with. Also it's easier when using blame later
on, or to e.g. revert it, as might have been done here.
 src/execline/forstdin.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/src/execline/forstdin.c b/src/execline/forstdin.c
index 3fb52eb..b620c89 100644
--- a/src/execline/forstdin.c
+++ b/src/execline/forstdin.c
_at_@ -128,12 +128,14 @@ int main (int argc, char const **argv, char const *const *envp)
       if (!stralloc_0(&modif)) strerr_diefu1sys(111, "stralloc_0") ;
       if (!env_merge(newenv, envlen+2, envp, envlen, modif.s, modif.len))
         strerr_diefu1sys(111, "merge environment") ;
+      if (pids.s) sig_block(SIGCHLD) ;
       pid = el_spawn0(argv[1], argv + 1, newenv) ;
       if (!pid) strerr_diefu2sys(111, "spawn ", argv[1]) ;
       if (pids.s)
         if (!genalloc_append(pid_t, &pids, &pid))
           strerr_diefu1sys(111, "genalloc_append") ;
+        sig_unblock(SIGCHLD) ;
_at_@ -147,11 +149,11 @@ int main (int argc, char const **argv, char const *const *envp)
     stralloc_free(&modif) ;
   if (pids.s)
-    for (;;)
-    {
-      sig_block(SIGCHLD) ;
-      if (!pids.len) break ;
-      sig_pause() ;
-    }
+  {
+    if (sig_restore(SIGCHLD) < 0)
+      strerr_diefu2sys(111, "restore", " SIGCHLD handler") ;
+    if (!waitn(genalloc_s(pid_t, &pids), genalloc_len(pid_t, &pids)))
+      strerr_diefu1sys(111, "waitn") ;
+  }
   return 0 ;
Received on Tue Jun 09 2015 - 18:17:00 UTC

This archive was generated by hypermail 2.3.0 : Sun May 09 2021 - 19:38:49 UTC