notification-fd inadvertently closed by selfpipe_finish on FreeBSD?

From: Saj Goonatilleke <saj_at_discourse.org>
Date: Sun, 18 Aug 2024 03:38:09 +1000

Hello,

Something went wrong with s6-log after a reboot today. :)

    s6-log: fatal: invalid notification fd: Bad file descriptor

The loggers in my supervision trees are all s6-log -d 6 ...
with a matching number in notification-fd. (Six is a nice number, yes?)

s6 was installed from packages on a machine running FreeBSD.
I follow the quarterly branch of the FreeBSD ports tree.
About one month ago, that ports tree went from:

    skalibs 2.13.1.1
    execline 2.9.3.0
    s6 2.11.3.2

to:

    skalibs 2.14.1.1
    execline 2.9.5.1
    s6 2.12.0.4

This was my first reboot after that bump.
As a quick bodge, I rolled back and rebooted again.
Then I tried to work out what had caused the upset...

The problem reproduces with -d 6 but not -d 42. High fds are fine.

    970bbeb Defork s6-supervise (!)

This commit drew my attention. Previously, the call to selfpipe_finish()
was sequenced before the call to fd_move(notif, notifyp[1]).
I built everything from their respective trunk tips and traced:
sure enough, the read-side of the selfpipe happened to be at fd 6.
cspawn_child_exec() now executes selfpipe_finish() very late,
after it calls fd_move() on the notification-fd.
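fd_move() dup2()s the noti-fd onto fd 6, blatting the selfpipe fd that lived there.
selfpipe_finish() then closes fd 6, which by now is the noti-fd,
so the noti-fd is closed by accident.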
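
To illustrate the ordering problem outside of s6, here is a minimal sketch using only plain POSIX calls. It is not the skalibs code: move_fd() and notif_fd_status() are hypothetical names, and move_fd() only approximates what a "move this fd" helper does. The colliding fd number is chosen by the kernel (lowest free fd >= 10) rather than hard-coded to 6.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: make `to` refer to what `from` refers to,
   then close `from`. Whatever was open at `to` is silently replaced. */
static int move_fd(int from, int to)
{
  if (from == to) return 0;
  if (dup2(from, to) < 0) return -1;
  return close(from);
}

/* Hypothetical demo. A stand-in "selfpipe" read end is parked on some
   fd number `target`, and the notification pipe's write end is then
   moved onto that same number.
   finish_first != 0: close the selfpipe, then move (old ordering).
   finish_first == 0: move, then close the selfpipe (new ordering).
   Returns 0 if `target` is a valid fd afterwards, the errno reported
   by fcntl() if it is not, or -1 if setup failed. */
int notif_fd_status(int finish_first)
{
  int sp[2], np[2];
  int target, status;

  if (pipe(sp) < 0) return -1;
  if (pipe(np) < 0) { close(sp[0]); close(sp[1]); return -1; }

  /* Park the selfpipe read end on a fresh fd number >= 10. */
  target = fcntl(sp[0], F_DUPFD, 10);
  if (target < 0) {
    close(sp[0]); close(sp[1]); close(np[0]); close(np[1]);
    return -1;
  }
  close(sp[0]);
  assert(np[1] != target);

  if (finish_first) {
    close(target);                 /* selfpipe read end goes away */
    close(sp[1]);
    if (move_fd(np[1], target) < 0) { close(np[0]); close(np[1]); return -1; }
  } else {
    if (move_fd(np[1], target) < 0) {
      close(target); close(sp[1]); close(np[0]); close(np[1]);
      return -1;
    }
    close(target);                 /* stale number: this is now the notification fd */
    close(sp[1]);
  }

  /* Is the notification fd still open where the program expects it? */
  status = fcntl(target, F_GETFD) < 0 ? errno : 0;
  if (status == 0) close(target);
  close(np[0]);
  return status;
}
```

Calling notif_fd_status(1) models the old sequencing and notif_fd_status(0) the new one. Only the second should leave the target fd invalid, which would match the "Bad file descriptor" message from s6-log.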

Is this a bug, or am I holding it wrong?

I also use -d 6 on macOS but am yet to encounter the problem there.
(I swept a bunch of fds to no avail. Does not repro on macOS.)

configure on FreeBSD:

    Checking whether system has POSIX_SPAWN_SETSID...
      ... no
    Checking whether system has POSIX_SPAWN_SETSID_NP...
      ... no

configure on macOS:

    Checking whether system has POSIX_SPAWN_SETSID...
      ... yes
    Checking whether system has POSIX_SPAWN_SETSID_NP...
      ... no

So maybe skalibs' cspawn...
...calls cspawn_fork() on FreeBSD if CSPAWN_FLAGS_SETSID is set.
...calls cspawn_pspawn() on macOS under the same conditions.

There is no call to selfpipe_finish() in cspawn_pspawn.
(I guess because of posix_spawnattr_setsigmask?)

Thank you for reading -- and for s6.
Received on Sat Aug 17 2024 - 19:38:09 CEST
