Re: Secondary child process support in s6? from Laurent Bercot on 2025-10-09 (skaware)

From: Laurent Bercot <ska-skaware_at_skarnet.org>
Date: Wed, 08 Oct 2025 22:06:34 +0000

>Would it make sense for s6-supervise to inform s6-svscan of its
>child process ID?

  No, it is a deliberate design decision that s6-svscan does not look
deeper into the service tree and only watches s6-supervise processes.
The goal is to keep s6-svscan as simple as can be, and it is a high-
value goal since s6-svscan is a suitable candidate for pid 1.

  Any transmission of information from s6-supervise to s6-svscan would
mean that s6-svscan now has to listen to N channels; any action to
take depending on the transmitted data means that s6-svscan now has
to implement some policy that is normally only the domain of
s6-supervise. This adds significant complexity, with a non-negligible
number of failure cases, just for the goal of aiding recovery in case
an s6-supervise instance dies, which, again, is something that has never
happened yet.

  If you have dying s6-supervise processes, that is the thing you need
to fix first. The current s6 architecture will work in degraded
mode, which is obviously not ideal but it will still work; as an admin,
if something has killed one of your s6-supervise processes, you likely
have bigger problems to deal with than the new s6-supervise not being
able to start its service until the old one has died. If your old
service is still alive, then *you are still serving*, and s6 is doing
its job, despite something being very wrong on your machine.

  s6-supervise death is a problem of perception, and of anxiety. I know.
I feel that anxiety too. I wasn't sure how it was going to pan out when
I released s6 that way. But after 14 years of use, and 11 years of
daemontools use beforehand without a single supervisor death either,
I can confidently say that it's going to be all right.

  If I wanted to 100% prevent this from ever happening even in our
worst nightmares, rather than adding transmission channels between
s6-supervise and s6-svscan, I would simply write a single supervisor
like perpd, that watches N services at a time. That is how a lot of
init systems work, including dinit, nitro, and of course systemd.
But cramming both s6-svscan and s6-supervise functionality into a
single process ends up in more code complexity overall than the
current s6 design, and I feel more confident (and less anxious) when
minimizing complexity. The current implementation is close to optimal
when it comes to the functionality/complexity ratio.

  Additionally, having s6-svscan and s6-supervise so loosely coupled
means that you can run an instance of s6-supervise in the wild, without
necessarily being linked to an s6-svscan supervision tree. This is an
uncommon pattern, but it has come up once or twice for me (read: more
often than a supervisor's untimely death), and there are probably users
who rely on this being possible. I'd rather not break an existing
workflow unless it is proven necessary.
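
  To illustrate the standalone pattern, here is a minimal sketch in
Python. It is not part of s6. It assumes only the s6 conventions that a
service directory contains an executable "run" script and that
s6-supervise takes the service directory as its argument. The helper
names (make_service_dir, supervise_argv, start_standalone) and the
"sleep 3600" service are hypothetical, chosen for the example.

```python
import os
import shutil
import subprocess


def make_service_dir(parent, name, command):
    """Create a minimal service directory under `parent`.

    The directory contains a single executable `run` script that
    execs `command` (hypothetical helper, for illustration only).
    """
    servicedir = os.path.join(parent, name)
    os.makedirs(servicedir)
    run_path = os.path.join(servicedir, "run")
    with open(run_path, "w") as f:
        f.write("#!/bin/sh\nexec " + command + "\n")
    os.chmod(run_path, 0o755)  # the run script must be executable
    return servicedir


def supervise_argv(servicedir):
    """Command line for one s6-supervise instance on `servicedir`."""
    return ["s6-supervise", servicedir]


def start_standalone(servicedir):
    """Start s6-supervise directly, with no s6-svscan above it.

    Returns the Popen handle, or None if s6-supervise is not on PATH.
    The caller, not s6-svscan, is then responsible for this process.
    """
    if shutil.which("s6-supervise") is None:
        return None
    return subprocess.Popen(supervise_argv(servicedir))
```

  Calling start_standalone(make_service_dir("/tmp", "demo", "sleep 3600"))
would leave one s6-supervise process watching one service, with nothing
above it to restart the supervisor itself; that trade-off belongs to
whoever launches it this way.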

--
  Laurent

Received on Thu Oct 09 2025 - 00:06:34 CEST
