Re: First time caller to the show - am I understanding the fifo trick correctly?

From: Laurent Bercot <ska-supervision_at_skarnet.org>
Date: Wed, 25 Aug 2021 11:57:11 +0000

>Forgiving privilege separation failures and minor grammatical mistakes, does it look as if I understand the fifo trick's application in practice?

  Hi Ellenor,
  Yes, I think you have the right idea.

  The goal here is to redirect s6-svscan's own stdout and stderr to
the stdin of the catch-all logger process, so that the supervision
tree's messages, and the messages from every service that lacks a
dedicated logger, go to the catch-all logger instead of /dev/console.
(Because /dev/console is a terrible default place to send logs and
should only be used for very critical messages such as kernel panics,
or, in our userland case, for catch-all logger failures.)

  The problem is that we want the catch-all logger to run as a service
under the supervision tree, so the s6-log process does not exist yet
when we exec into s6-svscan: it will be spawned later as a grandchild
of s6-svscan (with an s6-supervise intermediary). So we cannot use an
anonymous pipe for this.

  We use a fifo instead: we can redirect init's stdout and stderr to
a fifo, and later on, when the catch-all logger starts, we can
instruct it (in its run script) to read from the fifo.

  But the Unix fifo semantics say that we *cannot* open a fifo for
writing while there is no reader: open() would either block (default
flags) or return -1 ENXIO (with O_NONBLOCK). So the "fifo trick" is:
1. open the fifo for reading
2. open it for writing, which now works
3. close the reading end
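
  As an illustration only, the three steps above can be sketched in
Python. This is not how s6-linux-init implements them; the function
names and the fifo path are made up for the example. One detail the
list does not spell out: the read-side open in step 1 needs O_NONBLOCK,
otherwise it would itself block waiting for a writer.

```python
import errno
import os
import tempfile


def fifo_trick(path):
    """Return a write-only fd on the fifo at `path`, with no reader left open."""
    # Step 1: open for reading. O_NONBLOCK is needed here, otherwise this
    # open() would itself block until a writer shows up.
    rfd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    # Step 2: open for writing; this succeeds at once because a reader exists.
    wfd = os.open(path, os.O_WRONLY)
    # Step 3: drop the reader; the write end stays valid.
    os.close(rfd)
    return wfd


def demo():
    """Return True if a reader-less non-blocking open for writing gave ENXIO."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "catchall-fifo")  # hypothetical name
    os.mkfifo(path)
    try:
        # Without a reader, a non-blocking open for writing fails with ENXIO.
        try:
            os.close(os.open(path, os.O_WRONLY | os.O_NONBLOCK))
            first_error = None
        except OSError as e:
            first_error = e.errno
        # With the trick, we do get a usable write end.
        wfd = fifo_trick(path)
        os.close(wfd)
        return first_error == errno.ENXIO
    finally:
        os.unlink(path)
        os.rmdir(workdir)
```

  In a real stage 1, the returned descriptor would then be duplicated
onto stdout and stderr before executing into s6-svscan.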

At this point, any write() to the fifo will fail with -1 EPIPE. That is
not a problem per se, except it will also generate a SIGPIPE, so in
order to avoid crashing and burning, it is important to ignore SIGPIPE
at the very least - or, better, to make sure that no process writes to
the fifo until the catch-all logger is up. This is the case for
s6-svscan and s6-supervise, so our system structure is safe; but we
need to make sure that no other process starts before the catch-all
logger is up, else they will just eat a SIGPIPE and die.
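
  A small Python sketch of that failure mode, assuming SIGPIPE is
ignored so that the write reports EPIPE instead of killing the process
(the function name and fifo path are hypothetical):

```python
import errno
import os
import signal
import tempfile


def write_without_reader():
    """Return the errno from writing to a fifo that has no reader."""
    # Ignore SIGPIPE so the failed write is reported as EPIPE instead of
    # killing the process. (CPython already does this at startup; it is
    # spelled out here for clarity.)
    signal.signal(signal.SIGPIPE, signal.SIG_IGN)
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "fifo")
    os.mkfifo(path)
    try:
        # The fifo trick: reader first, then writer, then drop the reader.
        rfd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        wfd = os.open(path, os.O_WRONLY)
        os.close(rfd)
        try:
            os.write(wfd, b"lost message\n")
            return None
        except OSError as e:
            return e.errno
        finally:
            os.close(wfd)
    finally:
        os.unlink(path)
        os.rmdir(workdir)
```

  With the default SIGPIPE disposition, the same write would terminate
the process instead of returning an error.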

  In the s6-l-i model, s6-svscan is executed as soon as possible, on a
very minimal supervision tree that only contains the catch-all logger
and a few other essential "early services" (such as the shutdown daemon
and an early getty). All the rest of the initialization is done in
"stage 2 init", which is a script run as a child of s6-l-i's stage 1.
So the end of the "fifo trick" uses the Unix fifo semantics as a
synchronization mechanism:
4. fork
5. In the child, close our fd to the fifo
6. In the child, open the fifo for writing once again,
    *without* O_NONBLOCK.

  This last open() will block until the fifo has a reader. That
ensures the child will only resume once the parent has completed
its work and executed into s6-svscan, and the supervision tree has
started and the catch-all logger is running. Then the child can exec
into stage 2 init and perform the rest of the work with the guarantee
that the supervision tree is operational and all the stdout and stderr
messages go to the catch-all logger by default.
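
  The synchronization in steps 4 to 6 can be sketched as follows. This
is a toy model, not the s6-l-i code: the parent stands in for
s6-svscan plus the catch-all logger by sleeping and then opening the
fifo for reading, and the anonymous pipe exists only so that the
example can report how long the child stayed blocked. All names are
hypothetical.

```python
import os
import tempfile
import time


def sync_demo(delay=0.3):
    """Return how long the child stayed blocked in open(), in seconds."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "fifo")
    os.mkfifo(path)
    # Steps 1-3: end up with a write end and no reader.
    rfd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    wfd = os.open(path, os.O_WRONLY)
    os.close(rfd)
    # Anonymous pipe used only so this demo can report the child's timing.
    report_r, report_w = os.pipe()

    pid = os.fork()                          # step 4
    if pid == 0:
        try:
            os.close(report_r)
            os.close(wfd)                    # step 5
            start = time.monotonic()
            fd = os.open(path, os.O_WRONLY)  # step 6: blocks until a reader appears
            waited = time.monotonic() - start
            os.write(report_w, repr(waited).encode())
            os.close(fd)
        finally:
            os._exit(0)                      # a real stage 1 would exec stage 2 here

    os.close(report_w)
    # Stand-in for "s6-svscan starts and the catch-all logger opens the fifo".
    time.sleep(delay)
    logger_fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    waited = float(os.read(report_r, 64).decode())
    os.waitpid(pid, 0)
    for fd in (logger_fd, wfd, report_r):
        os.close(fd)
    os.unlink(path)
    os.rmdir(workdir)
    return waited
```

  Note that the parent keeps its own write end open the whole time;
the child's open() still blocks, because what it waits for is a
reader, not another writer.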

  To see exactly how to implement stage 1 init and the fifo trick as
an execline script, you can check out (or download) any version of
s6-l-i *prior to* 1.0.0.0; try version 0.4.0.1, downloadable from
skarnet.org if you type the URL by hand, and accessible via the
v0.4.0.1 tag in git. It is very different from what it is now, as in
there is no sysv compatibility at all, but stage 1 should be
understandable.

  A few months ago, I tried adding a few conditional compilation options
to s6-l-i to make it work under FreeBSD, but unfortunately the
organization of the FreeBSD init is so different from Linux's,
especially shutdown-wise, that my attempt only succeeded in turning
the package into an unholy plate of spaghetti. At some point in the
future, however, a similar-but-separate s6-freebsd-init package may
make sense.

--
  Laurent
Received on Wed Aug 25 2021 - 13:57:11 CEST
