Some comments/questions about s6-rc semantics from Avery Pennarun on 2016-07-11 (supervision)

From: Avery Pennarun <apenwarr_at_gmail.com>
Date: Mon, 11 Jul 2016 19:32:27 -0400

Hi all,

I recently started doing an experiment to convert our buildroot-based
system to use s6-rc as an init system. On the whole, it seems very
nice, but there's a collection of things that seemed awkward to me.
I'm wondering if there's a cleaner way for me to do what I want, or
whether there are plans to improve some of these seeming warts in the
future.

Let me just launch into my laundry list:

- Logs in s6-rc are weird and inconvenient compared to the equivalent
in plain s6 (or daemontools). In particular, I have to manually
redirect fd#2, I have to create a whole separate service for each
logger, and I have to create consumer-for/producer-for files. The
whole producer/consumer/pipeline design seems a little excessively
general. Wouldn't it be nicer to just allow a log/ subdir like s6
would, and if I really don't like it, *then* let me play around with
pipelines?

- Oneshots don't seem to get either pipelines or loggers. I have some
relatively complex oneshots, and I would really like to capture their
logs, if only to prefix them with the name of the oneshot.

- Oneshot up/down scripts *must* be execline, while longrun run/finish
scripts can use #! to specify any interpreter. Why the discrepancy?
Many of my scripts are quite complicated, and this just forces me to
use another layer of indirection, for reasons I don't understand.

- What's the rationale for using fd load/store operations between
pipeline elements, instead of just a mkfifo like daemontools uses?
The latter is much simpler and doesn't require a separate, error-prone
daemon.

- It would be really nice if I could provide a data/ directory in
oneshots and longruns, like I can with an s6 service. Especially
since up/down must be executed using execline, it would be nice if I
could at least direct them to run a full script in another file nearby
in the file tree, for clarity.

- It's unclear in the docs how the s6-rc/compiled/ directory is
supposed to be replaced. For testing, I have my compiled/ directory
mounted over NFS. I regenerate it by moving it aside and generating a
new one with s6-rc-compile (both running on the NFS server). Then I
run s6-rc-update on the NFS client, and it gets angry (presumably
because the old compiled/ contents are no longer reachable). Compiling
into a separate directory and moving a symlink fixes this, but is
really awkward if I want to generate the output using a makefile (and
have reproducible output from my build system, i.e. no random numbers
or timestamps). Is there a way to, say, have s6-rc hold the file open
(by inode) so that even if I rename or delete the original,
s6-rc-update can still compare old and new revisions?

- It's unclear whether the "s6-rc change" command is properly
re-entrant. Imagine if I have a bundle called "all" that contains all
my basic services. I run 's6-rc -u change all' which starts bringing
them up. While bringing up X, it realizes that in order to continue,
it must first bring up Y, which is part of 'all', but not listed as a
dependency of X (since X doesn't *always* require Y). I would like X
to be able to run 's6-rc -u change Y' to have it wait for Y, then
continue. I think this won't work because the running 's6-rc -u
change all' owns the locks. Is there a way to change the locking
mechanism to avoid the deadlock?

- I need the ability to "atomically cycle" a service (e.g. because its
config files have changed). For oneshots, that means down+up, and for
longruns, that means terminate+run. By atomically, I mean if two
different callers try to cycle the service at once, one cycle should
finish before the next one starts. Or even better, I'd like "at least once"
semantics: let's say A and B both add new files to /etc/daemon.d.
They both try to restart service X at the same time. There's no need
to actually restart X twice; we just need a guarantee that the most
recent start (not stop) happened *after* the later of A and B asked
for a restart, and then both restart requests can finish
simultaneously. Is there a good way to build these semantics around
the current system?

- Relatedly, I would like a command similar to s6-svwait that works for
any s6-rc service. It's fairly easy to translate a longrun into an s6
service and just use s6-svwait directly. However, that doesn't work
with oneshots, and I would quite often like to wait for a oneshot to
complete.

- It doesn't seem to be clearly documented what signal(s) s6-rc uses
to stop services. It also doesn't give much flexibility in what kind
of status you wait for. By comparison, s6-svc gives all the
flexibility I'd like.
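
To make the "at least once" restart semantics from the list above
concrete, here is a minimal sketch in Python. It is not part of s6 or
s6-rc; the class name CoalescingRestarter and the restart_fn callback
are hypothetical, and it only coordinates threads inside one process
(separate processes would need a shared lock and counter, which is not
shown). Each caller returns only after a restart that *began* after
its request has completed, and callers that arrive while a restart is
already running share the next one.

```python
import threading


class CoalescingRestarter:
    """Hypothetical helper: coalesce concurrent restart requests.

    Guarantee: request_restart() returns only after a restart that
    started after the request was made has completed successfully.
    """

    def __init__(self, restart_fn):
        self._restart_fn = restart_fn   # assumed: blocks until the service is back up
        self._cond = threading.Condition()
        self._started = 0               # sequence number of the last restart begun
        self._finished = 0              # sequence number of the last restart completed
        self._running = False

    def request_restart(self):
        with self._cond:
            # A restart already in progress may have read stale config,
            # so this request needs one that starts after now.
            target = self._started + 1
            while self._finished < target:
                if self._running:
                    self._cond.wait()   # someone else is restarting; re-check afterwards
                    continue
                self._running = True
                self._started += 1
                my_run = self._started
                self._cond.release()    # run the restart without holding the lock
                ok = False
                try:
                    self._restart_fn()
                    ok = True
                finally:
                    self._cond.acquire()
                    self._running = False
                    if ok:
                        self._finished = my_run
                    # On failure, waiters wake up and retry the restart themselves.
                    self._cond.notify_all()
```

With this shape, if A triggers a restart and five more callers ask for
one while it is still running, only one further restart happens and
all five return together when it completes. The callback could, for
instance, run 's6-rc -d change X' followed by 's6-rc -u change X', but
that pairing is an assumption here, not something the s6-rc
documentation prescribes for this purpose.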

Sorry for the long email, but I thought I'd get it all off my chest at
once. I'm really liking s6 and s6-rc so far, but it feels like if
these semantic issues could be cleared up, it would be that much more
elegant.

Thanks!

Have fun,

Avery
Received on Mon Jul 11 2016 - 23:32:27 UTC
