- 2018-04-08: VM crash at Gandi.
It can happen, no big deal. Except that alyss does not boot back up. It
was a Sunday, so no help from tech support. The next day, tech support
takes a couple of hours to answer but points me in the right direction:
the kernel boots, but can't find the rootfs. Investigation shows Gandi
changed the way their Xen PV installation presents hard disks to the
guests, so my grub configuration was obsolete. Problem solved after
about 24 hours of downtime. My main gripe here is that I cannot make
sure that it doesn't happen again: if the disk configuration changes again,
I have to modify the grub.cfg by hand, unless I install the whole
gandi-vm-config machinery that is a Python monster and a way for Gandi
to backdoor your machine as they please - which I obviously won't do.
- 2016-11-05: scheduled downtime for maintenance: alyss was
switched from Gandi's Xen hypervisor
infrastructure to their new KVM hypervisor infrastructure. I could
not make the "boot on a raw disk and have your custom kernel" feature
work, so it's still using a stock Gandi kernel for now.
- 2013-09-02: switch from antah to alyss, a virtual server at
Gandi. A few hiccups while fixing the
last bugs, but no major downtime. Complete switch to a homemade
distribution. No more hardware failures, no more distribution failures,
no more OpenSSH failures. The future is bright!
- 2007-07-20: Antah hardware failure. For several reasons, there's
one month of downtime. My apologies. Read the story
- 2006-02-28: Power failure at RedBus. antah doesn't boot when the
power comes back. Analysis shows that the last Debian upgrade messed up
the lilo configuration, and the kernel can't be found. Sigh. And they ask why
I don't trust Linux distributions.
Lilo installed manually, problem fixed.
Kernel upgraded to 18.104.22.168.
- 2005-03-04: antah's main disk has been having major problems
for a few days. I go to RedBus, take the disk home, and dump it onto
another one before it's too late. The machine is back up on 2005-03-06,
9h50 (GMT+1). Kernel upgraded to 2.6.11.
- 2004-08-16, 17h (GMT+2): unable to log in, so I immediately go
to RedBus and reboot. I can then log in and analyze. Problem:
sshd didn't like /dev/pts/100. Linux developers claim
it's a userland problem and OpenSSH developers claim it's a Linux
kernel problem. Great. Kernel upgraded to 22.214.171.124, I'll try to write
a workaround for the /dev/pts/100 problem before it arises again.
- 2004-05-04, 9h50 (GMT+2) to 2004-05-07, 16h00 (GMT+2):
scheduled ISP change. The whole story can be read
- 2004-02-25: 13h30 - 18h30 (Paris time, GMT+1):
scheduled kernel and boot system upgrade. No problems.
- 2003-08-27: the whole skarnet.org site was down, to support the
online demonstration of the FFII against software patents.
Downtime from 2003-08-27 at 03:00 GMT+2 to 2003-08-28 at 15:00 GMT+2.
Kernel upgraded to 2.4.22, init system upgraded.
- 2003-02-19: 6h - 8h (Paris time, GMT+1): scheduled electrical
upgrade. ClaraNet warns their users only a day in advance, claiming
that the previous upgrade was incomplete and must be fixed
immediately. The outage actually starts at 6:10 and ends at 10:35.
- 2002-12-12: 6h - 8h (Paris time, GMT+1): scheduled electrical
upgrade. Kernel upgraded to 2.4.20. No boot problems.
- 2002-10-08: Power outage at ClaraNet.
Antah doesn't boot properly when the power comes back. Cause: modutils failure -
hardcoded /sbin paths in binaries and in kernel. Sigh. Upgraded to
2.4.19, without module support; modutils thrown out.
No downtime is planned for the moment.