Forcing insserv to start sshd early

Many distributions are using the `insserv` based dependency following at boot time.  After a bit of searching, I found very little actual documentation on the subject.  Here’s the process:

  1. Add override files to /etc/insserv/override/
  2. The files must contain ‘### BEGIN INIT INFO’ and ‘### END INIT INFO’, else insserv will ignore them.
  3. Some have indicated that you can override missing LSB fields with this method, however, it does require the Default-Start and Default-Start options even though you wouldn’t expect to need to override those.
  4. The name of the file in /etc/insserv/override must be equal to the name in /etc/init.d *not* the name it “Provides:”.  In an ideal world, the name would be the same as provides—but in this case that isn’t always so.

For my purpose, I created overrides for all of my services in rc2.d with this script.  Note that the overrides are just copies of the content  from the /etc/init.d/ scripts:

cd /etc/rc2.d
# This is one long line; $f is filename, $p is the Provides value.
grep Provides * | cut -f1,3 -d: | tr -d : | while read f p; do perl -lne '$a++ if /BEGIN INIT INFO/; print if $a; $a-- if /END INIT INFO/' $f > /etc/insserv/overrides/$p;done

Note that the script writes the filename from the “Provides” field so you may need to change the filename if you have initscripts where /etc/init.d/script doesn’t match the Provides field.  Notably, Debian Wheezy does not follow this for ssh.  Provides is sshd, but the script is named ssh.

Next, I append sshd to the Require-Start line of all of my overrides:

cd /etc/insserv/overrides/
perl -i -lne 's/(Required-Start.*)$/$1 sshd/; print' *

This of course creates a cyclic dependency for ssh, so fix that one up by hand.  Feel free to make any other boot-order preferences while you’re in the overrides directory.  For this case, ssh  was made dependent on netplug.

Finally, run `insserv` and double-check that it did what you expected:

# cat /etc/init.d/.depend.start
TARGETS = rsyslog munin-node killprocs motd sysfsutils sudo netplug rsync ssh mysql openvpn ntp wd_keepalive apache2 bootlogs cron stop-readahead-fedora watchdog single rc.local rmnologin
INTERACTIVE =
netplug: rsyslog
rsync: rsyslog
ssh: rsyslog netplug
mysql: rsyslog ssh
ntp: rsyslog ssh
[...snip...]

Viola!  Now I can ssh to the host far earlier, and before services that can take a long time to start to troubleshoot in case of a problem.  In my opinion, ssh should always run directly after the network starts.

-Eric

 

Linux Kernel bug from 2002?

Really Old Bugs

Apparently there is a bug from kernels as old as 2.5.44 that pop up every so often causing hours of work for developers to hunt down.  Hopefully it can be fixed upstream, or maybe this is a “won’t fix” for some very good reason that I am unaware of:  http://osdir.com/ml/linux.enbd.general/2002-10/msg00176.html .  In my opinion, an issue like this should give some meaningful error rather than causing deadlock.

 

The fix

Basically add_disk (and therefore register_disk() where the problem actually resides) must be called *before* set_capacity() in Linux block device drivers.  This is backwards of the way I would think, as I would configure the device parameters before publishing it into userspace—but that is backwards in the Linux kernel and can (will?) cause deadlock.

Upstream

Recently I encountered this issue/bug in a zfs-git (zfsonlinux) build.  I’ve resolved the kernel hang and I’m working on a minimal patch for ZFS.  For now follow this ZFS ZVOL issue on github: https://github.com/zfsonlinux/zfs/issues/1488 .

Update: a pull request is pending here: https://github.com/zfsonlinux/zfs/pull/1491 and a patch has been listed on the issues page.

 

-Eric