Welcome to End Point’s blog


Starting processes at boot under SELinux

There are a few common ways to start processes at boot time in Red Hat Enterprise Linux 5 (and thus also CentOS 5):

  1. Standard init scripts in /etc/init.d, which are used by all standard RPM-packaged software.
  2. Custom commands added to the /etc/rc.local script.
  3. @reboot cron jobs (a vixie-cron feature; see `man 5 crontab`. Some other cron implementations don't support it).

Custom standalone init scripts in /etc/init.d are hard to tell apart from RPM-managed ones (there's no separation analogous to /usr/local vs. /usr), so in most of our hosting we've avoided them unless we're packaging the software as RPMs.

rc.local and @reboot cron jobs seemed roughly equivalent: crond starts at #90 in the boot order and the local script at #99, so both run after other system services such as PostgreSQL and MySQL have already started.

To start processes as various users we've typically put `su - $user -c "$command"` lines, in the desired order, in /etc/rc.local. This was mostly for the convenience of seeing in one place everything that would be started at boot time. However, under SELinux this runs the processes in the init_t context, which usually prevents them from working properly.
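A sketch of that rc.local pattern (the usernames and command paths here are hypothetical):

```shell
#!/bin/sh
# /etc/rc.local -- runs last in the boot sequence (S99local).
# Start per-user daemons in a fixed, visible order.

su - appuser -c "/home/appuser/bin/start-app"
su - reportd -c "/home/reportd/bin/report-daemon --detach"

# Under SELinux, these children inherit rc.local's init_t context,
# which typically denies operations a login shell would be allowed.
```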

The cron @reboot jobs don't have that SELinux context problem and work fine, just as if run from a login shell, so now we're using those. They have the added advantage that regular users can edit their own cron jobs without system administrator intervention.
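For illustration, a hypothetical @reboot entry in a user's crontab, installed with `crontab -e` by the user, with no root involvement:

```shell
# In appuser's crontab (crontab -e). Runs once, shortly after crond
# starts at boot, in appuser's normal context rather than init_t.
@reboot /home/appuser/bin/start-app >> /home/appuser/log/boot.log 2>&1
```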


carl said...

From what I understand, the problem with @reboot cron entries is that they all fire at once. For example, during the recent reboots of some of our in-house servers, many accounts firing up intensive daemon processes at the same time put quite a load on the servers. Ideally, one should use some kind of Highlander ("there can only be one!") mechanism to work around this issue. In Perl I have done this by attempting to get a non-blocking exclusive lock on a file with flock, then executing my highlander code, and finally releasing the lock. If an exclusive lock cannot be obtained, sleep for a random number of seconds, then try again.
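That technique translates to shell as well; here is a minimal sketch using util-linux flock(1) rather than Perl's flock (the lock file path and retry window are assumptions):

```shell
#!/bin/bash
# "Highlander" serialization: only one @reboot job runs at a time.
LOCKFILE=/var/tmp/reboot-jobs.lock   # must be writable by every user involved

exec 9>>"$LOCKFILE"                  # open the shared lock file on fd 9
until flock -n 9; do                 # try a non-blocking exclusive lock
    sleep $(( RANDOM % 30 + 1 ))     # held elsewhere: wait 1-30s, retry
done

# ... start the intensive daemon here ...
# The lock is released automatically when fd 9 closes at exit.
```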

Jon Jensen said...

Carl, good point. The case I was describing involved only one user's crontab, so there was no difference. But if there were many crontabs under different users and they really did all fire at the same time, you're right that that'd bog things down.

Locking across different users would require coordinating the file name and ownership of the lock file, and as you note, some retrying.

Perhaps a simpler way, in bash, would be to `sleep $(( $RANDOM / 1000 ))` before starting a process, which would pause for a random 0-32 seconds and should give some breathing room.
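A sketch of that staggering in a crontab entry (the command path is hypothetical, and this assumes cron's /bin/sh is really bash, as on RHEL, since $RANDOM is a bash feature):

```shell
# Pause a random 0-32 seconds before starting, so @reboot jobs from
# different accounts don't all hit the disk at the same moment.
@reboot sleep $(( $RANDOM / 1000 )); /home/appuser/bin/start-app
```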

Either way there's no escaping that a sysadmin needs to take a high-level view of what's going on at reboot time, so thanks for pointing that out.