This post may seem like it should be blatantly obvious, but in the last month alone I’ve heard of numerous people using screen and/or cron to keep daemons alive. Worse, there are even semi-official guides out there that still recommend it, even in this day and age.
Now, don’t get me wrong, screen is absolutely awesome (though I prefer the newer tmux) for what it’s meant to do, such as multiplexing terminals and providing window management of those terminals. But it is not designed to control and watch over your daemon processes. For example, it doesn’t manage your logfiles, it won’t respawn a crashed program and it’s not going to come up by itself after a reboot, either.
Thankfully, there are plenty of modern tools designed specifically for this purpose.
Supervisord is a “client/server system that allows its users to monitor and control a number of processes on UNIX-like operating systems.” It’s a daemon that’s started like all the others on your system by init, but in turn manages other processes for you through simple configuration files, which look like this:
    # /etc/supervisor/conf.d/err.conf
    [program:err]
    directory=/home/err/repository
    command=/home/err/virtualenv/bin/python /home/err/repository/scripts/err.py --config "/home/err" --xmpp
    autostart=true
    autorestart=true
    startsecs=10
    stopwaitsecs=60
    redirect_stderr=true
    stdout_logfile=/var/log/supervisor/err.log
    stderr_logfile=None
    stdout_logfile_maxbytes=150MB
    stdout_logfile_backups=0
    user=err
    environment=HOME=/home/err,USER=err
It provides a command-line program to manage the programs under its control, which you can see below, as well as many other features, including an API in case you want to integrate it with other systems.
    $ supervisorctl status
    devpi-server                     RUNNING    pid 1517, uptime 19 days, 1:43:17
    err                              RUNNING    pid 1503, uptime 19 days, 1:43:17
    munin-fcgi-graph                 RUNNING    pid 1504, uptime 19 days, 1:43:17
    munin-fcgi-html                  RUNNING    pid 1495, uptime 19 days, 1:43:17
    $ supervisorctl restart devpi-server
    devpi-server: stopped
    devpi-server: started
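The API mentioned above is XML-RPC, so Python's standard library can talk to it. Here is a rough sketch, assuming supervisord has an `[inet_http_server]` section listening on localhost port 9001 (that address is my assumption, and the HTTP server is not enabled unless you configure it). The helper names `summarize` and `fetch_processes` are made up for illustration.

```python
import xmlrpc.client

# Assumed address: needs an [inet_http_server] section in supervisord.conf
# listening on this port. Adjust to match your own setup.
SUPERVISOR_URL = "http://localhost:9001/RPC2"


def summarize(processes):
    """Turn supervisor process-info dicts into 'name STATE' lines."""
    return ["{0} {1}".format(p["name"], p["statename"]) for p in processes]


def fetch_processes(url=SUPERVISOR_URL):
    """Ask a running supervisord for info about all of its processes."""
    server = xmlrpc.client.ServerProxy(url)
    # getAllProcessInfo() returns one struct per managed process.
    return server.supervisor.getAllProcessInfo()
```

With a supervisord instance reachable at that URL, `summarize(fetch_processes())` should give roughly the same picture as `supervisorctl status`, just in a form another program can consume.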
Upstart is “an event-based replacement for the /sbin/init daemon which handles starting of tasks and services during boot, stopping them during shutdown and supervising them while the system is running. It was originally developed for the Ubuntu distribution, but is intended to be suitable for deployment in all Linux distributions as a replacement for the venerable System-V init.”
It’s one of many init replacements, systemd and OpenRC being some other examples. I’m an Ubuntu user, so for me using upstart makes a lot of sense in my infrastructure. I use upstart rather than supervisor when it comes to system-level daemons, especially those where it’s nice to be able to control where in the boot process they get started. Upstart scripts are just as simple to write as supervisor entries, though.
    # /etc/init/uwsgi-emperor.conf
    description "uWSGI Emperor"
    start on runlevel
    stop on runlevel
    respawn
    respawn limit 10 5

    pre-start script
        [ -e /var/run/uwsgi-emperor ] || mkdir /var/run/uwsgi-emperor
        chmod 1777 /var/run/uwsgi-emperor
    end script

    exec uwsgi --logto /var/log/uwsgi-emperor.log --log-date --thunder-lock --die-on-term --emperor /etc/uwsgi/apps-enabled
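The `respawn limit 10 5` line is worth a second look: it tells upstart to give up on a job that gets respawned more than 10 times within 5 seconds, instead of restarting a hopelessly broken program forever. To make the idea concrete, here is a simplified sliding-window model of that rule in Python. It is only an illustration of the concept, not upstart's actual implementation, and the `RespawnLimiter` class is a hypothetical name of my own.

```python
import collections


class RespawnLimiter:
    """Decide whether a job has respawned too often.

    Models the idea behind "respawn limit COUNT INTERVAL": stop trying
    once the job is respawned more than COUNT times in INTERVAL seconds.
    """

    def __init__(self, count=10, interval=5.0):
        self.count = count
        self.interval = interval
        self._stamps = collections.deque()

    def allow(self, now):
        """Record a respawn at time `now`; return False once over the limit."""
        self._stamps.append(now)
        # Forget respawns that have fallen out of the time window.
        while self._stamps and now - self._stamps[0] > self.interval:
            self._stamps.popleft()
        return len(self._stamps) <= self.count
```

A supervisor loop would call `allow(time.monotonic())` each time the child dies and only restart it while the answer is true. Crashes spread far apart never trip the limit; a tight crash loop does so almost immediately.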
The two I highlighted here are merely the tip of the iceberg. There’s also monit, daemontools, circus and runit, just to name a few. What matters is that you use one of the many tools designed specifically for this purpose, rather than hacking together fragile solutions with tmux, screen or cron.