Systemd – find what’s wrong with systemctl

True: all the last changes in Linux distro didn’t make me really really happy.
I still like to use init.d to start a process (it took me a while to get used to service yourservice status syntax) and so.

Anyway, the main big ones don’t seem to look back, and we need to get used to this 🙂

I have few raspberry PIs at home, and I’ve noticed that after a restart I was experiencing different weird behaviours. The main two:

  • stuck and not rebooting
  • receiving strange logrotate email alerts (e.g. /etc/cron.daily/logrotate:
    gzip: stdin: file size changed while zipping)

I tried to ignore them, but when you issue a reboot from a remote place and it doesn’t reboot, you understand that you should start to check what’s going on, instead of just unplug-replug your PI.

 

And here the discovery: systemctl

This magic command was able to show me the processes with issues, and slowly find out what was wrong with logrotate or my reboot. Or, better, I have realised that after fixing what was marked as failed, I didn’t experience any weird behaviors.

So, here few steps that I’d like to share – to help maybe someone else in the future, myself included – as I tend to forget things if I don’t use them 🙂

To check if your system is healthy or not:

systemctl status

Output should return “running”. If you get “degraded”, well, there is definitely something wrong.

Use the following to check what has failed:

systemctl --failed

Now, investigate those specific processes. Try to analyse their status and logs or literally try to restart them to see live what is the error:

systemctl status <broken_service>

journalctl _PID=<PID_of_broken_service>

tail /var/log/<broken_service>

systemctl restart <broken_service>

 

After fixing all, I tried to reboot few times and after I was checking again the overall status to make sure it was “running”.

In my case, I had few issues with “systemd-modules-load.service”. This probably related to my dist-upgrade. Some old and no longer existing modules were still listed in /etc/modules and, of course, the service wasn’t able to load them, miserably failing.
I’ve tested each module using modprobe <module_name> and I’ve commented out the ones where failing. Restarted and voila`, status… running!

On another PI I had some issues with Apache, but I can’t remember how I fixed it. Still, the goal of this post is mostly make everyone aware that systemctl can give you some interesting info about the system and you can focus your energies on the failed services.

I admit in totally honesty that I have no much clue why after fixing these failed services, all issues disappeared. In fact, the reboot wasn’t affecting one PI with the same non-existing modules listed, but it was stopping another one during the boot. Again, I could probably troubleshoot further but I have a life to live as well 🙂

 

Sources: