[SOLVED] Which services are safe to restart on a running system?

UdoB

After software updates I always check which services require a restart:
Code:
~# needrestart     # take a look and then select CANCEL (or use "-b -l")
...
Restarting services...
Service restarts being deferred:
systemctl restart corosync.service
systemctl restart pve-firewall.service
systemctl restart pve-ha-crm.service
systemctl restart pve-ha-lrm.service
systemctl restart pvedaemon.service
systemctl restart pveproxy.service
systemctl restart pvestatd.service

I am totally unsure which of these commands will affect the currently running VMs. https://pve.proxmox.com/wiki/Service_daemons lists all of them. But what happens if I actually do "systemctl restart pvexxx"?

Currently some man pages (e.g. https://pve.proxmox.com/pve-docs/pve-ha-crm.8.html) only show the syntax and a description like this:
Code:
DESCRIPTION
This is the Cluster Resource Manager Daemon.

That is all. This could easily be enhanced with some more information: What does it do? What happens if I restart it? Will it stop/start/cripple dependent services and/or running VMs?


Anyway - regarding PVE I am a happy home user. Thank you, Proxmox, for making this product freely available!
 
I have unattended security upgrades and automatic needrestart enabled. It works flawlessly, but I still would not recommend doing that on a production system. For a homelab it's fine.

However, if you run upgrades manually through the GUI and have automatic restart enabled, it will often kill the GUI, which in turn kills your upgrade shell. So prefer to run the upgrade inside screen/tmux, so that it exits successfully and does not interfere with needrestart.
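A minimal sketch of that advice, assuming a POSIX shell: tmux exports TMUX and GNU screen exports STY into the sessions they start, so a small guard can refuse to upgrade from a bare shell. The function names `in_multiplexer` and `safe_upgrade` are made up for illustration.

```shell
# Returns 0 when the current shell runs inside tmux or GNU screen.
# tmux sets TMUX and screen sets STY in the sessions they start.
in_multiplexer() {
    [ -n "${TMUX:-}" ] || [ -n "${STY:-}" ]
}

# Hypothetical wrapper: only upgrade when a terminal multiplexer
# protects the session from a dying GUI shell or SSH connection.
safe_upgrade() {
    if in_multiplexer; then
        apt-get update && apt-get dist-upgrade
    else
        echo "Not inside tmux/screen. Start one first, e.g.: tmux new -s upgrade" >&2
        return 1
    fi
}
```

Run `safe_upgrade` as root from inside the tmux or screen session; outside of one it only prints the hint and returns 1.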

VMs would only be affected by qemu directly.

By now I have upgraded almost every package, and no VM or container has been killed by it.


For example just today "libnss3" got upgraded automatically, mail report:
Unattended upgrade result: All upgrades installed

Packages that were upgraded:
libnss3

Package installation log:
Log started: 2020-07-18 06:47:08
---------------------------

--- Changes for nss (libnss3) ---
nss (2:3.42.1-1+deb10u3) buster-security; urgency=medium

* CVE-2019-17006 CVE-2019-17023 CVE-2020-12399 CVE-2020-12402

-- Moritz Mühlenhoff <jmm@debian.org> Wed, 08 Jul 2020 20:37:58 +0200

Preparing to unpack .../libnss3_2%3a3.42.1-1+deb10u3_amd64.deb ...
Unpacking libnss3:amd64 (2:3.42.1-1+deb10u3) over (2:3.42.1-1+deb10u2) ...
Setting up libnss3:amd64 (2:3.42.1-1+deb10u3) ...
Processing triggers for libc-bin (2.28-10) ...

Running kernel seems to be up-to-date.

Failed to check for processor microcode upgrades.

Restarting services...
systemctl restart pve-firewall.service pve-ha-crm.service pve-ha-lrm.service pvedaemon.service pveproxy.service pvestatd.service

No containers need to be restarted.

No user sessions are running outdated binaries.
Log ended: 2020-07-18 06:47:19
 
In a different post, I was told that after an update via the GUI, any services that needed a restart would be restarted automatically... hmmm...

In that case I'll use the GUI more often now. As I have two machines (and a small third one for quorum) I should compare the results of GUI vs. apt-get after next updates...

Thanks for the hint!
 
Well, although I said GUI here, I didn't specify it in my original question. I don't think it should matter. All the GUI does is call apt-get dist-upgrade.
I think the implication was that if something was updated that needed to be restarted (e.g. the GUI itself or qemu-server or whatever), then it would be restarted by the post-update script in the package.

I'm sure this would not include libraries, which would not include scripts to restart anything that uses them because it could cause chaos.

My original question was basically whether I needed to manually restart anything after an update - because I don't as a rule - and I wanted to make sure I wasn't fooling myself into thinking I was more secure than I really was.
 
The result was not exactly what I hoped.

This is a four-host cluster. One host is a cold standby. I did not modify "expected", so the required quorum is 3, and three machines are usually powered on. Probably I should specify "expected 2" in this situation, but I did not.
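The arithmetic behind those numbers can be sketched like this, assuming the default majority rule (quorum = floor(expected / 2) + 1) and one vote per node. Both function names are hypothetical helpers, not part of any Proxmox tool.

```shell
# Majority quorum for a given number of expected votes: floor(n/2) + 1.
quorum_for() {
    echo $(( $1 / 2 + 1 ))
}

# Returns 0 if the cluster stays quorate after losing one vote.
# $1 = expected votes, $2 = votes currently online
survives_one_loss() {
    [ $(( $2 - 1 )) -ge "$(quorum_for "$1")" ]
}

# Usage:
#   quorum_for 4             # four expected votes need a quorum of 3
#   survives_one_loss 4 3    # fails: 2 remaining votes are below 3
```

With four expected votes and only three nodes powered on, there is no spare vote: as soon as one node drops out, even briefly, the remaining two are below quorum. As far as I understand it, HA-enabled nodes that lose quorum fence themselves, which would show up as a reboot.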

I was able to run systemctl restart corosync.service pvedaemon.service pve-firewall.service pve-ha-crm.service pve-ha-lrm.service pvestatd.service directly from "needrestart" on the first cluster member without problems. VMs kept running, no side effects.

When I ran the exact same command on my main host, it rebooted that host and immediately afterwards the second production host as well!

These two form an HA group, so my current understanding is:
  • I have no validated explanation for why the "needrestart" host rebooted
  • my second HA host will reboot if the first one is unavailable for a few seconds
I hope this is true only because my quorum dropped below 3 (down to 2) for at least a few seconds during that "restart" command.

This was done via an SSH console. Next time I will use the GUI and hope for the best - this is a homelab, nothing mission critical :cool:

Best regards
 
Sounds like corosync.service took too long to restart and triggered HA, so better exclude it then.
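One way to do that exclusion by hand, as a sketch: take the batch output of needrestart, drop corosync, and restart only the rest. This assumes batch mode prints one `NEEDRESTART-SVC: <unit>` line per affected service, so check that against your needrestart version first. The function name `restartable_units` is made up.

```shell
# Reads needrestart batch output on stdin and prints the units that are
# not excluded. Assumes lines of the form "NEEDRESTART-SVC: unit".
# $1 = optional extended regex of units to skip (default: corosync only)
restartable_units() {
    exclude_re=${1:-'^corosync\.service$'}
    sed -n 's/^NEEDRESTART-SVC: *//p' | grep -Ev "$exclude_re"
}

# Usage (review the list before actually restarting anything):
#   needrestart -b -l | restartable_units
#   needrestart -b -l | restartable_units | xargs -r systemctl restart
```

The exclude pattern is a parameter so that further units can be skipped without editing the function, e.g. `restartable_units '^(corosync|pve-ha-.*)\.service$'`.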
 
Well... today the hosts in my unmodified cluster (4 nodes, 1 offline, quorum = 3) were waiting for a reboot. Two of the nodes form an HA group.

Just for my own records: I rebooted the third active node via the web GUI without any preparation, i.e. without decreasing "expected".

ALL THREE nodes rebooted.

Actually I can live with that (homelab). But I am still/again surprised. To do: next time I will power on the fourth node first...
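A small pre-reboot check along those lines, as a sketch only: it assumes `pvecm status` prints `Total votes:` and `Quorum:` lines in its votequorum section, and the function name `votes_to_spare` is hypothetical.

```shell
# Reads `pvecm status` output on stdin and prints how many votes the
# cluster can lose before dropping below quorum (total votes - quorum).
# Returns 1 if either line is missing from the input.
votes_to_spare() {
    awk -F': *' '
        /^Total votes:/ { total = $2 + 0 }
        /^Quorum:/      { quorum = $2 + 0 }
        END {
            if (total == "" || quorum == "") exit 1
            print total - quorum
        }
    '
}

# Usage:
#   pvecm status | votes_to_spare
```

If this prints 0, rebooting any node takes the cluster below quorum. In that case either power on the fourth node first or lower the expected votes beforehand, for example with `pvecm expected 3` while three nodes are online.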
 
