[SOLVED] Which services are safe to restart on a running system?

UdoB · Jul 18, 2020

After software updates I always check whether any services require a restart:

Code:

~# needrestart     # take a look and then select CANCEL (or use "-b -l")
...
Restarting services...
Service restarts being deferred:
systemctl restart corosync.service
systemctl restart pve-firewall.service
systemctl restart pve-ha-crm.service
systemctl restart pve-ha-lrm.service
systemctl restart pvedaemon.service
systemctl restart pveproxy.service
systemctl restart pvestatd.service

I am totally unsure which of these commands will affect the currently running VMs. https://pve.proxmox.com/wiki/Service_daemons lists all of them. But what happens if I actually do "systemctl restart pvexxx"?
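
For a non-interactive look at the same list, the batch mode hinted at above ("-b -l") can be parsed. This is only a minimal sketch: it assumes batch output contains lines of the form "NEEDRESTART-SVC: <unit>", and the function name pending_services is made up for illustration.

```shell
#!/bin/sh
# Extract unit names from needrestart batch-mode output read on stdin.
# Assumption: pending services appear as "NEEDRESTART-SVC: <unit>" lines.
pending_services() {
    sed -n 's/^NEEDRESTART-SVC: *//p'
}

# Usage on a real host (not executed here):
#   needrestart -b -l | pending_services
```

This only lists candidates; it does not say which restarts are safe for running VMs.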

Currently some man pages (e.g. https://pve.proxmox.com/pve-docs/pve-ha-crm.8.html) only show the syntax and a description like this:

Code:

DESCRIPTION
This is the Cluster Resource Manager Daemon.

That is all. This could easily be enhanced with some more information: What does it do? What happens if I do a restart? Will it stop/start/cripple dependent services and/or running VMs?

Anyway - regarding PVE I am a happy home user. Thank you, Proxmox, for making this product freely available!

H4R0 · Jul 18, 2020

I have unattended security upgrades and automatic needrestart enabled. It works flawlessly, but I still would not recommend doing that on a production system. For a homelab it's fine.

However, if you run upgrades manually over the GUI and have automatic restart enabled, it will often kill the GUI, which in turn will kill your upgrade shell. So prefer to run the upgrade within screen/tmux, so that it exits successfully and does not interfere with needrestart.
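
A rough sketch of the tmux variant, under assumptions: the session name pve-upgrade, the function name upgrade_in_tmux and the DRY_RUN switch are all hypothetical, and the upgrade command is simply apt-get dist-upgrade.

```shell
#!/bin/sh
# Hypothetical helper: run the upgrade inside a named tmux session so that a
# dropped GUI or SSH shell does not abort it.
upgrade_in_tmux() {
    session="${1:-pve-upgrade}"
    cmd='apt-get update && apt-get dist-upgrade'
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only print the command instead of running it.
        echo "tmux new-session -s $session '$cmd'"
    else
        tmux new-session -s "$session" "$cmd"
    fi
}
```

If the connection drops while the upgrade is still running, "tmux attach -t pve-upgrade" from a new shell should bring the session back.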

VMs would only be affected by qemu directly.

I have upgraded almost every package by now, and no VM or container has been killed by it.

For example just today "libnss3" got upgraded automatically, mail report:

Unattended upgrade result: All upgrades installed

Packages that were upgraded:
libnss3

Package installation log:
Log started: 2020-07-18 06:47:08
---------------------------

--- Changes for nss (libnss3) ---
nss (2:3.42.1-1+deb10u3) buster-security; urgency=medium

* CVE-2019-17006 CVE-2019-17023 CVE-2020-12399 CVE-2020-12402

-- Moritz Mühlenhoff <jmm@debian.org> Wed, 08 Jul 2020 20:37:58 +0200

Preparing to unpack .../libnss3_2%3a3.42.1-1+deb10u3_amd64.deb ...
Unpacking libnss3:amd64 (2:3.42.1-1+deb10u3) over (2:3.42.1-1+deb10u2) ...
Setting up libnss3:amd64 (2:3.42.1-1+deb10u3) ...
Processing triggers for libc-bin (2.28-10) ...

Running kernel seems to be up-to-date.

Failed to check for processor microcode upgrades.

Restarting services...
systemctl restart pve-firewall.service pve-ha-crm.service pve-ha-lrm.service pvedaemon.service pveproxy.service pvestatd.service

No containers need to be restarted.

No user sessions are running outdated binaries.
Log ended: 2020-07-18 06:47:19

Faris Raouf · Jul 18, 2020

In a different post, I was told that after an update via the GUI, any services that needed a restart would be automatically restarted... hmmm...

UdoB · Jul 18, 2020

H4R0 said:
I have unattended security upgrades and automatic needrestart enabled. It works flawlessly, but I still would not recommend doing that on a production system. For a homelab it's fine.

Thanks for that information!

UdoB · Jul 18, 2020

Faris Raouf said:
In a different post, I was told that after an update via the GUI, any services that needed a restart would be automatically restarted... hmmm...

In that case I'll use the GUI more often now. As I have two machines (and a small third one for quorum) I should compare the results of GUI vs. apt-get after next updates...

Thanks for the hint!

Faris Raouf · Jul 18, 2020

Well, although I said GUI here, I didn't specify it in my original question. I don't think it should matter. All the GUI does is call apt-get dist-upgrade.
I think the implication was that if something was updated that needed to be restarted (e.g. the GUI itself or qemu-server or whatever), then it would be restarted by the post-update script in the package.

I'm sure this would not include libraries; their packages would not include scripts to restart anything that uses them, because that could cause chaos.

My original question was basically whether I needed to manually restart anything after an update - because I don't as a rule - and I wanted to make sure I wasn't fooling myself into thinking I was more secure than I really was.

UdoB · Jul 19, 2020

The result was not exactly what I hoped.

This is a four-host cluster. One is cold-standby. I did not modify "expected", so quorum expected is 3, and three machines are usually powered on. Probably I should specify "expected 2" in this situation, but I did not.

I was able to run

systemctl restart corosync.service pvedaemon.service pve-firewall.service pve-ha-crm.service pve-ha-lrm.service pvestatd.service

directly from "needrestart" on the first cluster member without problems. VMs kept running, no side effects.

When I ran the exact same command on my main host, it rebooted this one and immediately also the second production host!

These two form an HA group, so my current understanding is:

- I have no validated explanation for why the "needrestart" host rebooted
- my second HA host will reboot if the first one is unavailable for a few seconds

I hope this is true only because my quorum dropped below 3 (down to 2) for at least a few seconds during that "restart" command.
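
The arithmetic behind that guess can be sketched as follows, assuming the default of one vote per node and a simple-majority quorum of floor(votes / 2) + 1. The function names are hypothetical.

```shell
#!/bin/sh
# Majority quorum for a cluster in which every node has exactly one vote.
quorum_for() {
    echo $(( $1 / 2 + 1 ))
}

# Succeeds if the cluster is quorate.
# $1 = expected (configured) votes, $2 = votes currently online
is_quorate() {
    [ "$2" -ge "$(quorum_for "$1")" ]
}

quorum_for 4   # prints 3
```

With 4 configured nodes the quorum is 3, so with only 3 nodes online a single corosync restart leaves 2 votes until that node rejoins. That would be consistent with the watchdog-based self-fencing that Proxmox HA applies to nodes that lose quorum, but it does not prove that this is what happened here.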

This was done via an SSH console. Next time I will use the GUI and hope for the best - this is a homelab, nothing mission critical.

Best regards

H4R0 · Jul 19, 2020

UdoB said:
The result was not exactly what I hoped.

This is a four-host cluster. One is cold-standby. I did not modify "expected", so quorum expected is 3, and three machines are usually powered on. Probably I should specify "expected 2" in this situation, but I did not.

I was able to run systemctl restart corosync.service pvedaemon.service pve-firewall.service pve-ha-crm.service pve-ha-lrm.service pvestatd.service directly from "needrestart" on the first cluster member without problems. VMs kept running, no side effects.

When I ran the exact same command on my main host, it rebooted this one and immediately also the second production host!

These two form an HA group, so my current understanding is:

- I have no validated explanation for why the "needrestart" host rebooted

- my second HA host will reboot if the first one is unavailable for a few seconds

I hope this is true only because my quorum dropped below 3 (down to 2) for at least a few seconds during that "restart" command.

This was done via an SSH console. Next time I will use the GUI and hope for the best - this is a homelab, nothing mission critical.

Best regards

Sounds like corosync.service took too long to restart and triggered HA; better exclude it then.
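
One way to exclude it, sketched under assumptions: needrestart on Debian-based systems reads Perl snippets from /etc/needrestart/conf.d, and an override_rc entry with value 0 keeps matching units from being restarted automatically. The file name 50-no-corosync.conf and the function name are hypothetical; check the needrestart documentation of your version before relying on this.

```shell
#!/bin/sh
# Write a needrestart override that excludes corosync from restarts.
# $1 = config directory (assumed default: /etc/needrestart/conf.d)
write_corosync_override() {
    conf_dir="${1:-/etc/needrestart/conf.d}"
    # The written line is Perl; 0 means "do not restart matching units".
    printf '%s\n' '$nrconf{override_rc}{qr(^corosync)} = 0;' \
        > "$conf_dir/50-no-corosync.conf"
}

# Usage on a real host (as root, not executed here):
#   write_corosync_override
```

corosync would then have to be restarted by hand at a moment when a short loss of that node's vote is acceptable.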

UdoB · Aug 9, 2020

Well... today I had the hosts in my unmodified cluster (4 nodes, 1 offline, quorum = 3) waiting for a reboot. Two of the nodes form an HA group.

Just for my persistent memory: I did "Reboot" the third active node via the web GUI without any preparation - without decreasing "expected".

ALL THREE nodes rebooted

Actually I can live with that (Homelab). But I am still/again surprised. Todo: Next time I shall activate the fourth node first...
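
A pre-reboot check along these lines might have flagged the problem. It is a sketch that assumes "pvecm status" prints "Total votes:" and "Quorum:" lines as in the usual votequorum output; the function name safe_to_reboot_one is hypothetical.

```shell
#!/bin/sh
# Read "pvecm status" output on stdin and succeed only if the cluster would
# still be quorate after one more node (one vote) goes offline.
safe_to_reboot_one() {
    awk -F': *' '
        /^Total votes:/ { total = $2 + 0 }
        /^Quorum:/      { quorum = $2 + 0 }
        END { exit !(quorum > 0 && total - 1 >= quorum) }
    '
}

# Usage on a real node (not executed here):
#   pvecm status | safe_to_reboot_one && echo "one node can go down"
```

With 3 total votes and a quorum of 3 the check fails, which matches what happened; with the fourth node powered on first it would pass.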
