shutdown all nodes of a multi-node cluster

lknite

Sep 27, 2024
I need to shut down my whole cluster in order to make some power supply changes.

There are three Proxmox servers.

I clicked on one Proxmox server at a time and selected Shutdown.

The first one shut down.

The other two seemed to be shutting down over time, but then I noticed that all the VMs had started back up (on those two servers).

I do have many VMs set up as highly available. Could this be why?

What's the easiest way to get everything fully shut down so I can replace a power supply?
 
Here are several links and some notes. I suppose this is skewed towards Ceph clusters, since that's what we had.
I had found instructions to disable HA:

systemctl stop pve-ha-crm pve-ha-lrm

However, at some point the nodes would not finish stopping these services (a quorum issue, I would guess), so that did not help. Instead I just shut down the VMs. If that is done from the PVE GUI, HA knows they are supposed to stay stopped.
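If you would rather do that from a shell than the GUI, something like the following might work. It is only a sketch: it assumes the guests are HA resources and uses ha-manager to request the stopped state for each one. The service IDs (vm:100, vm:101) are made up, so substitute your own, and by default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Dry run by default: each command is printed, not executed.
# Run with HA_MANAGER=ha-manager to actually send the requests.
HA_MANAGER="${HA_MANAGER:-echo ha-manager}"

# Ask HA to put each given service (e.g. vm:100) into the stopped state,
# so HA does not start it again elsewhere.
stop_ha_services() {
    for sid in "$@"; do
        $HA_MANAGER set "$sid" --state stopped || return 1
    done
}

# vm:100 and vm:101 are hypothetical service IDs.
stop_ha_services vm:100 vm:101
```

When powering back up, the same loop with --state started should bring them back.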

For Ceph it's a bit more complicated:

Bring down the clients (all VMs).

Once all clients, VMs and containers are off or not accessing the Ceph cluster anymore, verify that the Ceph cluster is in a healthy state.

Set Ceph OSD flags:
ceph osd set noout
ceph osd set nobackfill
ceph osd set norecover
ceph osd set norebalance
ceph osd set nodown
ceph osd set pause
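The health check and the flag commands above could be wrapped in a small script. This is just a sketch under a few assumptions: the CEPH variable and both function names are my own, the script defaults to a dry run that only prints the commands, and "healthy" is taken to mean that the output of ceph health starts with HEALTH_OK.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Run with CEPH=ceph to actually set the flags.
CEPH="${CEPH:-echo ceph}"

# Flags from the list above, in the order they are set.
CEPH_FLAGS="noout nobackfill norecover norebalance nodown pause"

# Succeeds only if the given `ceph health` output reports HEALTH_OK.
is_healthy() {
    case "$1" in
        HEALTH_OK*) return 0 ;;
        *) return 1 ;;
    esac
}

# Set every shutdown flag, stopping at the first failure.
set_shutdown_flags() {
    for flag in $CEPH_FLAGS; do
        $CEPH osd set "$flag" || return 1
    done
}

# On a real cluster one might gate on health first, for example:
#   is_healthy "$(ceph health)" || exit 1
set_shutdown_flags
```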

Per the PVE doc above:
"Start powering down your nodes without a monitor (MON). After these nodes are down, continue by shutting down nodes with monitors on them.
When powering on the cluster, start the nodes with monitors (MONs) first. Once all nodes are up and running, confirm that all Ceph services are up and running.
...You can now start up the guests."
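The part of the quote I left out is not reproduced here, but the flags set before shutdown do need to be cleared again once Ceph is back up, before starting the guests. A sketch of that, assuming they should be cleared in the reverse of the order they were set (check this against the PVE doc). As before, the CEPH variable and function name are my own, and it only prints the commands unless run with CEPH=ceph.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Run with CEPH=ceph to actually clear the flags.
CEPH="${CEPH:-echo ceph}"

# Same flags, in the order they were set before shutdown.
CEPH_FLAGS="noout nobackfill norecover norebalance nodown pause"

# Clear the flags in reverse order (pause first, noout last).
unset_shutdown_flags() {
    reversed=""
    for flag in $CEPH_FLAGS; do
        reversed="$flag $reversed"
    done
    for flag in $reversed; do
        $CEPH osd unset "$flag" || return 1
    done
}

unset_shutdown_flags
```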
 