3-node cluster, more or less up to date. I upgraded the first server, srv1, after migrating all VMs to srv2, then rebooted srv1. So far so good.
While migrating all VMs from srv2 back to srv1 I was too hasty and started `apt upgrade` on srv2 before the migration had finished. Now I can't tell whether the migration is still in progress or not; at least nothing visibly happens.
The VMs are up and running.
I killed apt/dpkg (yes, I should have avoided that ...) and then ran:
Bash:
root@srv2:~# dpkg --configure -a
Setting up pve-ha-manager (4.0.7) ...
watchdog-mux.service is a disabled or a static unit, not starting it.
It has been sitting there for a while now.
The cluster status itself looks good:
Bash:
# pvecm status
Cluster information
-------------------
Name: xyz
Config Version: 11
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Tue Apr 15 10:04:43 2025
Quorum provider: corosync_votequorum
Nodes: 3
Node ID: 0x00000005
Ring ID: 4.1e0
Quorate: Yes

Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 3
Quorum: 2
Flags: Quorate

Membership information
----------------------
Nodeid      Votes  Name
0x00000004  1      172.31.31.201
0x00000005  1      172.31.31.202 (local)
0x00000006  1      172.31.31.203

# ha-manager status
quorum OK
master srv1 (active, Tue Apr 15 10:03:29 2025)
lrm srv1 (active, Tue Apr 15 10:03:30 2025)
lrm srv2 (restart mode, Tue Apr 15 10:03:30 2025)
lrm srv3 (active, Tue Apr 15 10:03:31 2025)
service ct:107 (srv1, stopped)
service ct:109 (srv1, stopped)
service vm:100 (srv1, stopped)
service vm:103 (srv2, freeze)
service vm:108 (srv2, migrate)
service vm:110 (srv3, started)
service vm:111 (srv3, started)
[..]
But nothing moves ;-) ... For example, I tried to shut down / stop VM 103, and nothing happens.
What should I do here? Thanks for pointers.
My defensive approach would be to wait until the evening, when customers aren't accessing their VMs, and then reboot that node.
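To keep an eye on which services are actually stuck, the `ha-manager status` output can be filtered down to everything that is neither `started` nor `stopped`. This is only a rough sketch: the function name `stuck_services` is made up for illustration, and it assumes the service lines always have the `service <id> (<node>, <state>)` shape shown in the output above.

```shell
#!/bin/sh
# Print HA services whose state is neither "started" nor "stopped".
# Reads `ha-manager status` output on stdin.
# Note: stuck_services is a hypothetical helper, not part of Proxmox.
stuck_services() {
    # Service lines are assumed to look like: service vm:103 (srv2, freeze)
    awk '$1 == "service" {
        node = $3; state = $4
        gsub(/[(,]/, "", node)   # "(srv2," -> "srv2"
        gsub(/[)]/, "", state)   # "freeze)" -> "freeze"
        if (state != "started" && state != "stopped")
            print $2, node, state
    }'
}

# On a cluster node: ha-manager status | stuck_services
```

For the service lines shown in the status output above, this would list `vm:103 srv2 freeze` and `vm:108 srv2 migrate`. It only reads the status and changes nothing, so it should be safe to run repeatedly while waiting.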