VM crash after update

Hi everyone, I'm in trouble. I updated from pve-manager/8.3.5/dac3aa88bac3f300 to pve-manager/8.4.1/2a5fa54a8503f96d and then rebooted as requested. The VMs migrated away, but when they return to the updated host they don't start, and if I try to restart them they still don't start. What could have happened? Please help me, I'm ruined.

In the HA status I see on the updated node: wait_for_agent_lock
 
root@pve3:~# ha-manager status
quorum OK
master pve2 (old timestamp - dead?, Sun Apr 20 00:35:14 2025)
lrm pve1 (active, Sun Apr 20 02:55:50 2025)
lrm pve2 (active, Sun Apr 20 02:55:58 2025)
lrm pve3 (wait_for_agent_lock, Sun Apr 20 02:55:58 2025)
service vm:103 (pve1, started)
service vm:105 (pve1, started)
service vm:106 (pve2, started)
service vm:107 (pve2, started)
service vm:109 (pve1, started)
service vm:110 (pve2, started)
service vm:111 (pve1, started)
service vm:112 (pve2, started)
service vm:113 (pve1, started)
service vm:114 (pve2, started)
service vm:115 (pve1, started)
service vm:116 (pve2, started)
service vm:117 (pve1, started)
service vm:118 (pve2, started)
service vm:119 (pve1, started)
service vm:120 (pve2, started)
service vm:121 (pve1, started)
service vm:122 (pve3, fence)
service vm:128 (pve1, started)
service vm:129 (pve1, started)
service vm:130 (pve3, fence)
service vm:131 (pve1, started)
service vm:132 (pve2, started)
service vm:133 (pve3, fence)
service vm:134 (pve3, fence)
service vm:135 (pve3, fence)
service vm:139 (pve1, started)
service vm:141 (pve1, started)
service vm:144 (pve2, started)
service vm:145 (pve2, started)
service vm:146 (pve3, fence)
service vm:147 (pve2, started)
service vm:148 (pve2, started)
service vm:152 (pve1, started)
service vm:155 (pve3, deleting)

root@pve3:~# pvecm status
Cluster information
-------------------
Name: ETSSISTEMI
Config Version: 3
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Sun Apr 20 02:55:59 2025
Quorum provider: corosync_votequorum
Nodes: 3
Node ID: 0x00000003
Ring ID: 1.242
Quorate: Yes

Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 3
Quorum: 2
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 192.168.21.151
0x00000002 1 192.168.21.152
0x00000003 1 192.168.21.153 (local)
root@pve3:~#

The VMs that are in the fence state on the pve3 node do not start. How do I change their state?

I tried to remove VM 155 from HA, but it gives me an error and doesn't delete it.
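
For reference, a minimal sketch of the usual ha-manager calls for this (using the resource IDs from the status output above; whether they succeed depends on the LRM state of pve3 while it is still being fenced):

ha-manager remove vm:155                # drop the resource from HA management entirely
ha-manager set vm:122 --state started   # request a state change for a stuck resource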
 
I assume you have shared storage? Maybe you can try to force a shutdown of pve3, wait for more than 2 minutes, and see whether the VMs get started on the remaining nodes or not.
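
A simple way to watch whether that happens (a sketch, run on one of the surviving nodes, assuming the standard PVE HA services):

watch -n 5 ha-manager status    # refresh the HA resource states every 5 seconds
journalctl -fu pve-ha-crm       # follow the CRM log for fencing and recovery messages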
 
The problem was caused by the fact that, at the time I rebooted, there were too many VMs to migrate and it had not finished migrating them all before the reboot. How can I avoid this problem? Is there some timeout I can change?
 
Turn on maintenance mode on the PVE host and the VMs/LXCs with an HA definition will auto-migrate; once that is done, reboot your server. After it's up again, disable maintenance mode on the node.
 
Unfortunately there is no maintenance on/off button in the UI yet, but hopefully it will come in a future release, so it has to be done via the CLI until then.
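
A minimal sketch of that CLI workflow, assuming a node named pve3 as in this thread (the node-maintenance subcommand is available on current PVE 8.x):

ha-manager crm-command node-maintenance enable pve3    # HA-managed guests are migrated off the node
ha-manager status                                      # confirm the LRM reports maintenance and the services have moved
# ... update and reboot the node ...
ha-manager crm-command node-maintenance disable pve3   # allow services to be scheduled on the node again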