Hello,
Our nodes were rebooted after the latest PVE kernel upgrades. We ran into some problems migrating VMs (they got stuck migrating), but we worked around them.
Yesterday we needed to reboot a node again, and I noticed that one VM still showed the migrating icon. After waiting an hour or so with the VM still shown as migrating, I went ahead and rebooted the node. When the node came back up, the VM was greyed out and in an error state.
Next I tried starting it, which failed. This was in the task viewer:
Code:
Requesting HA start for VM 125
service 'vm:125' in error state, must be disabled and fixed first
TASK ERROR: command 'ha-manager set vm:125 --state started' failed: exit code 255
- So I tried removing it from HA
- and then starting it:
Code:
TASK ERROR: VM is locked (migrate)
- From the CLI, I tried to unlock it. [I wish unlock were available in the PVE web interface.]
Code:
root@pve4:[~]:# pct unlock 125
Configuration file 'nodes/pve4/lxc/125.conf' does not exist
That is bad; it has never happened before.
So the config for 125 is missing.
Is it on another node? I checked with this:
Code:
root@pve4:[~]:# l /etc/pve/nodes/*/lxc/
/etc/pve/nodes/pve10/lxc/:
/etc/pve/nodes/pve11/lxc/:
122.conf 128.conf 501.conf 605.conf 7012.conf
/etc/pve/nodes/pve13/lxc/:
/etc/pve/nodes/pve14/lxc/:
/etc/pve/nodes/pve15/lxc/:
/etc/pve/nodes/pve2/lxc/:
105.conf 109.conf 110.conf 119.conf 121.conf 2123.conf 502.conf 602.conf 603.conf
/etc/pve/nodes/pve3/lxc/:
/etc/pve/nodes/pve4/lxc/:
113.conf 120.conf 160.conf 606.conf
/etc/pve/nodes/pve5/lxc/:
/etc/pve/nodes/pve6/lxc/:
/etc/pve/nodes/pve7/lxc/:
107.conf 607.conf
Nope, 125.conf is not on another node.
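The listing above only covers the `lxc` directories, while the `locate` output further down shows every copy of 125.conf under `qemu-server`. Here is a minimal sketch for checking both guest types on all nodes in one pass. The function name `find_guest_conf` and its optional second argument are made up for illustration; the default root is the `/etc/pve/nodes` path used above.

```shell
#!/bin/sh
# find_guest_conf VMID [NODES_ROOT]
# Hypothetical helper: print every config file for VMID found under
# NODES_ROOT, looking in both lxc/ (containers) and qemu-server/ (VMs)
# on every node. NODES_ROOT defaults to /etc/pve/nodes.
find_guest_conf() {
    vmid=$1
    root=${2:-/etc/pve/nodes}
    # An unmatched glob stays a literal pattern, so test for existence.
    for f in "$root"/*/lxc/"$vmid".conf "$root"/*/qemu-server/"$vmid".conf; do
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}
```

On the node this would be called as `find_guest_conf 125`. No output would mean that no node currently holds a config for that ID in either directory.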
So I tried restoring from a backup.
1. Find backups:
Code:
root@pve4:[~]:# locate 125.conf
/etc/pve/nodes/pve7/qemu-server/125.conf
/rsnapshot-pve/daily.0/localhost/etc/pve/nodes/pve4/qemu-server/125.conf
/rsnapshot-pve/daily.1/localhost/etc/pve/nodes/pve7/qemu-server/125.conf
/rsnapshot-pve/daily.2/localhost/etc/pve/nodes/pve7/qemu-server/125.conf
/rsnapshot-pve/daily.3/localhost/etc/pve/nodes/pve7/qemu-server/125.conf
/rsnapshot-pve/daily.4/localhost/etc/pve/nodes/pve7/qemu-server/125.conf
/rsnapshot-pve/daily.5/localhost/etc/pve/nodes/pve7/qemu-server/125.conf
/rsnapshot-pve/hourly.0/localhost/etc/pve/nodes/pve4/qemu-server/125.conf
/rsnapshot-pve/hourly.1/localhost/etc/pve/nodes/pve4/qemu-server/125.conf
/rsnapshot-pve/hourly.10/localhost/etc/pve/nodes/pve4/qemu-server/125.conf
/rsnapshot-pve/hourly.2/localhost/etc/pve/nodes/pve4/qemu-server/125.conf
/rsnapshot-pve/hourly.3/localhost/etc/pve/nodes/pve4/qemu-server/125.conf
/rsnapshot-pve/hourly.4/localhost/etc/pve/nodes/pve4/qemu-server/125.conf
/rsnapshot-pve/hourly.5/localhost/etc/pve/nodes/pve4/qemu-server/125.conf
/rsnapshot-pve/hourly.6/localhost/etc/pve/nodes/pve4/qemu-server/125.conf
/rsnapshot-pve/hourly.7/localhost/etc/pve/nodes/pve4/qemu-server/125.conf
/rsnapshot-pve/hourly.8/localhost/etc/pve/nodes/pve4/qemu-server/125.conf
/rsnapshot-pve/hourly.9/localhost/etc/pve/nodes/pve4/qemu-server/125.conf
/rsnapshot-pve/monthly.0/localhost/etc/pve/nodes/pve7/qemu-server/125.conf
/rsnapshot-pve/monthly.1/localhost/etc/pve/nodes/pve7/qemu-server/125.conf
/rsnapshot-pve/monthly.2/localhost/etc/pve/nodes/pve7/qemu-server/125.conf
/rsnapshot-pve/weekly.0/localhost/etc/pve/nodes/pve7/qemu-server/125.conf
/rsnapshot-pve/weekly.1/localhost/etc/pve/nodes/pve7/qemu-server/125.conf
/rsnapshot-pve/weekly.2/localhost/etc/pve/nodes/pve7/qemu-server/125.conf
/rsnapshot-pve/weekly.3/localhost/etc/pve/nodes/pve7/qemu-server/125.conf
The 'locate' database is out of date; it normally updates once a day.
Code:
/etc/pve/nodes/pve7/qemu-server/125.conf does not exist
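Since the `locate` database can lag behind the filesystem, it may help to filter its output down to paths that still exist. This is a small sketch; `existing_only` is a hypothetical helper name, not a standard tool.

```shell
#!/bin/sh
# existing_only: read one path per line on stdin and print only the
# paths that still exist on disk. Hypothetical helper for filtering
# stale entries out of `locate` output.
existing_only() {
    while IFS= read -r p; do
        [ -e "$p" ] && printf '%s\n' "$p"
    done
    return 0
}

# Intended use on the node (not executed here):
#   locate 125.conf | existing_only
```

Running `updatedb` before `locate` should also bring the database up to date, at the cost of a full rescan.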
This one is recent:
Code:
root@pve4:[~]:# ll /rsnapshot-pve/daily.0/localhost/etc/pve/nodes/pve4/qemu-server/125.conf
-rw-r----- 13 root www-data 429 Mar 23 19:35 /rsnapshot-pve/daily.0/localhost/etc/pve/nodes/pve4/qemu-server/125.conf
So I copied the backup:
Code:
root@pve4:[~]:# cp /rsnapshot-pve/daily.0/localhost/etc/pve/nodes/pve4/qemu-server/125.conf /etc/pve/nodes/pve4/lxc/125.conf
cp: cannot create regular file '/etc/pve/nodes/pve4/lxc/125.conf': File exists
'File exists'? Yet it does not show up in ls:
Code:
root@pve4:[~]:# ls -l /etc/pve/nodes/pve4/lxc
total 2
-rw-r----- 1 root www-data 278 Mar 26 02:00 113.conf
-rw-r----- 1 root www-data 336 Mar 26 02:01 120.conf
-rw-r----- 1 root www-data 584 Mar 26 02:02 160.conf
-rw-r----- 1 root www-data 1225 Mar 26 02:03 606.conf
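One detail that stands out in the output above: every copy of 125.conf found by `locate` sits under a `qemu-server` directory, while both `pct unlock` and the `cp` target use `lxc`. A possible explanation for the 'File exists' error, which is an assumption and not confirmed by anything shown here, is that the cluster filesystem behind /etc/pve keeps guest IDs unique across all nodes and both guest types, so creating `lxc/125.conf` could be refused while a 125.conf is still registered somewhere else. The sketch below only illustrates picking the matching CLI from a config path; `guest_cli_for` is a hypothetical helper name.

```shell
#!/bin/sh
# guest_cli_for CONF_PATH
# Hypothetical helper: print the Proxmox CLI that manages the guest whose
# config lives at CONF_PATH ("qm" for VMs under qemu-server/, "pct" for
# containers under lxc/). Returns 1 for any other path.
guest_cli_for() {
    case $1 in
        */qemu-server/*.conf) echo qm ;;
        */lxc/*.conf)         echo pct ;;
        *)                    return 1 ;;
    esac
}
```

For the backup path above this prints `qm`, which would point to `qm unlock 125` and a restore target under `qemu-server/` instead of `lxc/`. Whether unlocking is safe still depends on the state of the interrupted migration.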