Can't start or destroy LXC container

bladux

Nov 7, 2016
Hi,

After a failed "restart" migration, I now have an LXC container that refuses to start and that I can't destroy.

In the GUI, the CT appears on the correct node, but the API returns a 500 error code.
Any action in the GUI gives "Error: Configuration file 'nodes/NODENAME/lxc/666.conf' does not exist (500)"

I looked on all cluster nodes, and the 666.conf file exists only in /etc/pve/nodes/NODENAME/lxc/ (the correct node).
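For anyone wanting to repeat that check, here is a minimal sketch. It assumes /etc/pve is the mounted cluster filesystem, so every node's directory is visible from the node you run it on. The function name find_ct_config and its optional second argument are made up for illustration and are not part of Proxmox.

```shell
# find_ct_config is a hypothetical helper, not a Proxmox command.
# It prints every per-node path that holds a config for the given CT ID.
# The second argument defaults to /etc/pve and only exists so the function
# can be pointed at a copy of the tree.
find_ct_config() {
    vmid="$1"
    root="${2:-/etc/pve}"
    for conf in "$root"/nodes/*/lxc/"$vmid".conf; do
        # When nothing matches, the glob stays unexpanded; skip it.
        if [ -e "$conf" ]; then
            printf '%s\n' "$conf"
        fi
    done
    return 0
}
```

Running `find_ct_config 666` should print one path per node directory that contains the file. More than one line, or none at all, would suggest the config is not where the cluster expects it.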

I did some googling but couldn't find any hints.

Here is the migration error:
[...] Transfer progress [...]
96,636,764,160 100% 109.75MB/s 0:13:59 (xfr#1, to-chk=0/1)
feb. 10 12:13:47 start final cleanup
feb. 10 12:13:48 # /usr/bin/ssh -o 'BatchMode=yes' root@10.0.0.43 pct unlock 666
feb. 10 12:13:48 ERROR: failed to clear migrate lock: Configuration file 'nodes/NODENAME/lxc/666.conf' does not exist
feb. 10 12:13:48 start container on target node
feb. 10 12:13:48 # /usr/bin/ssh -o 'BatchMode=yes' root@10.0.0.43 pct start 666
feb. 10 12:13:49 Configuration file 'nodes/NODENAME/lxc/666.conf' does not exist
feb. 10 12:13:49 ERROR: command '/usr/bin/ssh -o 'BatchMode=yes' root@10.0.0.43 pct start 666' failed: exit code 255
feb. 10 12:13:49 ERROR: migration finished with problems (duration 00:14:19)
TASK ERROR: migration problems

The config files are in place on the correct node.
Is there anywhere the CT config file should be found other than /etc/pve/nodes/NODENAME/lxc/ and /etc/pve/lxc/?

Is there a way to force Proxmox to recreate its database? It seems to have been out of sync since this migration.

Regards,
 
I finally found a way to fix my issue: the target node the CT was migrated to was online, but from its own point of view it was alone in the cluster. It was the only entry in /etc/pve/.members.

Restarting the node fixed it.
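A rough way to spot the same symptom on another node is to count the entries in its /etc/pve/.members. This is only a sketch: the helper name count_members is made up, and it assumes each node entry in that file carries exactly one "id" key, which is an assumption about the file layout rather than documented behaviour.

```shell
# count_members is a hypothetical helper, not a Proxmox command.
# It counts node entries by counting "id" keys in the members file.
# Assumption: every node entry contains exactly one "id" key.
count_members() {
    file="${1:-/etc/pve/.members}"
    # grep -o prints one line per match; tr strips the padding some
    # wc implementations add around the count.
    grep -o '"id"' "$file" | wc -l | tr -d ' '
}
```

If `count_members` prints 1 on a node that is supposed to be part of a multi-node cluster, that node probably sees itself as alone, as in my case. Running `pvecm status` on the same node should give the cluster's own view of membership and quorum to compare against.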