Hello, I am seeing a particularly odd issue with pmxcfs.
There is a configuration file for an LXC container that appears to exist in /etc/pve/nodes/[node]/lxc, but I cannot actually see it.
See here:
Code:
root@hoopy:/etc/pve/nodes/hoopy# systemctl status pve-ha-crm
● pve-ha-crm.service - PVE Cluster HA Resource Manager Daemon
     Loaded: loaded (/lib/systemd/system/pve-ha-crm.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2022-10-07 12:41:45 EDT; 1 day 3h ago
   Main PID: 1292 (pve-ha-crm)
      Tasks: 1 (limit: 26282)
     Memory: 65.6M
        CPU: 25.235s
     CGroup: /system.slice/pve-ha-crm.service
             └─1292 pve-ha-crm
Oct 08 16:39:11 hoopy pve-ha-crm[1292]: recover service 'ct:118' to previous failed and fenced node 'gfman' again
Oct 08 16:39:11 hoopy pve-ha-crm[1292]: got unexpected error - Configuration file 'nodes/gfman/lxc/118.conf' does not exist
Oct 08 16:39:21 hoopy pve-ha-crm[1292]: recover service 'ct:118' to previous failed and fenced node 'gfman' again
Oct 08 16:39:21 hoopy pve-ha-crm[1292]: got unexpected error - Configuration file 'nodes/gfman/lxc/118.conf' does not exist
Oct 08 16:39:31 hoopy pve-ha-crm[1292]: recover service 'ct:118' to previous failed and fenced node 'gfman' again
Oct 08 16:39:31 hoopy pve-ha-crm[1292]: got unexpected error - Configuration file 'nodes/gfman/lxc/118.conf' does not exist
Oct 08 16:39:41 hoopy pve-ha-crm[1292]: recover service 'ct:118' to previous failed and fenced node 'gfman' again
Oct 08 16:39:41 hoopy pve-ha-crm[1292]: got unexpected error - Configuration file 'nodes/gfman/lxc/118.conf' does not exist
Oct 08 16:39:51 hoopy pve-ha-crm[1292]: recover service 'ct:118' to previous failed and fenced node 'gfman' again
Oct 08 16:39:51 hoopy pve-ha-crm[1292]: got unexpected error - Configuration file 'nodes/gfman/lxc/118.conf' does not exist
root@hoopy:/etc/pve/nodes/gfman/lxc# touch 118.conf
touch: cannot touch '118.conf': File exists
root@hoopy:/etc/pve/nodes/gfman/lxc# cat 118.conf
cat: 118.conf: No such file or directory
root@hoopy:/etc/pve/nodes/gfman/lxc# ls
108.conf 127.conf 128.conf
118.conf cannot be modified and its contents cannot be read, yet something still references it, which prevents me from recreating or amending it.
This stops all HA functions from working, which leaves my cluster effectively dead.
I have tried restarting all nodes at the same time, restarting pve-ha-crm and pve-ha-lrm, and moving this config file around to other nodes, but nothing seems to work. I have no idea what is wrong, but it appears to be a lower-level issue with pmxcfs.
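In case it helps anyone reproduce the checks, here is a minimal sketch of how I would look for the file more systematically. It rests on a guess I have not confirmed: that pmxcfs treats a VMID as unique across the whole cluster, so `touch` could report "File exists" because a config for 118 lives under some other node's directory. The function names `find_vmid_configs` and `probe_entry` are my own, not part of any Proxmox tool, and the second argument (an alternative root instead of /etc/pve) is only there so the script can be tried on a scratch directory.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Proxmox VE.

# Print every guest config for a VMID found under <root>/nodes/*/{lxc,qemu-server}.
# Assumption: the usual /etc/pve layout; root defaults to /etc/pve.
find_vmid_configs() {
    vmid=$1
    root=${2:-/etc/pve}
    for f in "$root"/nodes/*/lxc/"$vmid".conf \
             "$root"/nodes/*/qemu-server/"$vmid".conf; do
        # An unmatched glob stays literal and fails both tests, so nothing is printed.
        if [ -e "$f" ] || [ -L "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Report whether a path shows up in its directory listing and whether it can be stat'ed.
# A mismatch between the two lines is the kind of inconsistency shown above.
probe_entry() {
    path=$1
    dir=$(dirname -- "$path")
    name=$(basename -- "$path")
    if ls -A -- "$dir" 2>/dev/null | grep -Fxq -- "$name"; then
        echo "listed=yes"
    else
        echo "listed=no"
    fi
    if [ -e "$path" ] || [ -L "$path" ]; then
        echo "stat=yes"
    else
        echo "stat=no"
    fi
}
```

With the functions loaded, `find_vmid_configs 118` lists any copy of the config on any node, and `probe_entry /etc/pve/nodes/gfman/lxc/118.conf` shows whether the directory listing and the file lookup agree.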
If someone could please advise me :'(