Hello,
so one of my PVE clusters got ugly, and the common error on all nodes is that nothing can write to /etc/pve. I can read it, but neither root nor the PVE services can write to it:
Jun 1 18:25:53 node3 pve-ha-lrm[3185]: unable to write lrm status file - unable to open file '/etc/pve/nodes/node3/lrm_status.tmp.3185' - Device or resource busy
Jun 1 18:50:20 node3 pve-ha-lrm[3185]: unable to write lrm status file - unable to open file '/etc/pve/nodes/node3/lrm_status.tmp.3185' - Transport endpoint is not connected
Jun 1 19:25:46 node3 pve-ha-lrm[3185]: unable to write lrm status file - unable to open file '/etc/pve/nodes/node3/lrm_status.tmp.3185' - Permission denied
Jun 1 18:51:27 node1 pmxcfs[22934]: [main] crit: fuse_mount error: Transport endpoint is not connected
and so on.
quorum looks ok:
Cluster information
-------------------
Name: clustername
Config Version: 5
Transport: knet
Secure auth: on
Quorum information
------------------
Date: Tue Jun 1 19:33:24 2021
Quorum provider: corosync_votequorum
Nodes: 5
Node ID: 0x00000001
Ring ID: 1.246a
Quorate: Yes
Votequorum information
----------------------
Expected votes: 5
Highest expected: 5
Total votes: 5
Quorum: 3
Flags: Quorate
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.11.198.11 (local)
0x00000002 1 10.11.198.12
0x00000003 1 10.11.198.13
0x00000004 1 10.11.198.14
0x00000005 1 10.11.198.15
I found a possible solution on commitandquit forum where they suggest (if the quorum is ok) to:
- On every node do
systemctl stop pve-cluster
This may take a while
- On every node do
sudo rm -f /var/lib/pve-cluster/.pmxcfs.lockfile
- On each node, one by one, do
systemctl start pve-cluster
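To make the order of operations concrete, here is a rough sketch of those steps as a dry-run helper. It only prints the commands and does not execute anything. The function names (is_quorate, plan_restart), the node names, and the use of ssh as root are my own assumptions for illustration; they are not part of the suggested solution, and this has not been verified on the affected cluster.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for each phase, runs nothing.
# Assumptions (not from the forum suggestion): nodes are reachable as
# root over ssh, and node names are passed as arguments.

# is_quorate: reads `pvecm status` output on stdin and succeeds only if
# it contains a "Quorate: Yes" line.
is_quorate() {
    grep -Eq '^Quorate:[[:space:]]*Yes'
}

# plan_restart: prints the three phases in order for the given nodes.
# Every node gets its stop command before any lockfile is removed, and
# every lockfile removal is listed before the first start command.
plan_restart() {
    for node in "$@"; do
        echo "ssh root@$node systemctl stop pve-cluster"
    done
    for node in "$@"; do
        echo "ssh root@$node rm -f /var/lib/pve-cluster/.pmxcfs.lockfile"
    done
    for node in "$@"; do
        echo "ssh root@$node systemctl start pve-cluster"
    done
}
```

Usage would be something like `pvecm status | is_quorate && plan_restart node1 node2 node3`, then reviewing the printed commands and running them by hand, the start commands one node at a time.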
Thank you. Will add any information needed.