Update documentation

freebee

Member
May 8, 2020
Hi.
In the documentation for an HA cluster (https://pve.proxmox.com/wiki/Cluster_Manager), the section "Remove a Cluster Node" gives the command:
pvecm delnode hp4
The tutorial says to power off the server before removing it from the cluster; however, the command returns an error:

pvecm delnode MIRROR-USA-VIR-01-PV01
Killing node 4
Could not kill node (error = CS_ERR_NOT_EXIST)
error during cfs-locked 'file-corosync_conf' operation: command 'corosync-cfgtool -k 4' failed: exit code 1

After reading other comments, I checked /etc/pve/.members and corosync.conf and found nothing wrong there, so the problem was a stale status stuck in the GUI.
The solution is: systemctl stop pve-ha-crm.service && rm -f /etc/pve/ha/manager_status && systemctl start pve-ha-crm.service

The documentation should add a section on 'workarounds for remove node errors':
If pvecm delnode gives an error, and retrying the same command returns "error during cfs-locked 'file-corosync_conf' operation: Node/IP: hp4 is not a known host of the cluster", check whether /etc/pve/.members and /etc/pve/corosync.conf still contain the removed node's name (hp4 in this case). If the name is not found in these files, run ha-manager status. If it returns "unable to read file '/etc/pve/nodes/hp4/lrm_status'", run the following on each server/node:
systemctl stop pve-ha-crm.service && rm -f /etc/pve/ha/manager_status && systemctl start pve-ha-crm.service
This will fix the GUI.
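
The file check described above could be sketched roughly as follows. This is only an illustration, not an official Proxmox tool: the function names node_in_file and check_removed_node are made up for this example, and the default paths are the two files named in this post. It only reads the files; it does not stop services or delete anything.

```shell
#!/bin/sh
# Hypothetical helper (not part of Proxmox VE): succeed if NODE still
# appears as a whole word in FILE. Fails if the file is unreadable.
node_in_file() {
    node="$1"
    file="$2"
    [ -r "$file" ] || return 1
    # -F fixed string, -w whole word (so hp4 does not match hp40), -q quiet
    grep -Fwq -- "$node" "$file"
}

# Hypothetical helper: check both files mentioned in the post.
# Returns 0 when the node name is absent from both, 1 otherwise.
check_removed_node() {
    node="$1"
    members="${2:-/etc/pve/.members}"
    corosync="${3:-/etc/pve/corosync.conf}"
    status=0
    for f in "$members" "$corosync"; do
        if node_in_file "$node" "$f"; then
            echo "found: $node in $f"
            status=1
        else
            echo "clean: $node not in $f"
        fi
    done
    return "$status"
}
```

For example, check_removed_node hp4 && ha-manager status would only go on to the HA status check when neither file still mentions hp4. The restart-and-remove command above is deliberately left as a manual step.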
 
pvecm delnode MIRROR-USA-VIR-01-PV01
Killing node 4
Could not kill node (error = CS_ERR_NOT_EXIST)
error during cfs-locked 'file-corosync_conf' operation: command 'corosync-cfgtool -k 4' failed: exit code 1
That usually means the node you removed had already been dropped from corosync?

The solution is: systemctl stop pve-ha-crm.service && rm -f /etc/pve/ha/manager_status && systemctl start pve-ha-crm.service
HA has nothing directly to do with corosync, so I'd like to know what the original problem was that this solution solves.
 
I have a similar issue. I have a dead node that still shows in the GUI, but I cannot delete it because pvecm nodes doesn't even list it, and pvecm delnode proxmox1 throws:

Code:
Killing node 1
Could not kill node (error = CS_ERR_NOT_EXIST)
error during cfs-locked 'file-corosync_conf' operation: command 'corosync-cfgtool -k 1' failed: exit code 1

There seems to be a bug in the documentation, or perhaps in Proxmox itself. I can't believe that removing a failed node isn't covered by either.

Edit: oh, it is a bug in both the documentation and PVE: https://forum.proxmox.com/threads/feedback-on-admin-guide-removing-node-from-cluster.87112/
 
Do you have replication jobs? Also, this is only an issue for the docs; I do not see how it is a bug in Proxmox VE.

I have a similar issue. I have a dead node, it shows in the GUI,
In general, a removed node also still shows up if left-over guest configurations remain in /etc/pve/nodes/NAME/, as seen from the remaining nodes. The web interface then gets those VMIDs and has to map them somewhere, and the most fitting place is the node they were on.

So in that case you can delete the VMID configurations in there, after ensuring you no longer need them.
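
As a rough sketch of how to look for such left-overs before deleting anything: the function name list_leftover_guests is made up for this example, and it assumes guest configs sit in the usual qemu-server/ and lxc/ subdirectories under /etc/pve/nodes/NAME/. It only lists files and removes nothing.

```shell
#!/bin/sh
# Hypothetical helper (not part of Proxmox VE): print any guest config
# files still present for NODE. BASE defaults to /etc/pve/nodes.
list_leftover_guests() {
    node="$1"
    base="${2:-/etc/pve/nodes}"
    for f in "$base/$node"/qemu-server/*.conf "$base/$node"/lxc/*.conf; do
        # An unmatched glob stays literal, so test that the file exists
        if [ -e "$f" ]; then
            echo "$f"
        fi
    done
    return 0
}
```

Running list_leftover_guests proxmox1 prints one path per remaining VM or container config. An empty result would suggest the GUI entry comes from something else.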
 
