Removing a node from the cluster does not remove it from the GUI

TobiasE

Dec 6, 2017
Hello
After removing a node from my Proxmox cluster, it still shows up in the cluster with a white x on it.
I tried restarting corosync and pveproxy as well as the filesystem.
I see that there is still an entry for the deleted node in /etc/pve/corosync.conf, but I am unable to edit this file (it is read-only). Is this a known bug?
I used
pvecm delnode <node> to delete the node.
 
Hi,

I tried restarting corosync and pveproxy as well as the filesystem.
I see that there is still an entry for the deleted node in /etc/pve/corosync.conf, but I am unable to edit this file (it is read-only). Is this a known bug?

This is not a known bug, but the removal did not complete correctly.

Did you have a two-node setup where you removed one node?

Can you run:
Code:
pvecm expected 1

To tell the other node it's fine with just one node, and then rerun the 'pvecm delnode NODE' command?
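
As a rough illustration of why a two-node cluster can get stuck here: quorum needs a strict majority of the expected votes, so with two expected votes a single remaining node is not quorate and cannot change the cluster configuration. The sketch below only does that arithmetic. The helper name quorum_needed is made up for this example, and it assumes one vote per node and default votequorum settings.

```shell
#!/bin/sh
# Hypothetical helper: votes needed for quorum, assuming a simple
# majority of the expected votes (one vote per node, default settings).
quorum_needed() {
    expected=$1
    echo $(( expected / 2 + 1 ))
}

quorum_needed 2   # prints 2: one surviving node out of two is not quorate
quorum_needed 1   # prints 1: after 'pvecm expected 1' a single node is enough
quorum_needed 4   # prints 3
```

Lowering the expected votes to 1 makes the remaining node quorate again, which is what lets the second 'pvecm delnode NODE' run go through.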
 
No, it does not appear when I run pvecm nodes on a node that is still in the cluster, but it still appears in the web GUI.
 
I am experiencing the same issue on PVE 5.3-5.
I've manually decremented the number of expected nodes (since it wasn't decremented automatically when I removed a node from the cluster), but the GUI 'still remembers' the deleted node.

Code:
pvecm status

Quorum information
------------------
Date:             Wed Dec 19 12:03:01 2018
Quorum provider:  corosync_votequorum
Nodes:            4
Node ID:          0x00000002
Ring ID:          2/72
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   4
Highest expected: 4
Total votes:      4
Quorum:           3 
Flags:            Quorate 

Membership information
----------------------
    Nodeid      Votes Name
0x00000002          1 172.2.2.3 (local)
0x00000005          1 172.2.2.6
0x00000001          1 172.2.2.2
0x00000003          1 172.2.2.4
 
since it wasn't decremented automatically when I removed a node from the cluster

That probably means the removed node is still online and still thinks it's in the cluster, I'd guess.
the GUI 'still remembers' the deleted node

Does the deleted node still have configuration files in its /etc/pve/nodes/NODENAME/ lxc and qemu-server folders? If so, the web GUI creates a node in the resource tree no matter what, as it needs to map VM and CT configs somewhere. If that's the only problem, delete them and reload.
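
A minimal sketch of that check, in case it helps: it lists any guest config files still sitting in a node's lxc and qemu-server folders. The function name leftover_guest_configs is hypothetical, NODENAME is a placeholder, and it assumes guest configs are plain files ending in .conf directly inside those two folders.

```shell
#!/bin/sh
# Hypothetical helper: list guest config files left under a node's directory.
# $1 = base directory (on a real cluster this would be /etc/pve/nodes)
# $2 = node name
leftover_guest_configs() {
    base=$1
    node=$2
    for sub in lxc qemu-server; do
        dir="$base/$node/$sub"
        # Skip subfolders that do not exist.
        [ -d "$dir" ] || continue
        find "$dir" -maxdepth 1 -type f -name '*.conf'
    done
}

# Example call; NODENAME is a placeholder for the removed node's name.
leftover_guest_configs /etc/pve/nodes NODENAME
```

No output means neither folder holds a config file, so leftover guest configs are not what keeps the node in the resource tree.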

when I removed a node
How did you do this exactly? Did you follow:
https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_remove_a_cluster_node
or the section below it, without purging the removed node:
https://pve.proxmox.com/pve-docs/chapter-pvecm.html#pvecm_separate_node_without_reinstall
 
That probably means the removed node is still online and still thinks it's in the cluster, I'd guess.
Yes, the node was still online (I thought stopping the pve services would be enough).
I just ifdown'ed the LAN interface (the nodes were addressing each other by LAN IPs), but the issue is still there.

Does the deleted node still have configuration files in its /etc/pve/nodes/NODENAME/ lxc and qemu-server folders? If so, the web GUI creates a node in the resource tree no matter what, as it needs to map VM and CT configs somewhere. If that's the only problem, delete them and reload.
No, those dirs are empty on the removed node and there's no /etc/pve/nodes/REMOVEDNODENAME/ on the online nodes of the cluster.

I probably did all those actions in the wrong order: I removed the node via `pvecm delnode` BEFORE I switched off the removed node's network.
How do I unscrew things now?
 
Oh, god, I screwed up big time:
I'm experiencing an issue like the one described here: https://forum.proxmox.com/threads/gui-node-list-empty-during-vm-creation-migration.42739/
and `pvesh get /nodes` returns `unable to read '/etc/pve/nodes/d0217/pve-ssl.pem' - No such file or directory`, where d0217 is the name of the removed node.
AFAIR, I removed that dir manually; `pvecm status` and `pvecm nodes` don't list d0217 anymore.
What do I do?
 
FTR: the issue got resolved by running `sudo systemctl restart corosync && sudo systemctl restart pve-cluster` on each live node of the cluster.
Thank you!
 
Just for information.

I had the same problem, and in addition to all of the above I had to manually edit the /etc/pve/corosync.conf file.
I removed the node entry and increased the value of config_version before restarting corosync and pve-cluster.
After all of that everything worked fine.
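
For anyone repeating this, here is a small sketch of the config_version part only. It prints a corosync.conf with the version raised by one, so it can be run against a copy instead of the live file. The function name bump_config_version is made up for this example, it assumes the file has a single "config_version: N" line, and removing the node entry itself is still a manual edit.

```shell
#!/bin/sh
# Hypothetical helper: print a corosync.conf with config_version raised by one.
# Reads the file given as $1, or standard input when no file is given.
bump_config_version() {
    awk '
        # On the "config_version: N" line, replace N with N + 1.
        /^[ \t]*config_version:[ \t]*[0-9]+/ {
            match($0, /[0-9]+/)
            n = substr($0, RSTART, RLENGTH) + 1
            print substr($0, 1, RSTART - 1) n substr($0, RSTART + RLENGTH)
            next
        }
        # Every other line is passed through unchanged.
        { print }
    ' "$@"
}

# Example on a single line of input:
printf '  config_version: 4\n' | bump_config_version   # prints "  config_version: 5"
```

Printing to standard output rather than editing in place leaves the original untouched until the result has been reviewed.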

BUT, this is not the expected behaviour, since I stuck strictly to the officially documented way of removing the node!
Mhhh....
 
BUT, this is not the expected behaviour, since I stuck strictly to the officially documented way of removing the node!
Mhhh....
It surely isn't. We actively try to remove the node from the configuration file, but if anything is off with the cluster (e.g., it is not quorate without the "to-be-deleted" node), this may fail.
The 'pvecm delnode' command should show errors in that case, though.
 
Hi,

This is not a known bug, but the removal did not complete correctly.

Did you have a two-node setup where you removed one node?

Can you run:
Code:
pvecm expected 1

To tell the other node it's fine with just one node, and then rerun the 'pvecm delnode NODE' command?
Thanks so much! I was going crazy with this. This worked!
 
