Node does not completely disappear from cluster

kenneth_vkd

Hi
We have now started the process of replacing some nodes in our cluster. The cluster runs Ceph + HA.

So far, we have marked each OSD on the node as "out", one at a time, and let the cluster rebalance after each one. We then stopped the OSDs and deleted the MGR, MON and MDS roles from the node being removed.
Next, we rebooted the node into a state where PVE was no longer running, as per the documentation for removing a node. We then removed the node from the HA group that we had created and finally executed "pvecm delnode <node name>" from one of the remaining nodes.
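For reference, the removal steps described above can be sketched as a dry-run script that only prints the commands in order and executes nothing. The OSD IDs, the HA group name and the remaining node names are hypothetical placeholders, and the sketch assumes the MON, MGR and MDS IDs are equal to the node name, which may not hold on every cluster.

```shell
#!/bin/sh
# Dry-run sketch: prints the removal commands in order, executes nothing.
NODE="nodeXXXX"          # placeholder for the node being removed
OSD_IDS="0 1"            # hypothetical OSD IDs hosted on that node
HA_GROUP="mygroup"       # hypothetical HA group name
REMAINING="nodeA,nodeB"  # hypothetical nodes that stay in the HA group

removal_plan() {
    # Mark each OSD out and wait for the rebalance before the next one.
    for id in $OSD_IDS; do
        echo "ceph osd out $id"
        echo "ceph -s   # repeat until rebalancing has finished"
    done
    # Stop the OSD daemons (run on the node being removed).
    for id in $OSD_IDS; do
        echo "systemctl stop ceph-osd@$id.service"
    done
    # Remove the Ceph roles; IDs are assumed to equal the node name.
    echo "pveceph mgr destroy $NODE"
    echo "pveceph mon destroy $NODE"
    echo "pveceph mds destroy $NODE"
    echo "# shut the node down so that PVE no longer runs on it"
    # Rewrite the HA group's node list without the removed node.
    echo "ha-manager groupset $HA_GROUP --nodes $REMAINING"
    # Run from one of the remaining nodes.
    echo "pvecm delnode $NODE"
}

removal_plan
```

The script is only meant to make the order of operations explicit. Each printed command would still need to be checked against the actual cluster before being run by hand.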
At this point the node is gone from the server list in the GUI, but Ceph still shows it as a node in the list of OSDs, with no OSDs defined on it.
Under HA, it is shown like this:

Code:
lrm nodeXXXX (old timestamp - dead?, Tue Oct 19 16:05:31 2021)

We have tried restarting pve-ha-crm.service on all remaining nodes, but this made no difference.
Will the node drop out by itself from here, or should it have been removed from the HA group before rebooting it into an unusable state? If it should have been removed before the reboot, how can we fix it now?


UPDATE: It seems that it just took some time before HA removed the node, but it is still shown in the Ceph OSD list.
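One possible explanation, not confirmed for this cluster, is that the empty host bucket for the node is still present in the CRUSH map. Under that assumption, the following dry-run sketch prints the commands that would inspect the tree and remove the bucket. The host name is a placeholder, and nothing is executed.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands, executes nothing.
# Assumption: the leftover entry is an empty host bucket in the CRUSH map.
crush_cleanup_plan() {
    host="$1"   # placeholder: name of the removed node
    echo "ceph osd tree   # check that $host has no OSDs below it"
    echo "ceph osd crush rm $host"
}

crush_cleanup_plan "nodeXXXX"
```

As far as I know, "ceph osd crush rm" refuses to remove a bucket that still contains items, so checking the output of "ceph osd tree" first is a reasonable precaution.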
 