I had a server die this past week, so I removed it from the infrastructure. The server is out of commission and has been pulled from the rack. I removed the node from the cluster and all was well, or so I thought: the cluster is still giving me trouble over references to the node it can no longer find.
I had 5 nodes, now down to 4.
All 4 remaining nodes (vdc2-vdc5) show as healthy.
When I run pvecm status, I get the following:
Cluster information
-------------------
Name:             vdc-cluster
Config Version:   6
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Fri Feb 14 23:47:51 2020
Quorum provider:  corosync_votequorum
Nodes:            4
Node ID:          0x00000002
Ring ID:          2.dc
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   4
Highest expected: 4
Total votes:      4
Quorum:           3
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000002          1 172.16.1.3 (local)
0x00000003          1 172.16.1.4
0x00000004          1 172.16.1.5
0x00000005          1 172.16.1.6
So all looks right to me. The node is gone, and it no longer appears in the GUI as a server. I don't see any reference to it anywhere except in /etc/pve/corosync.conf, where the totem section at the bottom still carries the removed node's IP in the bindnetaddr entry:
totem {
  cluster_name: vdc-cluster
  config_version: 6
  interface {
    bindnetaddr: 172.16.1.2
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}
If I go into /etc/pve/nodes, I also see an old folder for vdc1, the node that was removed.
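For reference, this is roughly what I see there (listing reproduced from memory, so treat it as illustrative):

# ls /etc/pve/nodes
vdc1  vdc2  vdc3  vdc4  vdc5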
I'd like to know the best way to resolve this so the leftover references go away.
Can I simply delete /etc/pve/nodes/vdc1 and change the bindnetaddr IP address to one of the other nodes that are still alive?
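Concretely, here is what I have in mind; I have not run any of this yet, so it is only a sketch (172.16.1.3 is one of the surviving nodes from the membership list above, and I am assuming config_version has to be bumped so the change takes effect):

# remove the leftover node directory, from one of the surviving nodes
rm -r /etc/pve/nodes/vdc1

# then edit /etc/pve/corosync.conf and change:
#   bindnetaddr: 172.16.1.2   ->   bindnetaddr: 172.16.1.3
#   config_version: 6         ->   config_version: 7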
Thank you very much for any help.