Hi,
I have a cluster with 5 servers, all running an up-to-date PVE 6.
After adding a 6th server, the new node (#6) keeps disappearing from the cluster.
Here is the output of pvecm status.
On one node of the running cluster:
Code:
Cluster information
-------------------
Name: pve-emcp
Config Version: 10
Transport: knet
Secure auth: on
Quorum information
------------------
Date: Thu Apr 30 14:55:09 2020
Quorum provider: corosync_votequorum
Nodes: 6
Node ID: 0x00000003
Ring ID: 1.41d
Quorate: Yes
Votequorum information
----------------------
Expected votes: 6
Highest expected: 6
Total votes: 6
Quorum: 4
Flags: Quorate
Unable to get node address for nodeid 6: CS_ERR_NOT_EXIST
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 172.16.10.106
0x00000002 1 172.16.10.104
0x00000003 1 172.16.10.102 (local)
0x00000004 1 172.16.10.105
0x00000005 1 172.16.10.103
0x00000006 1
On the missing node:
Code:
Cluster information
-------------------
Name: pve-emcp
Config Version: 10
Transport: knet
Secure auth: on
Quorum information
------------------
Date: Thu Apr 30 14:55:30 2020
Quorum provider: corosync_votequorum
Nodes: 6
Node ID: 0x00000006
Ring ID: 1.41d
Quorate: Yes
Votequorum information
----------------------
Expected votes: 6
Highest expected: 6
Total votes: 6
Quorum: 4
Flags: Quorate
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 172.16.10.106
0x00000002 1 172.16.10.104
0x00000003 1 172.16.10.102
0x00000004 1 172.16.10.105
0x00000005 1 172.16.10.103
0x00000006 1 172.16.10.101 (local)
As you can see, the "missing node" still thinks everything is OK. But all the other nodes report this error: Unable to get node address for nodeid 6: CS_ERR_NOT_EXIST
If I reboot the node, it becomes correctly visible to the cluster again, but after anywhere from a few hours to a few days it fails again.
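Since the failure only shows up hours or days after a reboot, one way to pin down when it starts would be to check the pvecm status output periodically on a healthy node. Below is a minimal sketch of such a check. It assumes the membership table keeps the layout shown above (node ID, votes, then an optional address), and the function name find_unresolved_members is just a hypothetical helper, not part of any Proxmox tool.

```python
import re

# One membership row: hex node ID, vote count, then an optional name/address.
# Assumption: rows look like "0x00000006 1 172.16.10.101 (local)" or "0x00000006 1".
MEMBER_RE = re.compile(r"^(0x[0-9a-fA-F]+)\s+(\d+)(?:\s+(\S+))?")


def find_unresolved_members(status_text):
    """Return the node IDs (as ints) that appear under
    'Membership information' without a name/address column."""
    unresolved = []
    in_membership = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Membership information"):
            in_membership = True
            continue
        if not in_membership:
            continue
        match = MEMBER_RE.match(stripped)
        # A row with no third column is a member whose address is missing.
        if match and match.group(3) is None:
            unresolved.append(int(match.group(1), 16))
    return unresolved
```

The text passed in would be the captured output of pvecm status (for example from a cron job that logs a timestamp whenever the returned list is non-empty), so the first failure can be lined up with the corosync logs.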
What could be the root cause of this?
Thanks for your help.