Owing to communication problems and a surprise revamp of our network, we suffered from network loops, which gradually overwhelmed all actual traffic, so that the nodes of our Proxmox VE 7.1 cluster could probably no longer communicate with each other or with the common NFS storage.
Now, although the network has fully recovered, and all six of our nodes can reach each other via SSH and have retained their configuration, they no longer form a cluster.
On four of them, each node only sees itself:
Code:
pvecm status

Cluster information
-------------------
Name: ITP
Config Version: 6
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Thu Mar 2 12:32:05 2023
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000001
Ring ID: 1.3c62
Quorate: No

Votequorum information
----------------------
Expected votes: 6
Highest expected: 6
Total votes: 1
Quorum: 4 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 XXX.XXX.XXX.XXX (local)
The other two nodes still see each other, i.e.:
Code:
pvecm status

Cluster information
-------------------
Name: ITP
Config Version: 6
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Thu Mar 2 12:39:18 2023
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 0x00000006
Ring ID: 3.3c76
Quorate: No

Votequorum information
----------------------
Expected votes: 6
Highest expected: 6
Total votes: 2
Quorum: 4 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000003 1 XXX.XXX.XXX.XXX
0x00000006 1 XXX.XXX.XXX.XXX (local)
So our cluster is now split into five fragments, like this:
Node A
=====
Node B
=====
Node C
=====
Node D
=====
Node E + Node F
Since none of those fragments is quorate, nothing can be done without force. Luckily, all of the VMs are still running, especially the DHCP and LDAP servers.
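To spell out the arithmetic behind "none of those fragments is quorate": the following is only a minimal sketch, assuming the usual majority rule of floor(expected votes / 2) + 1, which matches the "Expected votes: 6" and "Quorum: 4" lines in the outputs above. The function name and the fragment labels are made up for illustration and are not part of any Proxmox or corosync tool.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Majority threshold: more than half of the expected votes (assumed rule)."""
    return expected_votes // 2 + 1


# Votes per fragment, taken from the "Total votes" lines above
# (four single nodes and one pair); labels are hypothetical.
fragments = {"A": 1, "B": 1, "C": 1, "D": 1, "E+F": 2}

expected = sum(fragments.values())       # 6 expected votes in total
threshold = quorum_threshold(expected)   # 6 // 2 + 1 = 4

# A fragment is quorate only if it holds at least `threshold` votes.
quorate = {name: votes >= threshold for name, votes in fragments.items()}

for name, votes in fragments.items():
    state = "quorate" if quorate[name] else "not quorate"
    print(f"Fragment {name}: {votes}/{threshold} votes, {state}")
```

Under that assumption, at least four of the six nodes would have to see each other again before any fragment regains quorum on its own.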
Is there any way to safely bring the cluster back together without a total overhaul?
Please let me know if you require any more information.
My experience with Proxmox is quite limited.