Temporarily split cluster into 2 clusters due to network instability

Dec 27, 2019
Hi,
I upgraded Proxmox 5.4 to 6.1.
With Proxmox 5.4 I was able to manage 27 nodes in a single cluster. After the upgrade, only 14 nodes are able to coexist in the same cluster.
For now, I can use only a single link, due to the lack of additional ports on the switch. New switches will come in 2 months.
Is it possible to temporarily split the nodes into 2 clusters, considering that each node has at least one VM on it?
After the new switches arrive, the nodes will be rejoined.

Thanks in advance

Antrax
 

You can't join a node that has VMs on it to a cluster, so you'll have an issue with the rejoining part.

What issues are you having with the current cluster such that only 14 nodes can be connected at once? Errors from corosync, etc.?
 
With 14 nodes everything works properly; as soon as I turn on the 15th node (and so on), the previous 14 nodes also stop being connected to the cluster, quorum gets lost, and so on. I guess that UDP overloads the network, causing traffic to keep increasing.
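For the record, these are the standard commands I watch while powering nodes on (nothing exotic, the output will of course differ per setup):

pvecm status                 # quorum state and vote counts
corosync-cfgtool -s          # per-link status of each knet link
journalctl -u corosync -f    # watch for link down / retransmit messages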

For the moment SCTP seems to work better (thanks to @Shturman's post https://forum.proxmox.com/threads/a...-after-upgrade-5-4-to-6-0-4.56425/post-260570), even if not perfectly. I also found out that corosync 3.0.3 introduces some important fixes (https://github.com/kronosnet/kronosnet/issues/275#issuecomment-559062236).
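For anyone who wants to try the same: the transport is set per link in the totem section of /etc/pve/corosync.conf (a sketch of just the relevant part; remember to bump config_version so the change propagates to all nodes):

totem {
  # ... cluster_name, config_version, etc. stay as generated ...
  interface {
    linknumber: 0
    knet_transport: sctp   # default is udp
  }
}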

The issues I see with UDP are described here: https://forum.proxmox.com/threads/max-cluster-nodes-with-pve6.55277/post-285280
 
It seems to me that the new corosync using unicast is more finicky than the old one that used multicast. We have a 3-node cluster and we had corosync-related issues after upgrading to PVE 6. What we did is split the management network (4x 1Gbit links) into two 2x1Gbit bonds, one for management and the other dedicated to cluster traffic. No issues since.
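In /etc/network/interfaces that looks roughly like this (a sketch; eno1-eno4 stand in for our actual NIC names and the address is a placeholder):

auto bond0
iface bond0 inet manual
    bond-slaves eno1 eno2
    bond-miimon 100
    bond-mode active-backup
# the management bridge (vmbr0) sits on top of bond0

auto bond1
iface bond1 inet static
    address 10.10.10.11/24
    bond-slaves eno3 eno4
    bond-miimon 100
    bond-mode active-backup
# corosync uses the 10.10.10.0/24 network as its only link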
 
The problem is that I don't have free ports on the switch to connect more cables; I've ordered another switch, but there is a long delay on its delivery.

Would it be possible to remove nodes from the cluster and later join them back to the same cluster (ensuring that a node about to be rejoined doesn't have any VM/CT on it)?
 
1. You would ideally have to reinstall them. Strictly speaking it is possible to remove and re-add them, but you would have to make really sure stuff is
- cleaned thoroughly from the distributed cluster config
- not stuck in the cluster config
- not cached on the node.
I don't know for sure, but you might break your cluster if you don't do this right (see the sketch below for the rough procedure).
2. I don't think the problem is tied to specific nodes, so rejoining them might not really help.
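For reference, the manual cleanup path is the one from the "Separate a Node Without Reinstalling" section of the pvecm docs; roughly it looks like this, but double-check the current docs before running any of it:

# on a remaining cluster node, after powering off the leaving node:
pvecm delnode nodename

# on the leaving node, to wipe its copy of the cluster state:
systemctl stop pve-cluster corosync
pmxcfs -l                    # start the cluster filesystem in local mode
rm /etc/pve/corosync.conf
rm -r /etc/corosync/*
killall pmxcfs
systemctl start pve-cluster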
 
Proxmox won't let you join a node that already has VMs on it.

So yes, technically you can break up the cluster; however, you would then have to back up every VM and remove it from the node before you could add the node to the cluster, and then restore the VMs.
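Per guest that boils down to something like this (a sketch; VMID 100, the storage name, and the archive path are placeholders, and CTs would use pct restore instead of qmrestore):

vzdump 100 --storage backupstore --mode stop    # full backup of the VM
qm destroy 100                                  # remove it from the node
pvecm add IP-OF-EXISTING-CLUSTER-NODE           # node is now empty, so the join works
qmrestore /path/to/vzdump-qemu-100-....vma.lzo 100   # restore after the join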
 

You can trick it: move the VM && CT configs from /etc/pve to /tmp, for example,
join the cluster,
then move the config files back again (see the sketch below).
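An (untested) sketch of that trick for QEMU guests; LXC configs live under /etc/pve/lxc/ and would get the same treatment:

mkdir /tmp/vmconf-backup
mv /etc/pve/qemu-server/*.conf /tmp/vmconf-backup/
pvecm add IP-OF-EXISTING-CLUSTER-NODE                 # node now looks empty, so the join is allowed
mv /tmp/vmconf-backup/*.conf /etc/pve/qemu-server/    # put the configs back after the join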


Also, if you split your cluster, be careful when creating new VMs: don't create VMs with the same ID on both clusters.
 
