Corosync problem after switching to jumbo frames (MTU 9000)

gz_jax

New Member
Mar 18, 2020
The MTU of our current PVE server cluster is the default 1500, and we want to change the whole cluster to MTU 9000. In the test environment we migrated the virtual machines off each node one by one and changed the MTU; the cluster and Ceph stayed healthy throughout.

In the production environment, however, a node is kicked out of the Corosync cluster after the change, even though the network still communicates normally, including the Ceph cluster, and the node rejoins the cluster once we set the MTU back to 1500. The test and production environments differ in hardware and node count but are otherwise basically the same. For example, the test environment uses an RJ45 switch and PVE management and Ceph share one bond, while production uses fiber and PVE management and Ceph each have their own bond. How can we troubleshoot this problem?
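One thing worth checking first is whether jumbo frames actually pass end to end on the production path, since every switch port in between must also allow MTU 9000. A do-not-fragment ping sized for a 9000-byte MTU (9000 minus 28 bytes of IP and ICMP headers) makes this visible; the interface name and address below are only placeholders for this example:

Code:
# confirm the interface really runs with MTU 9000
ip link show bond0 | grep mtu

# send an unfragmentable packet that only fits through a jumbo-frame path
ping -M do -s 8972 -c 3 10.10.10.2

If this ping fails while a normal ping works, some device on the path is still limited to MTU 1500 and drops the large frames.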
 
Hi,

We do not recommend using the same network for Ceph and Corosync.
If you do this in combination with HA, it can interrupt your services.
The problem is that with too much traffic on the Ceph network, Corosync cannot get the low latency it needs.
That is a problem for the quorum that a stable working cluster depends on.
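If a dedicated network for Corosync is available, it can be added as an extra link in /etc/pve/corosync.conf. The excerpt below is only a sketch; the node names and the 10.10.20.x addresses stand for a hypothetical dedicated Corosync network, not your actual setup:

Code:
nodelist {
  node {
    name: node1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.10.1   # existing shared network (placeholder address)
    ring1_addr: 10.10.20.1   # dedicated corosync network (placeholder address)
  }
  node {
    name: node2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.10.10.2
    ring1_addr: 10.10.20.2
  }
}

Remember to increase config_version in the totem section when editing /etc/pve/corosync.conf. With Corosync 3 (kronosnet) the additional link is picked up automatically, and you can check the state of the links afterwards with corosync-cfgtool -s.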