Network issues on kernel-6.8.12-28-pve causing cluster to fail

JSEHV

After installing kernel-6.8.12-28-pve, networking has issues that are already visible during boot. Reverting to kernel-6.8.12-25-pve solved it, so it looks like there is a networking bug in the latest -28-pve kernel.

Boot takes a very long time, and the direct bond used for corosync (double wired, 10.x.x.x network) to keep this 3-node cluster in sync is not operational: all nodes report it as up, but they can’t reach each other. The single-wired link over which each node reaches the rest of the network (192.168.x.x network) is flaky at best: it comes up and initially seems to work, but after a while the VMs are no longer reachable.

I don’t know what is causing this. Kernel updates have proven solid, stable and trouble-free for years; this is the first time the cluster has had issues due to a kernel update.

All other packages on this Proxmox VE 8 cluster are up to date, so the problem does not seem to be related to any of them.

Hopefully this can be picked up soon so that others are spared a non-functioning cluster, where VMs won’t come up because quorum can’t be reached (the other nodes being unreachable).
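
To spell out why unreachable nodes block the VMs, here is a minimal sketch of the majority rule. It assumes the default of one vote per node and a simple majority threshold (more than half of the expected votes); the function names are purely illustrative and are not part of any Proxmox or corosync tool.

```python
def votes_needed(expected_votes: int) -> int:
    """Smallest vote count that is a strict majority of expected_votes."""
    return expected_votes // 2 + 1


def is_quorate(reachable_votes: int, expected_votes: int) -> bool:
    """True if the votes a node can see (its own included) form a majority."""
    return reachable_votes >= votes_needed(expected_votes)


if __name__ == "__main__":
    total = 3  # 3-node cluster, assuming one vote per node
    print("votes needed:", votes_needed(total))
    print("isolated node quorate:", is_quorate(1, total))
    print("two nodes quorate:", is_quorate(2, total))
```

Under these assumptions each node in a 3-node cluster needs to see at least one peer. When the corosync link is down on all three, every node only counts its own vote and none of them is quorate.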
 