Kubernetes overlay networking breaks when upgrading from PVE 9.1 to PVE 9.2.3

joelvdvoort

New Member
May 30, 2026
1
0
1
I have an odd problem since upgrading my Proxmox cluster from PVE 9.1 to PVE 9.2.3. It consists of three physical proxmox nodes joined together in a cluster. On proxmox I host 3 talos linux control planes and 3 workers. After the update I'm no longer able to reach anything over the pod and service cidr's when the workloads are hosted on a Talos node that's on a different physical proxmox node then the one I'm testing from. Once I migrate the talos nodes together on the same physical proxmox, then traffic on the overlay network starts working again. To rule out kubernetes or the cni's, I've installed different Talos clusters with Cilium and kube-proxy and without kube-proxy and even reinstalled it once with just the standard Flannel cni. To further rule out Talos I've installed a k3s cluster with the built-in flannel cni. Still the same, no traffic over the pod and service cidr's unless the nodes are on the same physical proxmox node.

So today I decided to reinstall proxmox from the 9.1 release iso and reinstalled the Talos nodes, ran my tests and traffic in the overlay network just works. To test even further I upgraded again to 9.2.3 and after that, overlay networking is broken again.

I'm pretty certain there's either an issue in the upgrade or 9.2.3 release. Can someone help me troubleshoot this further?
 
Last edited: