PVE 7 upgrade issue with ceph client (external) and openvswitch

gimpbully

We've hit a pretty strange issue after doing a 6-to-7 upgrade on a system using Open vSwitch and an RBD pool defined against an external Ceph cluster.
Upon boot, the Ceph storage is available for a brief second or two, but as soon as the VMs start bringing up their vSwitch ports, the Ceph storage becomes inaccessible.
We've tried various Ceph client versions but have come back down to 15.2.16-pve1. Everything else is up to date as of this posting: PVE 7.3-3, openvswitch 2.15.0+ds1-2+deb11u1.
We can regain access (at least to the web UI storage summary for the Ceph pool) by running "systemctl restart networking", but that obviously tears down all of the existing guest interfaces, which makes it pretty useless as a workaround.
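
For anyone debugging something similar, commands along these lines will confirm at the CLI whether the pool is actually reachable from the node (the storage ID, pool name, monitor address and client user below are placeholders for whatever is in your storage.cfg):

pvesm status    # the external RBD storage should be listed as active
rbd ls -p <pool> -m <mon-ip> -n client.<user> --keyring /etc/pve/priv/ceph/<storage>.keyring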

Kind of at a loss as to where to continue debugging here, beyond ripping Open vSwitch out and reverting to plain Linux bridges - something I'd dearly like to avoid.
 
A note to folks who might find this thread later: this ended up being an MTU issue that seems to have been triggered by stricter behavior in the upgraded software. Prior to the 7 upgrade, we were setting the underlying bond physical interfaces to an MTU of 9000 via this sort of syntax under the bond:

pre-up ( ifconfig eth2 mtu 9000 && ifconfig eth3 mtu 9000 )
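
For context, the surrounding bond stanza was roughly along these lines; the bridge name vmbr0 and the bond_mode shown here are illustrative guesses, not copied from our actual config:

auto bond0
iface bond0 inet manual
    ovs_type OVSBond
    ovs_bridge vmbr0
    ovs_bonds eth2 eth3
    ovs_options bond_mode=balance-slb
    pre-up ( ifconfig eth2 mtu 9000 && ifconfig eth3 mtu 9000 )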

I removed that and set each individual interface to 9k via:

auto eth2
iface eth2 inet manual
    ovs_mtu 9000

auto eth3
iface eth3 inet manual
    ovs_mtu 9000

Everything's happy now.
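
For anyone verifying the same fix, commands along these lines will confirm that the 9000 MTU is actually in effect end to end (<ceph-mon-ip> is a placeholder for one of the Ceph monitors):

ip -d link show eth2 | grep mtu      # kernel view of the interface MTU
ovs-vsctl get Interface eth2 mtu     # what OVS reports for the same port
ping -M do -s 8972 <ceph-mon-ip>     # 8972 bytes + 28 bytes of headers = 9000, sent with DF set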
 
