We are running PVE 7.3.3 with OVS and ifupdown1 (carried over through upgrades from older PVE versions).
During apt-upgrade of the current OVS packages
openvswitch-common: 2.15.0+ds1-2+deb11u1 ==> 2.15.0+ds1-2+deb11u2
openvswitch-switch: 2.15.0+ds1-2+deb11u1 ==> 2.15.0+ds1-2+deb11u2
the following error occurs:
ovs-vswitchd.service is a disabled or a static unit not running, not starting it.
...and at the same moment, the PVE server loses all network connectivity. This causes instant trouble: the SSH connection drops, and the Ceph OSDs on that host lose all their connections.
VMs on the affected server are still reachable via the network, but they stop functioning because their Ceph storage is unresponsive.
Resorting to physical access in the server room, we found that the IP configuration had been removed from all network interfaces.
Following the original error message, we looked into ovs-vswitchd.service, but to no avail. Instead, running
ifdown -a ; ifup -a
fixes the problem. However, this shouldn't even happen in the first place.
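For anyone who wants to avoid the trip to the server room, here is a minimal sketch of a watchdog that applies the same workaround automatically. It is untested on our hosts. The function names and the `--run` switch are hypothetical, and it assumes that "no IPv4 address with global scope" is a good enough signal for the failure described above.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: re-apply the network configuration if the
# host has lost all of its global IPv4 addresses (the symptom seen above).

# Reads `ip -o -4 addr show` style output on stdin.
# Succeeds (exit 0) if at least one address with "scope global" is present.
has_global_ipv4() {
    grep -q 'scope global'
}

# Same workaround as in the post: take all interfaces down and up again.
restore_network() {
    ifdown -a
    ifup -a
}

# Only act when explicitly asked to, so sourcing the file has no side effects.
if [ "${1:-}" = "--run" ]; then
    if ! ip -o -4 addr show | has_global_ipv4; then
        restore_network
    fi
fi
```

The idea would be to have something that does not depend on the SSH session (a temporary cron entry, for example) call the script with `--run` shortly after the upgrade, and to remove it again afterwards.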
We are now in the process of updating all our PVE servers using the physical-visit-to-the-server-room workaround, so we will not be able to easily reproduce this issue after today.
Still, we are wondering: Are we the only ones experiencing this issue? Is it maybe an edge case caused by our customized /etc/network/interfaces? Or is this a general issue you might want to be aware of?
Just in case of the latter, I'm writing this forum post.
regards,
Andreas