VM network crash after openvswitch-switch upgrade

clement.lagrange

New Member
Apr 1, 2022
Hello.
I had a serious problem last Sunday: the openvswitch-switch package was upgraded and all the VMs lost network connectivity until I thought of running ifup lan0 on each host.

After setting up a test hypervisor and running many tests, I tracked the issue down to:
  • systemctl restart ovs-vswitchd breaks the VM networks, but ifreload -a brings them up again
  • systemctl restart networking breaks the VM networks, and ifreload -a is not enough. I need to change a parameter in the VM config (like firewall=1) and change it back to get the network back.
The config on the test host is pretty straightforward. Packages are up to date.

Code:
cat /etc/network/interfaces
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!

auto lo
iface lo inet loopback

iface enp1s0 inet manual

auto lan0
iface lan0 inet static
    address 10.1.253.1/24
    ovs_type OVSIntPort
    ovs_bridge vmbr1

auto vmbr0
iface vmbr0 inet static
    address 172.30.1.3/12
    gateway 172.16.0.1
    bridge-ports enp1s0
    bridge-stp off
    bridge-fd 0

auto vmbr1
iface vmbr1 inet manual
    ovs_type OVSBridge
    ovs_ports lan0
#bappli_tinc

Would there be some way of running ifreload -a as part of the ovs-vswitchd service restart, or something similar?

And it would be nice to have a command to re-attach all the VMs to the bridge, instead of changing arbitrary parameters through the GUI once for each VM...

I'm surprised such a bug would creep up on me and not impact others.

Best regards.
 
Whoops, I swear I did search, though mostly through Google; the recent threads might not be indexed yet...

Luckily, I use OVS only for a separate VM bridge, so I did not lose access to the nodes.

I did not see the problem described the way I describe it here. For me the trouble boils down to the OVS upgrade process doing a systemctl restart ovs-vswitchd without an ifreload -a or similar.

I hope one of the threads gets an answer.

Regards.
 
systemctl restart networking breaks the VM networks,

This is expected. A networking restart should never be used (that's why the Proxmox GUI uses ifreload -a), because the mapping of VM NICs to vmbrX is not managed by /etc/network/interfaces.
(So it restarts vmbrX, but the VM tap interfaces are no longer attached.)
That's why you need to use the GUI (e.g. enable/disable the firewall) to replug the interface.
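With many guests, the same firewall toggle can probably be scripted instead of clicked. The following is only a sketch and untested on a production node: the function names flip_firewall and replug_all_vms are made up for illustration, it assumes the usual qm list / qm config output format, and it only covers VMs (not containers). Note that it briefly changes the firewall flag of each NIC before restoring it.

```shell
#!/bin/sh
# Sketch: replug every NIC of every running VM by flipping its firewall flag
# and then restoring the original value (the GUI workaround, done in a loop).
# flip_firewall and replug_all_vms are hypothetical helper names.

# flip_firewall NETCONF: print NETCONF with the firewall flag inverted.
# A missing firewall key is treated as firewall=0.
flip_firewall() {
    case ",$1," in
        *,firewall=1,*) printf '%s\n' "$1" | sed 's/firewall=1/firewall=0/' ;;
        *,firewall=0,*) printf '%s\n' "$1" | sed 's/firewall=0/firewall=1/' ;;
        *)              printf '%s,firewall=1\n' "$1" ;;
    esac
}

# replug_all_vms: for each running VM, toggle each netN entry and put it back.
# Assumes "qm list" prints VMID NAME STATUS ... with one header line.
replug_all_vms() {
    qm list | awk 'NR > 1 && $3 == "running" { print $1 }' |
    while read -r vmid; do
        # Turn "net0: virtio=...,bridge=vmbr1" into "net0 virtio=...,bridge=vmbr1"
        qm config "$vmid" | sed -n 's/^\(net[0-9][0-9]*\): /\1 /p' |
        while read -r key conf; do
            # </dev/null keeps qm from consuming the loop's input
            qm set "$vmid" "--$key" "$(flip_firewall "$conf")" </dev/null
            qm set "$vmid" "--$key" "$conf" </dev/null
        done
    done
}
```

Sourcing the file and calling replug_all_vms as root on the affected node would run the loop; trying it on a single test VM first seems wise.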


Would there be some way of running ifreload -a as part of the ovs-vswitchd service restart, or something similar?
The OVS package is the Debian package. Maybe the Proxmox team should maintain it in the Proxmox repo again, to avoid this or to be able to add a fix.
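Until then, a local systemd drop-in might cover the ovs-vswitchd restart case. This is an untested sketch, not an official fix: the function name install_ovs_dropin and the file name ifreload.conf are made up, the path /sbin/ifreload is an assumption (check with command -v ifreload), and it is not verified that running ifreload -a from ExecStartPost is free of ordering problems with OVS.

```shell
#!/bin/sh
# Sketch: install a systemd drop-in so that every (re)start of ovs-vswitchd
# is followed by "ifreload -a". Hypothetical helper, untested assumption.

# install_ovs_dropin [DIR]: write the drop-in into DIR
# (default: the standard override directory for ovs-vswitchd.service).
install_ovs_dropin() {
    dir="${1:-/etc/systemd/system/ovs-vswitchd.service.d}"
    mkdir -p "$dir" || return 1
    cat > "$dir/ifreload.conf" <<'EOF'
[Service]
# Re-apply /etc/network/interfaces once the daemon is back up.
# The ifreload path is an assumption; adjust it to your system.
ExecStartPost=/sbin/ifreload -a
EOF
}
```

After calling install_ovs_dropin as root, systemctl daemon-reload is needed for systemd to pick up the drop-in. As noted earlier in the thread, this would not help after a systemctl restart networking, where the guest interfaces have to be replugged.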
 
I'm surprised such a bug would creep up on me and not impact others.

Best regards.

Oh, I've been working around this since the 6.x days: I use out-of-band RMI/iKVM connections and run the `apt [dist-]upgrade` in a screen/tmux/byobu session after evacuating the hypervisor first, as I know that the VM etc. interfaces will also be "lost" after the OpenVSwitch restart.

It's especially "painful" for me on OVH, where the interfaces are bonded and are the only network connection to the OpenVSwitch bridge, to which the rest of the hypervisor and the VMs (each in different VLANs) are connected.
I'll retry with ifreload during my 7.1->7.2 upgrades, but I believe the issue is also that the VM/LXC network interfaces need to be re-established :(
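To see which guest interfaces actually got dropped, something like the following might help. It is a sketch under assumptions: the function name detached_guest_ifaces is made up, it relies on the Proxmox naming tap<vmid>i<n> / veth<ctid>i<n>, and it assumes that an attached port shows a "master ..." field in ip -o link show output (ovs-system for OVS ports, a bridge name otherwise).

```shell
#!/bin/sh
# Sketch: list guest NICs that are not enslaved to any bridge.
# detached_guest_ifaces is a hypothetical helper name.

# detached_guest_ifaces: read "ip -o link show" output on stdin and print
# the names of tap/veth guest interfaces whose line has no "master" field.
detached_guest_ifaces() {
    awk -F': ' '$0 !~ / master / { sub(/@.*/, "", $2); print $2 }' |
        grep -E '^(tap|veth)[0-9]+i[0-9]+$'
}
```

Usage would be ip -o link show | detached_guest_ifaces, run before and after the OpenVSwitch restart. An empty result after the restart would suggest the guest interfaces are still attached.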
 
This is expected. A networking restart should never be used (that's why the Proxmox GUI uses ifreload -a), because the mapping of VM NICs to vmbrX is not managed by /etc/network/interfaces.
(So it restarts vmbrX, but the VM tap interfaces are no longer attached.)
That's why you need to use the GUI (e.g. enable/disable the firewall) to replug the interface.

Well... replugging via the GUI is just not feasible when you have tens of VMs/LXCs to do.
Where is that `ifreload` documented?
I wish it were part of the upgrade documentation etc., or at least triggered a warning whenever OpenVSwitch is involved.
 