Open vSwitch updates

zeha

Oct 11, 2016
Apparently, updating the openvswitch packages causes openvswitch to restart, which makes it lose its configuration (and then the cluster watchdog kills the box).

For now, I'd suggest installing a policy-rc.d file that blacklists openvswitch-switch until this is resolved.
Or are there better solutions?
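
For reference, a minimal sketch of what such a policy-rc.d file could look like. It assumes the package maintainer scripts restart the service through invoke-rc.d (or a helper that consults /usr/sbin/policy-rc.d), which I have not verified for the openvswitch packages. The function name policy_decision is only illustrative. Exit code 101 is the policy-rc.d convention for "action forbidden by policy", and 0 means the action is allowed.

```shell
#!/bin/sh
# Hypothetical /usr/sbin/policy-rc.d sketch: forbid package-triggered
# actions on openvswitch-switch, allow everything else.

policy_decision() {
    # Skip leading options such as --quiet, which callers may pass first.
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --*) shift ;;
            *) break ;;
        esac
    done

    # First non-option argument is the init script / service name.
    service="$1"

    case "$service" in
        openvswitch-switch)
            # 101 = action forbidden by policy
            return 101
            ;;
    esac

    # 0 = action allowed
    return 0
}

# When installed as policy-rc.d, the service name and the requested
# action arrive as command-line arguments.
if [ "$#" -gt 0 ]; then
    policy_decision "$@"
    exit "$?"
fi
```

The file has to be executable to take effect, and it should be removed again once the upgrade issue is fixed, otherwise later package operations will silently skip openvswitch-switch restarts as well.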

-c
 
@zeha: Normally the Open vSwitch restart lasts a couple of seconds, so the network is back fast enough that the watchdog is not triggered. It turned out that in some configurations the restart takes longer than expected (timeout 60s) and the software watchdog triggers a reboot.

BTW, any reason to use Open vSwitch here? Unless you want to access the bridge via an API to do software-defined networking, the standard linux bridges have the same features as open-vswitch IIRC.
You could also protect yourself and add a level of redundancy to your cluster network by adding a redundant communication network for the cluster, running on a separate hardware NIC: see http://pve.proxmox.com/pve-docs/chapter-pvecm.html#_redundant_ring_protocol

Also, daemon restarts are a normal part of a system upgrade; all PVE services are routinely restarted during upgrades.
 
@manu: "the standard linux bridges have the same features as open-vswitch"

Really? So a single linux bridge these days can support multiple vlans then just assign a vm to one of those vlans without requiring a bridge per vlan? When did this happen?

Also, Rapid Spanning Tree is supported on Linux bridges with proxmox too?

Can you provide the links to the docs on how to do this?

Also, I don't believe @zeha or @baggar11 were saying the watchdog was triggered, but rather that networking never came up at all after the package upgrade and that they had to manually restart to get networking back.
 
@brad I didn't know about RSTP, thanks for pointing this out.
Concerning the node reboot mentioned by the OP: if the network does not come up on the interface you use for cluster communication, and if the node has HA resources, then the local software watchdog will reboot the node to try to restore it to a working state.

This has happened recently:
https://forum.proxmox.com/threads/patching-pve-4-3-on-one-node-made-hole-cluster-reboot.30812/
but we have not yet been able to reproduce internally the problem of Open vSwitch taking a long time to restart.
 
@manu

In my case, the OVS service was up and running, but networking was unresponsive on Host2 only. I tried restarting the OVS service without luck. This isn't the first time an OVS update has killed networking in my environment, which isn't that elaborate. Oddly enough, only Host1 upgraded successfully without needing a restart to get networking back up.

Host1 = Atom c2550 w/ vmbr0 and vmbr1
Host2 = Atom c2758 w/ vmbr0 and vmbr1
 
I can confirm this happened to me as well; the 'needrestart' package did not fix the situation.
 
@manu: "the standard linux bridges have the same features as open-vswitch"

Really? So a single linux bridge these days can support multiple vlans then just assign a vm to one of those vlans without requiring a bridge per vlan? When did this happen?

Since kernel 4.1 (and Proxmox 4).

Just go to the Proxmox network GUI, edit the bridge, and check the "VLAN aware" checkbox.
 
@spirit Wow, that's cool that standard Linux bridges support that now. Any idea what gets written to /etc/network/interfaces for this (I never use the GUI for configuring the interfaces)? I saw Cumulus Linux supported something like this without needing OVS, but figured it might be something proprietary. Any idea if this is more or less efficient than OVS?

Also, any idea if the mstpd package (for providing things like RSTP, MSTP for standard linux bridges - https://github.com/mstpd/mstpd) is going to be added to Proxmox?

I think between those 2 items, assuming there's no huge efficiency differences, we could probably drop OVS in our deployments.
 
@spirit Wow, that's cool that standard Linux bridges support that now. Any idea what gets written to /etc/network/interfaces for this (I never use the GUI for configuring the interfaces)? I saw Cumulus Linux supported something like this without needing OVS, but figured it might be something proprietary. Any idea if this is more or less efficient than OVS?
Indeed, it was Cumulus that did the kernel implementation of bridge VLAN support.
We have implemented the same syntax as Cumulus for /etc/network/interfaces:

auto vmbrX
bridge_vlan_aware yes

I've been running it in production for 1 year without any problem.
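
For anyone editing the file by hand, a fuller stanza might look roughly like the sketch below. The bridge name vmbr0 and the physical port eth0 are placeholders, not taken from anyone's setup in this thread, and the STP and forward-delay lines are just the usual defaults, so adjust them to your environment.

```
# /etc/network/interfaces (sketch; vmbr0 and eth0 are placeholder names)
auto vmbr0
iface vmbr0 inet manual
        bridge_ports eth0
        bridge_stp off
        bridge_fd 0
        bridge_vlan_aware yes
```

With the bridge set up this way, the VLAN tag is set per VM network device rather than by creating one bridge per VLAN.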
 
@spirit
Also, any idea if the mstpd package (for providing things like RSTP, MSTP for standard linux bridges - https://github.com/mstpd/mstpd) is going to be added to Proxmox?

I think between those 2 items, assuming there's no huge efficiency differences, we could probably drop OVS in our deployments.

Maybe you can try the deb packages from the Cumulus repo? (They are Debian jessie based, so it should work.)

http://repo3.cumulusnetworks.com/repo/pool/cumulus/m/mstpd/
 
