Open vSwitch failing after upgrade to VE 4.4

Marvin

Upgraded to VE 4.4 with subscription. System was working fine before. Now openvswitch is failing.
1. vmbr0 is popping in and out of "promiscuous mode" (as seen in /var/log/syslog), causing an intermittent connection to the console and web GUI. I can only get to it now from the server's KVM.
Reply from 10.94.100.230: bytes=32 time<1ms TTL=63
Request timed out.
Reply from 10.94.100.230: bytes=32 time<1ms TTL=63
Request timed out.
Reply from 10.94.100.230: bytes=32 time<1ms TTL=63
Request timed out.
Request timed out.
Reply from 10.94.100.230: bytes=32 time<1ms TTL=63
Request timed out.
Request timed out.
Request timed out.
Reply from 10.94.100.230: bytes=32 time<1ms TTL=63
Reply from 10.94.100.230: bytes=32 time<1ms TTL=63
Request timed out.
2. Several internal VM networks do not exist; VMs fail to start because their vmbrX bridges are missing.
3. The /etc/network/interfaces file is the same as before.
4. ovs-vsctl show does show all the vmbrX interfaces as expected.
5. ovs-vsctl show reports ovs_version: "2.6.0"
6. /var/log/syslog also shows a recurring "/etc/openvswitch/conf.db does not exist", followed by "creating empty database", every 5 seconds.
7. dpkg -l shows openvswitch-switch and openvswitch-common at 2.6.0-2.

This is our production server and I cannot bring it back up! I do not know what else to do, please help.
 
Here's something interesting. There are several VMs that also use the same vmbrX interface to connect to the network. If I start the VMs, the connection to them is solid. Only the connection to the Proxmox console interface (and web GUI) appears to be intermittent.

Edit: If I log into one of the VMs using the same network interface (same vmbrX), its connection to the Proxmox console (and GUI) shows the same intermittent behavior. So the VMs connected through Open vSwitch are fine. It seems only the Proxmox console has the issue.

Edit: If I ping from the Proxmox console to a VM, the rest of the world loses contact with that VM.

Edit: It's now midnight here. I am finding that VMs that seem to work for a while will suddenly lose all connection, then mysteriously come back. I am very tired and frustrated. Deciding to upgrade from 4.2 to 4.4 was not a good choice. Now nothing works and I will be in trouble in the morning. Is there any help to be had?
 
Please post the complete log, the output of "pveversion -v", the network configuration, and the guest configuration files.
 
When you say complete log, do you mean the syslog file? It is currently almost a gig! I tried to gzip but not enough room on the disk.
 
Attached is everything except the complete log (because I'm not sure which one you mean).
 

Attachments

  • interfaces.txt
    1.8 KB
  • pveversion.txt
    744 bytes
  • 100conf.txt
    235 bytes
  • 101conf.txt
    278 bytes
  • 102conf.txt
    223 bytes
  • 103conf.txt
    215 bytes
  • 104conf.txt
    223 bytes
  • 112conf.txt
    292 bytes
  • 111conf.txt
    347 bytes
  • 105conf.txt
    224 bytes
A few more.
 

Attachments

  • 113conf.txt
    463 bytes
  • 121conf.txt
    372 bytes
  • 122conf.txt
    392 bytes
  • 123conf.txt
    440 bytes
  • 124conf.txt
    350 bytes
  • 126conf.txt
    436 bytes
  • 130conf.txt
    249 bytes
  • 131conf.txt
    324 bytes
When you say complete log, do you mean the syslog file? It is currently almost a gig! I tried to gzip but not enough room on the disk.

I think the part from before the upgrade starting until a little bit into the problems should be enough.
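Since the disk has no room for a compressed copy, one option is to cut out only the relevant window and stream it elsewhere. This is a minimal sketch, not a tested procedure for this system: `log_slice` is a hypothetical helper name, and the timestamp, line count, and remote host in the commented usage line are placeholders to adapt.

```shell
#!/bin/sh
# Hypothetical helper: print COUNT lines of FILE, starting at the first
# line that contains the literal string START.
log_slice() {
    file=$1 start=$2 count=$3
    awk -v start="$start" -v count="$count" '
        !found && index($0, start) { found = 1 }   # first matching line opens the window
        found { print; if (++n >= count) exit }     # stop after COUNT lines
    ' "$file"
}

# Example use (placeholders: timestamp prefix, line count, remote host).
# Compressing to stdout and piping over ssh avoids writing to the full disk:
#   log_slice /var/log/syslog 'Jan 10 09:' 50000 | gzip -c | ssh user@otherhost 'cat > syslog-part.gz'
```

The start string should be a timestamp prefix from shortly before the upgrade began; the line count only needs to be large enough to reach a little way into the failures.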
 
Also, I remember users reporting issues when upgrading to OVS 2.4 that were solved by purging the package and re-installing it, so you could give this a try (note: over the iKVM console or similar, not over the network). Note that this will remove the log files from /var/log/openvswitch, so it makes sense to back them up first.

Code:
# mkdir ovs-tmp
# apt-get download openvswitch-switch
# cp -v /var/cache/apt/archives/openvswitch-switch*.deb ovs-tmp
‘/var/cache/apt/archives/openvswitch-switch_2.6.0-2_amd64.deb’ -> ‘ovs-tmp/openvswitch-switch_2.6.0-2_amd64.deb’
# apt-get purge openvswitch-switch
# dpkg -i ovs-tmp/openvswitch-switch_2.6.0-2_amd64.deb
 
Yes, I had seen the post where that was recommended, and I tried it. But I can do it again...
 
I just noticed the following in your network configuration, which seems very wrong to me:
Code:
...

auto vmbr0
iface vmbr0 inet static
    address  10.94.100.230
    netmask  255.255.255.0
    gateway  10.94.100.1
    ovs_type OVSBridge
    ovs_ports eth0 
    bridge_ports eth0
    bridge_stp off

...

Notice how you are mixing Linux bridge and OVS bridge parameters there? I think this should be

Code:
auto vmbr0
iface vmbr0 inet static
    address  10.94.100.230
    netmask  255.255.255.0
    gateway  10.94.100.1
    ovs_type OVSBridge
    ovs_ports eth0

instead (needs a reboot to apply).
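To check the remaining bridges for the same mix-up, each iface stanza can be scanned for both ovs_* and bridge_* options. This is a rough sketch under the assumption that every stanza begins with an `iface` line; `find_mixed_bridges` is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every iface stanza in the file $1
# that contains both ovs_* and bridge_* options.
find_mixed_bridges() {
    awk '
        function report() { if (name != "" && ovs && br) print name }
        $1 == "iface"   { report(); name = $2; ovs = 0; br = 0; next }  # new stanza starts
        $1 ~ /^ovs_/    { ovs = 1 }   # Open vSwitch option seen
        $1 ~ /^bridge_/ { br = 1 }    # Linux bridge option seen
        END { report() }              # check the last stanza too
    ' "$1"
}

# Example use:
#   find_mixed_bridges /etc/network/interfaces
```

No output means no stanza mixes the two option families. It only flags the pattern shown above and does not validate the configuration otherwise.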
 
YES!! The Linux bridge parameters mixed in with the OVS parameters were the problem! It is now running 100%!

Thank you Fabian, you saved my sanity. It is 3:30am here and now I can go home and go to sleep!
 
