Openvswitch Crash

dazza76

Renowned Member
May 25, 2010
Hello all,
I have built a test cluster using 4 machines and OVS.
I have used the example config with bonding from https://pve.proxmox.com/wiki/Open_vSwitch.

What I am seeing is that when I add or remove a port/VLAN (i.e. start a container), OVS segfaults.

I have narrowed it down to these two commands:
/usr/bin/ovs-vsctl del-port veth101.0
/usr/bin/ovs-vsctl add-port vmbr0 veth101.0 tag=10
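
For anyone who wants to reproduce this without starting a container, the two commands above can be wrapped in a small shell function. This is only a sketch: the function name `replug_port` and the `OVS_VSCTL` override variable are made up for illustration, and `--if-exists` is added so the delete does not error out when the port is already gone. It is a way to trigger the same sequence by hand, not a fix for the crash.

```shell
# Path to ovs-vsctl; overridable so the function can be dry-run
# (the override is an assumption for illustration, not from the post).
OVS_VSCTL="${OVS_VSCTL:-/usr/bin/ovs-vsctl}"

# replug_port BRIDGE PORT TAG  (hypothetical helper)
# Removes PORT if it exists, then re-adds it to BRIDGE as an access port on VLAN TAG.
replug_port() {
    bridge="$1"
    port="$2"
    tag="$3"
    "$OVS_VSCTL" --if-exists del-port "$port" &&
        "$OVS_VSCTL" add-port "$bridge" "$port" "tag=$tag"
}

# Example: replug_port vmbr0 veth101.0 10
```

Setting `OVS_VSCTL=echo` before calling the function prints the commands instead of running them, which is handy for checking the arguments first.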

2015-04-23T03:27:53.790Z|05941|netdev_linux|WARN|veth101.2: obtaining netdev stats via vport failed (No such device)
2015-04-23T03:27:53.790Z|05942|netdev_linux|WARN|veth101.1: obtaining netdev stats via vport failed (No such device)
2015-04-23T03:27:53.791Z|05943|netdev_linux|WARN|veth101.0: obtaining netdev stats via vport failed (No such device)
2015-04-23T03:28:07.851Z|05944|bridge|WARN|could not open network device veth101.2 (No such device)
2015-04-23T03:28:07.853Z|05945|bridge|WARN|could not open network device veth101.0 (No such device)
2015-04-23T03:28:07.854Z|05946|bridge|INFO|bridge vmbr0: using datapath ID 00003440b58020c4
2015-04-23T03:28:07.879Z|05947|bond|INFO|interface eth1: link state down
2015-04-23T03:28:07.879Z|05948|bond|INFO|interface eth1: disabled
2015-04-23T03:28:08.798Z|05949|bond|INFO|interface eth0: link state down
2015-04-23T03:28:08.798Z|05950|bond|INFO|interface eth0: disabled
2015-04-23T03:28:08.798Z|05951|bond|INFO|bond bond0: all interfaces disabled
2015-04-23T03:28:08.964Z|00002|daemon_unix(monitor)|ERR|1 crashes: pid 1966 died, killed (Segmentation fault), core dumped, restarting
2015-04-23T03:28:08.966Z|00003|memory|INFO|6688 kB peak resident set size after 7588.4 seconds
2015-04-23T03:28:08.966Z|00004|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2015-04-23T03:28:08.966Z|00005|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
2015-04-23T03:28:09.064Z|00006|ofproto_dpif|INFO|system@ovs-system: Datapath supports recirculation
2015-04-23T03:28:09.064Z|00007|dpif|WARN|system@ovs-system: execute userspace(pid=0,userdata(00000000)) failed (Invalid argument) on packet metadata=0,in_port=0,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,dl_type=0x1234
2015-04-23T03:28:09.064Z|00008|ofproto_dpif|WARN|system@ovs-system: variable-length userdata feature probe failed (Invalid argument)
2015-04-23T03:28:09.064Z|00009|dpif|WARN|system@ovs-system: failed to put[create] (Invalid argument) skb_priority(0),skb_mark(0),in_port(0),eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),eth_type(0x8847),mpls(label=0,tc=0,ttl=0,bos=1)
2015-04-23T03:28:09.064Z|00010|ofproto_dpif|INFO|system@ovs-system: MPLS label stack length probed as 0
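
To see how often this happens, the crash lines written by the monitor process can be pulled out of the log. A minimal sketch, assuming the `|`-separated layout shown above (timestamp, sequence number, module, level, message); the function name `list_crashes` is hypothetical.

```shell
# list_crashes LOGFILE  (hypothetical helper)
# Prints the timestamp of every crash reported by the ovs-vswitchd monitor process.
list_crashes() {
    # Fields are '|'-separated: timestamp|seq|module|level|message
    awk -F'|' '$3 == "daemon_unix(monitor)" && $5 ~ /crashes:/ { print $1 }' "$1"
}

# Example: list_crashes /var/log/openvswitch/ovs-vswitchd.log
```

Comparing these timestamps with the times containers were started or stopped should show whether every port change triggers a crash or only some of them.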


It takes 30-70 seconds for this to come back up.
Any ideas on how I can resolve this?

I have tried downgrading from 2.3.1-1 to 2.0.90-4, with no luck.

At the moment I have had to switch LACP off.
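
For reference, turning LACP off on an OVS bond at runtime looks roughly like this. It is a sketch only: the bond name `bond0` is taken from the log above, while the `disable_lacp` function and the `OVS_VSCTL` override are hypothetical. On Proxmox the bond options are normally set through `ovs_options` in /etc/network/interfaces, so a change made this way may be overwritten the next time networking is reconfigured.

```shell
# Path to ovs-vsctl; overridable for a dry run (assumption for illustration).
OVS_VSCTL="${OVS_VSCTL:-/usr/bin/ovs-vsctl}"

# disable_lacp BOND  (hypothetical helper)
# Sets the lacp column of the given bond port to "off".
disable_lacp() {
    "$OVS_VSCTL" set port "$1" lacp=off
}

# Example: disable_lacp bond0
```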

Cheers
 
Last edited:
Reviving a zombie thread, I know, but I may have hit the same issue. I am not 100% sure yet, but I lost a 13-member cluster in a similar config and am seeing something similar in my logs.

2016-08-05T00:20:08.947Z|23436|bond|INFO|bond bond1: shift 44kB of load (with hash 34) from eth3 to eth2 (now carrying 954kB and 864kB load, respectively)
2016-08-05T00:20:29.031Z|23437|bond|INFO|bond bond1: shift 150kB of load (with hash 119) from eth2 to eth3 (now carrying 634kB and 784kB load, respectively)
2016-08-05T00:20:29.031Z|23438|bond|INFO|bond bond1: shift 157kB of load (with hash 3) from eth3 to eth2 (now carrying 626kB and 791kB load, respectively)
2016-08-05T00:20:29.031Z|23439|bond|INFO|bond bond1: shift 37kB of load (with hash 165) from eth2 to eth3 (now carrying 754kB and 663kB load, respectively)
2016-08-05T00:22:54.549Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovs-vswitchd.log
2016-08-05T00:22:54.555Z|00002|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2016-08-05T00:22:54.555Z|00003|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
2016-08-05T00:22:54.556Z|00004|bridge|INFO|ovs-vswitchd (Open vSwitch) 2.3.2
2016-08-05T00:22:54.698Z|00005|ofproto_dpif|INFO|system@ovs-system: Datapath supports recirculation
2016-08-05T00:22:54.698Z|00006|dpif|WARN|system@ovs-system: execute userspace(pid=0,userdata(00000000)) failed (Invalid argument) on packet metadata=0,in_port=0,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,dl_type=0x1234
2016-08-05T00:22:54.698Z|00007|ofproto_dpif|WARN|system@ovs-system: variable-length userdata feature probe failed (Invalid argument)
2016-08-05T00:22:54.698Z|00008|dpif|WARN|system@ovs-system: failed to put[create] (Invalid argument) skb_priority(0),skb_mark(0),in_port(0),eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),eth_type(0x8847),mpls(label=0,tc=0,ttl=0,bos=1)
2016-08-05T00:22:54.698Z|00009|ofproto_dpif|INFO|system@ovs-system: MPLS label stack length probed as 0
2016-08-05T00:22:54.764Z|00010|bridge|INFO|bridge vmbr1: added interface vmbr1 on port 65534
2016-08-05T00:22:54.764Z|00011|bridge|INFO|bridge vmbr1: using datapath ID 00008af15e7ce743
2016-08-05T00:22:54.764Z|00012|connmgr|INFO|vmbr1: added service controller "punix:/var/run/openvswitch/vmbr1.mgmt"
2016-08-05T00:22:54.877Z|00013|bridge|INFO|bridge vmbr1: added interface eth3 on port 1
2016-08-05T00:22:54.936Z|00014|bridge|INFO|bridge vmbr1: added interface eth2 on port 2
2016-08-05T00:22:54.938Z|00015|bridge|INFO|bridge vmbr1: using datapath ID 00000cc47abd1862
2016-08-05T00:22:55.524Z|00016|bridge|INFO|bridge vmbr1: added interface vlan457 on port 4
2016-08-05T00:22:57.939Z|00017|bond|INFO|interface eth2: link state up
2016-08-05T00:22:57.939Z|00018|bond|INFO|interface eth2: enabled
2016-08-05T00:22:57.939Z|00019|bond|INFO|bond bond1: active interface is now eth2
2016-08-05T00:22:58.289Z|00020|bond|INFO|interface eth2: link state down
2016-08-05T00:22:58.289Z|00021|bond|INFO|interface eth2: disabled
2016-08-05T00:22:58.289Z|00022|bond|INFO|bond bond1: all interfaces disabled
2016-08-05T00:23:00.491Z|00023|bond|INFO|interface eth2: link state up
2016-08-05T00:23:00.491Z|00024|bond|INFO|interface eth2: enabled
2016-08-05T00:23:00.491Z|00025|bond|INFO|interface eth3: link state up
2016-08-05T00:23:00.491Z|00026|bond|INFO|interface eth3: enabled
2016-08-05T00:23:00.491Z|00027|bond|INFO|bond bond1: active interface is now eth2
2016-08-05T00:23:00.505Z|00028|bond|INFO|interface eth3: link state down
2016-08-05T00:23:00.505Z|00029|bond|INFO|interface eth3: disabled
2016-08-05T00:23:04.556Z|00030|memory|INFO|182676 kB peak resident set size after 10.0 seconds
2016-08-05T00:23:04.557Z|00031|memory|INFO|handlers:14 ports:5 revalidators:6 rules:262 udpif keys:107
2016-08-05T00:23:05.499Z|00032|bond|INFO|interface eth3: link state up
2016-08-05T00:23:05.499Z|00033|bond|INFO|interface eth3: enabled
2016-08-05T00:41:20.922Z|00034|bond|INFO|interface eth3: disabled
2016-08-05T00:41:20.922Z|00035|bond|INFO|interface eth2: disabled
2016-08-05T00:41:20.922Z|00036|bond|INFO|bond bond1: all interfaces disabled
2016-08-05T00:41:20.941Z|00037|bridge|INFO|bridge vmbr1: using datapath ID 00008af15e7ce743
2016-08-05T00:41:21.301Z|00038|fatal_signal|WARN|terminating with signal 15 (Terminated)