Openvswitch Crash

dazza76

Hello all,
I have built a test cluster using 4 machines and OVS.
I used the example config with bonding from https://pve.proxmox.com/wiki/Open_vSwitch.

What I am seeing is that whenever I add or remove a port/VLAN (i.e. start a container), OVS segfaults.

I have debugged it down to these commands:
/usr/bin/ovs-vsctl del-port veth101.0
/usr/bin/ovs-vsctl add-port vmbr0 veth101.0 tag=10
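
For comparison, the same delete and re-add can be issued as a single ovs-vsctl transaction, since ovs-vsctl accepts several commands separated by `--`. The sketch below is only an illustration: the helper name `readd_port` and the `DRY_RUN` switch are hypothetical, and whether a single transaction changes the crash behaviour described here is untested.

```shell
#!/bin/sh
# Hypothetical helper: delete and re-add a port in one ovs-vsctl transaction.
# Usage: readd_port BRIDGE PORT VLAN_TAG
# Set DRY_RUN=1 to print the command instead of running it.
readd_port() {
    bridge=$1
    port=$2
    tag=$3
    # --if-exists keeps del-port from failing when the port is already gone.
    set -- ovs-vsctl -- --if-exists del-port "$port" \
                     -- add-port "$bridge" "$port" "tag=$tag"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

With `DRY_RUN=1`, calling `readd_port vmbr0 veth101.0 10` only prints the command line, which makes it safe to check on a node where OVS is already unstable.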

2015-04-23T03:27:53.790Z|05941|netdev_linux|WARN|veth101.2: obtaining netdev stats via vport failed (No such device)
2015-04-23T03:27:53.790Z|05942|netdev_linux|WARN|veth101.1: obtaining netdev stats via vport failed (No such device)
2015-04-23T03:27:53.791Z|05943|netdev_linux|WARN|veth101.0: obtaining netdev stats via vport failed (No such device)
2015-04-23T03:28:07.851Z|05944|bridge|WARN|could not open network device veth101.2 (No such device)
2015-04-23T03:28:07.853Z|05945|bridge|WARN|could not open network device veth101.0 (No such device)
2015-04-23T03:28:07.854Z|05946|bridge|INFO|bridge vmbr0: using datapath ID 00003440b58020c4
2015-04-23T03:28:07.879Z|05947|bond|INFO|interface eth1: link state down
2015-04-23T03:28:07.879Z|05948|bond|INFO|interface eth1: disabled
2015-04-23T03:28:08.798Z|05949|bond|INFO|interface eth0: link state down
2015-04-23T03:28:08.798Z|05950|bond|INFO|interface eth0: disabled
2015-04-23T03:28:08.798Z|05951|bond|INFO|bond bond0: all interfaces disabled
2015-04-23T03:28:08.964Z|00002|daemon_unix(monitor)|ERR|1 crashes: pid 1966 died, killed (Segmentation fault), core dumped, restarting
2015-04-23T03:28:08.966Z|00003|memory|INFO|6688 kB peak resident set size after 7588.4 seconds
2015-04-23T03:28:08.966Z|00004|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2015-04-23T03:28:08.966Z|00005|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
2015-04-23T03:28:09.064Z|00006|ofproto_dpif|INFO|system@ovs-system: Datapath supports recirculation
2015-04-23T03:28:09.064Z|00007|dpif|WARN|system@ovs-system: execute userspace(pid=0,userdata(00000000)) failed (Invalid argument) on packet metadata=0,in_port=0,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,dl_type=0x1234
2015-04-23T03:28:09.064Z|00008|ofproto_dpif|WARN|system@ovs-system: variable-length userdata feature probe failed (Invalid argument)
2015-04-23T03:28:09.064Z|00009|dpif|WARN|system@ovs-system: failed to put[create] (Invalid argument) skb_priority(0),skb_mark(0),in_port(0),eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),eth_type(0x8847),mpls(label=0,tc=0,ttl=0,bos=1)
2015-04-23T03:28:09.064Z|00010|ofproto_dpif|INFO|system@ovs-system: MPLS label stack length probed as 0
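
To pick the crash report out of a longer ovs-vswitchd.log, the pipe-separated fields can be filtered with awk. This is a minimal sketch, assuming every entry follows the `timestamp|sequence|module|level|message` layout seen in the excerpt above; the function names `ovs_log_counts` and `ovs_log_crashes` are hypothetical.

```shell
#!/bin/sh
# Count log entries per module and level, sorted by module name.
# Reads the files given as arguments, or stdin when none are given.
ovs_log_counts() {
    awk -F'|' 'NF >= 5 { c[$3 "|" $4]++ }
               END { for (k in c) print k "|" c[k] }' "$@" | sort
}

# Print only the error lines written by the monitor process,
# such as the "died, killed (Segmentation fault)" report.
ovs_log_crashes() {
    awk -F'|' '$3 ~ /^daemon_unix/ && $4 == "ERR"' "$@"
}
```

For example, `ovs_log_crashes /var/log/openvswitch/ovs-vswitchd.log` lists each restart caused by a crash, and the timestamps can then be matched against the bond state changes logged just before it.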


It takes 30-70 seconds for this to come back up.
Any ideas on how I can resolve this?

I have tried downgrading from 2.3.1-1 to 2.0.90-4 with no luck.

For now I have had to switch LACP off.

Cheers
 
Reviving a zombie thread, I know, but I may have hit the same problem. I'm not 100% sure yet, but I lost a 13-member cluster in a similar config and am seeing something similar in my logs.

2016-08-05T00:20:08.947Z|23436|bond|INFO|bond bond1: shift 44kB of load (with hash 34) from eth3 to eth2 (now carrying 954kB and 864kB load, respectively)
2016-08-05T00:20:29.031Z|23437|bond|INFO|bond bond1: shift 150kB of load (with hash 119) from eth2 to eth3 (now carrying 634kB and 784kB load, respectively)
2016-08-05T00:20:29.031Z|23438|bond|INFO|bond bond1: shift 157kB of load (with hash 3) from eth3 to eth2 (now carrying 626kB and 791kB load, respectively)
2016-08-05T00:20:29.031Z|23439|bond|INFO|bond bond1: shift 37kB of load (with hash 165) from eth2 to eth3 (now carrying 754kB and 663kB load, respectively)
2016-08-05T00:22:54.549Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovs-vswitchd.log
2016-08-05T00:22:54.555Z|00002|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2016-08-05T00:22:54.555Z|00003|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
2016-08-05T00:22:54.556Z|00004|bridge|INFO|ovs-vswitchd (Open vSwitch) 2.3.2
2016-08-05T00:22:54.698Z|00005|ofproto_dpif|INFO|system@ovs-system: Datapath supports recirculation
2016-08-05T00:22:54.698Z|00006|dpif|WARN|system@ovs-system: execute userspace(pid=0,userdata(00000000)) failed (Invalid argument) on packet metadata=0,in_port=0,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,dl_type=0x1234
2016-08-05T00:22:54.698Z|00007|ofproto_dpif|WARN|system@ovs-system: variable-length userdata feature probe failed (Invalid argument)
2016-08-05T00:22:54.698Z|00008|dpif|WARN|system@ovs-system: failed to put[create] (Invalid argument) skb_priority(0),skb_mark(0),in_port(0),eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),eth_type(0x8847),mpls(label=0,tc=0,ttl=0,bos=1)
2016-08-05T00:22:54.698Z|00009|ofproto_dpif|INFO|system@ovs-system: MPLS label stack length probed as 0
2016-08-05T00:22:54.764Z|00010|bridge|INFO|bridge vmbr1: added interface vmbr1 on port 65534
2016-08-05T00:22:54.764Z|00011|bridge|INFO|bridge vmbr1: using datapath ID 00008af15e7ce743
2016-08-05T00:22:54.764Z|00012|connmgr|INFO|vmbr1: added service controller "punix:/var/run/openvswitch/vmbr1.mgmt"
2016-08-05T00:22:54.877Z|00013|bridge|INFO|bridge vmbr1: added interface eth3 on port 1
2016-08-05T00:22:54.936Z|00014|bridge|INFO|bridge vmbr1: added interface eth2 on port 2
2016-08-05T00:22:54.938Z|00015|bridge|INFO|bridge vmbr1: using datapath ID 00000cc47abd1862
2016-08-05T00:22:55.524Z|00016|bridge|INFO|bridge vmbr1: added interface vlan457 on port 4
2016-08-05T00:22:57.939Z|00017|bond|INFO|interface eth2: link state up
2016-08-05T00:22:57.939Z|00018|bond|INFO|interface eth2: enabled
2016-08-05T00:22:57.939Z|00019|bond|INFO|bond bond1: active interface is now eth2
2016-08-05T00:22:58.289Z|00020|bond|INFO|interface eth2: link state down
2016-08-05T00:22:58.289Z|00021|bond|INFO|interface eth2: disabled
2016-08-05T00:22:58.289Z|00022|bond|INFO|bond bond1: all interfaces disabled
2016-08-05T00:23:00.491Z|00023|bond|INFO|interface eth2: link state up
2016-08-05T00:23:00.491Z|00024|bond|INFO|interface eth2: enabled
2016-08-05T00:23:00.491Z|00025|bond|INFO|interface eth3: link state up
2016-08-05T00:23:00.491Z|00026|bond|INFO|interface eth3: enabled
2016-08-05T00:23:00.491Z|00027|bond|INFO|bond bond1: active interface is now eth2
2016-08-05T00:23:00.505Z|00028|bond|INFO|interface eth3: link state down
2016-08-05T00:23:00.505Z|00029|bond|INFO|interface eth3: disabled
2016-08-05T00:23:04.556Z|00030|memory|INFO|182676 kB peak resident set size after 10.0 seconds
2016-08-05T00:23:04.557Z|00031|memory|INFO|handlers:14 ports:5 revalidators:6 rules:262 udpif keys:107
2016-08-05T00:23:05.499Z|00032|bond|INFO|interface eth3: link state up
2016-08-05T00:23:05.499Z|00033|bond|INFO|interface eth3: enabled
2016-08-05T00:41:20.922Z|00034|bond|INFO|interface eth3: disabled
2016-08-05T00:41:20.922Z|00035|bond|INFO|interface eth2: disabled
2016-08-05T00:41:20.922Z|00036|bond|INFO|bond bond1: all interfaces disabled
2016-08-05T00:41:20.941Z|00037|bridge|INFO|bridge vmbr1: using datapath ID 00008af15e7ce743
2016-08-05T00:41:21.301Z|00038|fatal_signal|WARN|terminating with signal 15 (Terminated)