netlink policy violation error while drbd + openvswitch still seems to work

ioo

Hi!

I am using Proxmox v. 3.4 with the 3.10 kernel, running a 2-node cluster with a quorum disk, DRBD and Open vSwitch. Everything seems to work fine, but while testing DRBD we noticed the following behaviour:

1. Stopping DRBD to get a clean starting point:

root@proxmox-node1:~# /etc/init.d/drbd stop
Stopping all DRBD resources:.

root@proxmox-node1:~# /etc/init.d/drbd status
drbd not loaded

2. On starting DRBD we get:

root@proxmox-node1:~# /etc/init.d/drbd start
Starting DRBD resources:[
create res: r0
prepare disk: r0
adjust disk: r0
adjust net: r0
]
<1>netlink: policy violation t:6[6] e:-34

0: reply did not validate - do you need to upgrade your userland tools?
.

3. Querying the DRBD status shows that the resource started correctly. (This specific test was made with drbd.conf in primary/secondary mode, but the messages are the same in primary/primary mode, and DRBD also works with virtual machines after connecting and promoting the resource.)

root@proxmox-node1:~# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
srcversion: 19422058F8A2D4AC0C8EF09
m:res cs ro ds p mounted fstype
0:r0 Connected Secondary/Secondary UpToDate/UpToDate C
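
The status output above can also be checked from a script rather than by eye. This is only a minimal sketch: the function name drbd_lines_healthy is hypothetical, and it assumes the column order shown above (m:res, cs, ro, ds).

```shell
# Hypothetical helper: reads "/etc/init.d/drbd status" style output on stdin
# and succeeds only if at least one resource line is present and every
# resource line is Connected with UpToDate/UpToDate disks.
drbd_lines_healthy() {
    awk '
        # Resource lines start with the minor number, e.g.
        # "0:r0 Connected Secondary/Secondary UpToDate/UpToDate C"
        /^[0-9]+:/ {
            seen = 1
            if ($2 != "Connected" || $4 != "UpToDate/UpToDate") bad = 1
        }
        END { exit (seen && !bad) ? 0 : 1 }
    '
}
```

Usage would be something like: /etc/init.d/drbd status | drbd_lines_healthy && echo "all resources look fine"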

The DRBD kernel module and userland versions match, as I compiled drbd8-utils from source:

root@proxmox-node1:~# modinfo drbd | grep ^version
version: 8.4.3
root@proxmox-node1:~# drbdadm -V
DRBDADM_BUILDTAG=GIT-hash:\ 89a294209144b68adb3ee85a73221f964d3ee515\ build\ by\ imre@proxmox-node1\,\ 2015-04-03\ 00:52:36
DRBDADM_API_VERSION=1
DRBD_KERNEL_VERSION_CODE=0x080403
DRBDADM_VERSION_CODE=0x080403
DRBDADM_VERSION=8.4.3
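
To compare the two version codes above without reading hex by hand, a small helper can decode them. This is a sketch only: decode_version_code is a hypothetical name, and it assumes the code packs major, minor and patch level into one byte each, which is consistent with 0x080403 corresponding to 8.4.3 in the output above.

```shell
# Hypothetical helper: turn a version code such as 0x080403 into "8.4.3".
# Assumes one byte each for major, minor and patch level.
decode_version_code() {
    code=$(( $1 ))
    printf '%d.%d.%d\n' $(( (code >> 16) & 255 )) $(( (code >> 8) & 255 )) $(( code & 255 ))
}
```

Running decode_version_code on both DRBD_KERNEL_VERSION_CODE and DRBDADM_VERSION_CODE should then print the same string if kernel module and userland agree.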

When I start DRBD manually, i.e. with drbdadm up r0 and similar commands, there are no such messages and everything is clean. The problem seems to be in /etc/init.d/drbd at this line:

$DRBDADM wait-con-int # User interruptible version of wait-connect all
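
For reference, the manual start can be sketched as below. Only drbdadm up r0 comes from my actual test; the follow-up state queries and the function name manual_up are assumptions added for illustration, not a transcript.

```shell
# Assumption: DRBDADM can be overridden, as in the init script; defaults to drbdadm.
DRBDADM=${DRBDADM:-drbdadm}

# Hypothetical helper: bring one resource up by hand and print its state,
# without going through wait-con-int.
manual_up() {
    res=$1
    $DRBDADM up "$res" || return 1
    $DRBDADM cstate "$res"   # connection state, e.g. Connected
    $DRBDADM role "$res"     # roles, e.g. Secondary/Secondary
    $DRBDADM dstate "$res"   # disk states, e.g. UpToDate/UpToDate
}
```

Calling manual_up r0 right after the module is loaded should bring the resource up the manual way, so the result can be compared against the init script.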

I also read forum posts and saw that about half a year ago other people had noticed such 'netlink policy violation' messages and even claimed that DRBD and Open vSwitch are unusable together. I can confirm that the netlink policy violation does go away when openvswitch is stopped. I guess I could do without Open vSwitch and use a plain Linux bridge (brctl) instead, but I would rather use OVS because I could easily gather NetFlow statistics and other goodies.


I would be thankful for any comments on this problem. Best regards,

Imre

PS: My drbd.conf is very basic in this test:

global {
    usage-count no;
}

common {
    syncer { rate 40M; }
}

resource r0 {
    protocol C;
    handlers {
        pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
        pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
        local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
    }

    startup {
        degr-wfc-timeout 120; # 2 minutes.
        outdated-wfc-timeout 100;
        # become-primary-on both;
    }

    disk {
        on-io-error detach;
    }

    net {
        after-sb-0pri disconnect;
        after-sb-1pri disconnect;
        after-sb-2pri disconnect;
        rr-conflict disconnect;

        cram-hmac-alg "sha256";
        shared-secret "xyz123";

        # allow-two-primaries;
    }

    syncer {
        rate 40M;
        al-extents 257;
    }

    on proxmox-node1 {
        device /dev/drbd0;
        disk /dev/pve/r0;
        address 10.1.222.1:7789;
        meta-disk internal;
    }

    on proxmox-node2 {
        device /dev/drbd0;
        disk /dev/pve/r0;
        address 10.1.222.2:7789;
        meta-disk internal;
    }
}
 
Hi again!

I forgot to mention that Open vSwitch manages the end-user traffic (HTTP, SMTP, DNS etc.) and runs over eth0 and eth1. DRBD runs over eth2, and its traffic is not part of OVS (eth2 is probably just a plain interface configured with an IP address, or some kind of 'modprobe bonding' setup).


Imre