2.6.32-14 causing bonding and/or vlan issues

e100

I updated to the latest packages today; after rebooting, none of my VLAN bridges worked.

/etc/network/interfaces:
Code:
iface eth0 inet manual

iface eth1 inet manual

auto bond0
iface bond0 inet manual 
        pre-up modprobe bonding
        mtu 1500
        slaves eth0 eth1
        bond_primary eth0
        bond_miimon 100
        bond_downdelay 200
        bond_updelay 400
        bond_mode active-backup
        
auto vmbr0
iface vmbr0 inet static 
        address 192.168.X.X
        netmask 255.255.255.0
        gateway 192.168.X.X
        bridge_ports bond0
        bridge_stp off

iface bond0.4 inet manual
        vlan-raw-device bond0

auto vmbr4
iface vmbr4 inet manual
        bridge_ports bond0.4
        bridge_stp off
        bridge_fd 0

iface bond0.3 inet manual
        vlan-raw-device bond0

auto vmbr3
iface vmbr3 inet manual
        bridge_ports bond0.3
        bridge_stp off
        bridge_fd 0

Rebooting back into 2.6.32-13 makes everything work again.

Everything seems to come up OK on 2.6.32-14; the issue is that none of the VLAN-associated bridges pass any traffic.
OpenVZ containers and KVM machines that use vmbr3 or vmbr4 cannot send or receive any data on the network.
Maybe some sysctl default changed between versions?

I poked around for a bit but have not been able to find the cause yet. Any help would be appreciated.



EDIT:
If I remove the bond interface it works fine.
It appears that my bond setup above is not passing the VLAN tags on the new kernel, but it worked fine on every previous kernel, even in 1.9.
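
To make it easier to see which bridges depend on VLAN tags passing through the bond, here is a minimal Python sketch that maps each bridge in an interfaces file to its raw device and VLAN ID. The function name `bridge_vlans` is hypothetical, the sample text is a trimmed copy of the config above, and the sketch assumes one port per `bridge_ports` line and the `device.vlan` naming used in this post.

```python
def bridge_vlans(interfaces_text):
    """Map each bridge to (raw device, VLAN id or None) from its bridge_ports.

    Assumption: only the first port on a bridge_ports line is considered,
    and VLAN interfaces are named <device>.<vlan id> as in the post above.
    """
    result = {}
    current = None
    for raw_line in interfaces_text.splitlines():
        words = raw_line.split()
        if not words:
            continue
        if words[0] == "iface":
            # Remember which stanza the following options belong to.
            current = words[1]
        elif words[0] == "bridge_ports" and current is not None and len(words) > 1:
            device, dot, vlan = words[1].partition(".")
            result[current] = (device, int(vlan) if dot else None)
    return result


# Trimmed copy of the bridge stanzas from the config above.
config = """\
auto vmbr0
iface vmbr0 inet static
        bridge_ports bond0
auto vmbr4
iface vmbr4 inet manual
        bridge_ports bond0.4
auto vmbr3
iface vmbr3 inet manual
        bridge_ports bond0.3
"""

for bridge, (device, vlan) in bridge_vlans(config).items():
    tag = "untagged" if vlan is None else "VLAN %d" % vlan
    print(bridge, "->", device, tag)
```

With this config, vmbr3 and vmbr4 are the two bridges that rely on tagged traffic over bond0, which matches the bridges reported as broken.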
 
Hi,
I just saw your EDIT. My setup also works well because it doesn't use bonding...

Udo
 
This seems to be an ARP issue, not a VLAN issue.

I have a container, 192.168.10.103, that is trying to ping another container, 192.168.10.105, on a different Proxmox node.
I can see the ARP request going out on bond0 and eth0 (the currently active interface in the bond):
Code:
16:28:27.472068 f2:7d:13:54:26:b2 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 3, p 0, ethertype ARP, Request who-has 192.168.10.105 tell 192.168.10.103, length 28

On the Proxmox node that hosts 192.168.10.105, the container I am trying to ping, I can see the ARP request and the reply:
Code:
16:32:34.911367 f2:7d:13:54:26:b2 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 3, p 0, ethertype ARP, Request who-has 192.168.10.105 tell 192.168.10.103, length 46
16:32:34.911394 a2:8b:d6:7e:17:d7 > f2:7d:13:54:26:b2, ethertype 802.1Q (0x8100), length 46: vlan 3, p 0, ethertype ARP, Reply 192.168.10.105 is-at a2:8b:d6:7e:17:d7, length 28

The reply never shows up on the source node with the new kernel.
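
For longer captures, the same comparison can be automated. The following is a rough Python sketch, not a polished tool: it scans saved `tcpdump -e` output for ARP requests that never get a matching reply. The function name `unanswered_arp_requests` is hypothetical, and the sample lines are the ones captured above.

```python
import re

# Patterns for the ARP lines as tcpdump printed them in the captures above.
REQUEST = re.compile(r"Request who-has (\S+) tell (\S+),")
REPLY = re.compile(r"Reply (\S+) is-at (\S+),")


def unanswered_arp_requests(lines):
    """Return the IPs that were asked for but never answered in a capture."""
    asked = []
    answered = set()
    for line in lines:
        match = REQUEST.search(line)
        if match:
            if match.group(1) not in asked:
                asked.append(match.group(1))
            continue
        match = REPLY.search(line)
        if match:
            answered.add(match.group(1))
    return [ip for ip in asked if ip not in answered]


# Capture on the source node (new kernel): request only.
source_node = [
    "16:28:27.472068 f2:7d:13:54:26:b2 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 3, p 0, ethertype ARP, Request who-has 192.168.10.105 tell 192.168.10.103, length 28",
]

# Capture on the node hosting 192.168.10.105: request and reply.
target_node = [
    "16:32:34.911367 f2:7d:13:54:26:b2 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 3, p 0, ethertype ARP, Request who-has 192.168.10.105 tell 192.168.10.103, length 46",
    "16:32:34.911394 a2:8b:d6:7e:17:d7 > f2:7d:13:54:26:b2, ethertype 802.1Q (0x8100), length 46: vlan 3, p 0, ethertype ARP, Reply 192.168.10.105 is-at a2:8b:d6:7e:17:d7, length 28",
]

print(unanswered_arp_requests(source_node))  # ['192.168.10.105']
print(unanswered_arp_requests(target_node))  # []
```

An address that is unanswered on the source node but answered on the target node suggests the reply is lost somewhere on the way back, which is what the captures show here.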

Next I disabled the switch port that eth0 is connected to.
The bond failed over to eth1 as expected and everything started working.

So the issue is related to my eth0 network card, likely some change in the driver.
It was previously detected as:
RTL8168b/8111b

Now it is detected as:
RTL8168e/8111e

One reason not to trust Realtek.
 
I have the same issue, but with Broadcom cards. With this kernel I have to remove the bond interface to get it working.

At boot time the requested firmware is different.
With 2.6.32-13:

Code:
Aug 17 14:01:16 nhprox02 kernel: bnx2 0000:07:00.0: firmware: requesting bnx2/bnx2-mips-06-6.2.1.fw
Aug 17 14:01:16 nhprox02 kernel: bnx2 0000:07:00.0: firmware: requesting bnx2/bnx2-rv2p-06-6.0.15.fw


And with 2.6.32-14:
Code:
Aug 17 11:20:56 nhprox02 kernel: bnx2 0000:07:00.0: firmware: requesting bnx2/bnx2-mips-06-6.2.3.fw
Aug 17 11:20:56 nhprox02 kernel: bnx2 0000:07:00.0: firmware: requesting bnx2/bnx2-rv2p-06-6.0.15.fw
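
To spot this kind of difference without reading the logs by eye, here is a small Python sketch that compares the firmware files requested in two boot logs. The function name `requested_firmware` is hypothetical and the input is the four log lines quoted above. Note that a changed firmware request only shows that the driver changed between kernels; it does not prove that the firmware is the cause.

```python
import re

# Matches the "firmware: requesting <file>" part of the kernel log lines above.
FIRMWARE = re.compile(r"firmware: requesting (\S+)")


def requested_firmware(lines):
    """Collect the firmware file names requested in a set of log lines."""
    found = set()
    for line in lines:
        match = FIRMWARE.search(line)
        if match:
            found.add(match.group(1))
    return found


boot_2_6_32_13 = [
    "Aug 17 14:01:16 nhprox02 kernel: bnx2 0000:07:00.0: firmware: requesting bnx2/bnx2-mips-06-6.2.1.fw",
    "Aug 17 14:01:16 nhprox02 kernel: bnx2 0000:07:00.0: firmware: requesting bnx2/bnx2-rv2p-06-6.0.15.fw",
]
boot_2_6_32_14 = [
    "Aug 17 11:20:56 nhprox02 kernel: bnx2 0000:07:00.0: firmware: requesting bnx2/bnx2-mips-06-6.2.3.fw",
    "Aug 17 11:20:56 nhprox02 kernel: bnx2 0000:07:00.0: firmware: requesting bnx2/bnx2-rv2p-06-6.0.15.fw",
]

old = requested_firmware(boot_2_6_32_13)
new = requested_firmware(boot_2_6_32_14)
only_old = sorted(old - new)
only_new = sorted(new - old)
print("only requested by 2.6.32-13:", only_old)
print("only requested by 2.6.32-14:", only_new)
```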
 
Same here with a bnx2 card.
VLAN over bonding doesn't work.
 
Please redirect all thanks to Alexandre (spirit) - he is the hero today :)


Of course, thanks to spirit for his great job analysing this issue and finding the patch, and to e100 for pointing this issue in the right direction :)
 
This forum is fantastic, everyone contributes!

Can anybody please help me?

On Saturday I will be installing PVE (from its ISO installer) on 2 Dell PowerEdge 2950 and 2 Dell PowerEdge R710 servers, with 4 Broadcom NICs on each node.

I don't have access to https://bugzilla.redhat.com/show_bug.cgi?id=834764 , so I cannot tell which models are affected.

The NICs are (two of each model in each server):
Broadcom NetXtreme II BCM5708 Gigabit Ethernet
Broadcom NetXtreme BCM5721 Gigabit Ethernet PCI Express

I have the latest version of PVE on my personal computer:
pve-manager: 2.2-26 (pve-manager/2.2/c1614c8c)
running kernel: 2.6.32-16-pve
proxmox-ve-2.6.32: 2.2-80
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-16-pve: 2.6.32-80
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-1
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-28
qemu-server: 2.0-64
pve-firmware: 1.0-21
libpve-common-perl: 1.0-37
libpve-access-control: 1.0-25
libpve-storage-perl: 2.0-34
vncterm: 1.0-3
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.2-7
ksm-control-daemon: 1.1-1

So let me ask a few questions:
1- Will I have problems with NIC bonding?
2- If the answer is yes, what should I do to make it work?
3- If the advice is to download pve-kernel-2.6.32-14-pve_2.6.32-74_amd64.deb, does this kernel have bugs that would make my PVE unstable for use in a production environment?

I will be very grateful to anyone who can help me.

Best regards
Cesar
 
I'm running R710s and 2950s in production without any problems, with the latest kernel (2.6.32-16).

I'm using bonding, both active-backup and LACP.

More info on bonding here:
http://pve.proxmox.com/wiki/Network_Model
 
Ohhh spirit, that's good. Let me ask two questions to understand better:

1- I take it that this kernel (2.6.32-16) includes the new patch, right?
2- If so, my kernel "pve-kernel-2.6.32-16-pve: 2.6.32-80" is better than the kernel package "pve-kernel-2.6.32-14-pve_2.6.32-74_amd64.deb" that Dietmar mentioned above, and therefore the patch is included, right?

I want to thank you and everyone who shares their knowledge and experience, which greatly helps those of us who are less knowledgeable. :p

Best regards
Cesar
 
1, 2. Yes, the patch is officially included in the latest kernel, so no problem for bonding ;)
 
