bond0 works only in round-robin. BUG?

valshare

Hello,

I have created a bond0 device and set its mode to 802.3ad. The switch is configured to use LACP, and both ports are in the same LAG.

But when I check the status of bond0, the device only works in round-robin. What's going wrong?


Code:
proxmod02:~# cat /proc/net/bonding/bond0 
Ethernet Channel Bonding Driver: v3.2.3 (December 6, 2007)

Bonding Mode: load balancing (round-robin)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:22:19:d3:4c:35

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:22:19:d3:4c:37
proxmod02:~#

Regards, valle
 
Hi,

I have installed Proxmox on 3 servers for testing. All servers run in cluster mode and all have the same bond0 issue.

Code:
proxmod02:~# cat /etc/network/interfaces
# network interface settings
auto lo
iface lo inet loopback

iface eth0 inet manual

iface eth1 inet manual

iface eth2 inet manual

iface eth3 inet manual

auto bond0
iface bond0 inet manual
    slaves eth0 eth1
    bond_miimon 100
    bond_mode 802.3ad

auto vmbr0
iface vmbr0 inet static
    address  192.168.1.3
    netmask  255.255.255.0
    gateway  192.168.1.1
    bridge_ports bond0
    bridge_stp off
    bridge_fd 0
 
Code:
proxmod02:~# cat /sys/class/net/bond0/bonding/mode
balance-rr 0

And yes, I found the following:

Code:
proxmod02:~# dmesg | grep bonding
bonding: Warning: either miimon or arp_interval and arp_ip_target module parameters must be specified, otherwise bonding will not detect link failures! see bonding.txt for details.
bonding: bond0: Ignoring invalid mode value 802.3ad.
bonding: bond0: Setting MII monitoring interval to 100.
bonding: bond0: enslaving eth0 as an active interface with a down link.
bonding: bond0: enslaving eth1 as an active interface with a down link.
bonding: bond0: link status definitely up for interface eth1.
bonding: bond0: link status definitely up for interface eth0.
proxmod02:~#
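
To check the other nodes for the same rejection, something like the following might help. It is only a sketch: the function name rejected_bond_modes is made up, and it assumes the kernel message has exactly the wording shown in the dmesg output above.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log lines on stdin and prints
# "<bond> <rejected mode>" for every "Ignoring invalid mode value" message.
# Assumes the message format from the dmesg output above.
rejected_bond_modes() {
    sed -n 's/.*bonding: \([^:]*\): Ignoring invalid mode value \(.*\)\.$/\1 \2/p'
}

# Usage on a node:
#   dmesg | rejected_bond_modes
```

If it prints a line for bond0, the driver did not accept the mode and has most likely stayed in its default, balance-rr.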
 
OK, it's a kernel bug. I will try to compile a fix.

As a workaround, set 'bond_mode 4' (edit /etc/network/interfaces).
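
For reference, the workaround replaces the mode name with its numeric value. A minimal sketch of that mapping as a shell helper follows. The function name bond_mode_number is made up for illustration; the numbers are the mode values listed in the Linux bonding driver documentation.

```shell
#!/bin/sh
# Hypothetical helper: translate a bonding mode name into the numeric
# value that can be used for bond_mode in /etc/network/interfaces.
bond_mode_number() {
    case "$1" in
        balance-rr)    echo 0 ;;
        active-backup) echo 1 ;;
        balance-xor)   echo 2 ;;
        broadcast)     echo 3 ;;
        802.3ad)       echo 4 ;;
        balance-tlb)   echo 5 ;;
        balance-alb)   echo 6 ;;
        *)             echo "unknown bonding mode: $1" >&2; return 1 ;;
    esac
}

# Usage:
#   bond_mode_number 802.3ad
```

After changing the line to 'bond_mode 4' and restarting networking, /sys/class/net/bond0/bonding/mode should read "802.3ad 4" instead of "balance-rr 0".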
 
Please can you test the new kernel:

Code:
wget ftp://pve.proxmox.com/debian/dists/lenny/pvetest/binary-amd64/pve-kernel_2.6.24-9_amd64.deb
wget ftp://pve.proxmox.com/debian/dists/lenny/pvetest/binary-amd64/pve-kernel-2.6.24-7-pve_2.6.24-9_amd64.deb
dpkg -i pve-kernel_2.6.24-9_amd64.deb pve-kernel-2.6.24-7-pve_2.6.24-9_amd64.deb
 
Now, next problem.

Port trunking under Proxmox/Debian works now. The ports are all grouped into one "big" port. But if I group the ports on my Dell PowerConnect switch and enable 802.3ad, the server isn't reachable on the network. Only if I delete vmbr0 and give the bond0 interface an IP is the server reachable again, and then 802.3ad works.

Can anyone help, please? Why can't I get Proxmox and vmbr0 to work when I put bond0 into the vmbr0 bridge?

Regards, valle
 
Hi,

I have the same problem with my Sunfire 4150 servers and HP ProCurve 4208 switches. The switch reports that the trunks are active, but I lost the network connection in my guests. I gave up a while ago, as LACP trunking would have been nice to have but is not essential: I have a few ports, so I could assign a dedicated bridge to a busy VM. I also struggled with ALB and TLB aggregation. When I googled the problem I found a lot of issues related to MAC addresses being passed through from the bridge to the bond. I concluded from this that it was not a Proxmox-specific issue, as a lot of the forums referred to Xen virtualisation.

If anyone has got a fix it would be really handy, as I could assign all my NICs to one big fat pipe and only use one bridge for my VMs. Cluster/backup operations would also be faster.

Regardless, this is a minor quibble for me and the rest of Proxmox VE is fantastic.

JS
 
Hello jsquire,

Same here. The trunk reports that it is active and up, but the VMs and PVE are not reachable. It only works if I give the bond0 device an IP, but then I can't give the VMs an IP.

Code:
      1      2147482910     11~Jun~2009 01:08:34     Info     %TRUNK-I-PORTADDED: Port g16 added to ch3            
      2      2147482911     11~Jun~2009 01:08:34     Info     %TRUNK-I-PORTADDED: Port g15 added to ch3            
      3      2147482912     11~Jun~2009 01:08:34     Info     %LINK-I-Up:  ch3            
      4      2147482913     11~Jun~2009 01:08:34     Info     %TRUNK-I-PORTADDED: Port g14 added to ch3            
      5      2147482914     11~Jun~2009 01:08:34     Info     %TRUNK-I-PORTADDED: Port g13 added to ch3            
      6      2147482915     11~Jun~2009 01:08:22     Info     %LINK-I-Up:  g16            
      7      2147482916     11~Jun~2009 01:08:22     Info     %LINK-I-Up:  g15            
      8      2147482917     11~Jun~2009 01:08:22     Info     %LINK-I-Up:  g14            
      9      2147482918     11~Jun~2009 01:08:21     Info     %LINK-I-Up:  g13
 
Hi,

It may be possible to link the aggregated bond to a VM via the command line. I don't really want to go down that route, as my work colleagues would struggle to follow what I've done if I'm not using the web GUI.

I'm sure others may have some other suggestions as this forum is a pretty good source of information.


JS
 
Hi,

Thanks for the reply. What I don't understand from the technical side: why does it work with the bond0 interface but not with a bridged interface?

I think link aggregation isn't a stable solution. In my home environment I have a Netgear switch that supports 802.3ad, and my Thecus 5200pro NAS supports 802.3ad, too. If I switch both to 802.3ad it works, and most clients work, too. But if I try to reach the NAS from my set-top box, a Dreambox 7025, I can't access the NAS. Strange...

Regards, Valle
 
Hi,

I have also had some very strange results using port aggregation. Even though it is a ratified standard, the best results I have had were with equipment from the same manufacturer, e.g. trunking on my work's HP switches.

I don't think it is specifically a Proxmox problem, more a Linux problem.


JS
 
Hi,

Thanks for the reply. I use hardware from the same manufacturer. Both are Dell hardware. The bond0 interfaces are working, but if I use vmbr0, the link aggregation doesn't work anymore.

Regards, Valle
 
I've the same problem here in v.3

My server is a Dell PowerEdge SC1435 and my switch is a Cisco 2960G.

I've tried to set bond_mode 4 but now the switch has blocked the network ports of this server...

Your way to set the bonding settings is strange because it won't work on a Debian lenny installation...
I always used this http://www.debianhelp.co.uk/bonding.htm and it worked perfectly.
 
Your way to set the bonding settings is strange because it won't work on a Debian lenny installation...

What exactly is 'strange'? I use the standard lenny tool 'ifenslave', so why do you think it does not work on standard lenny?
 
What exactly is 'strange'? I use the standard lenny tool 'ifenslave', so why do you think it does not work on standard lenny?

I mean, I've tried this way (which is much more logical than the way I found) on a netinstall Debian lenny (with ifenslave-2.6), but the switch doesn't see LACP and the network is blocked on the server.

But my problem is not the way to do it, I just want it to work...
For now I've set active-passive, but my VMs randomly lose their network connection for 1-2 minutes and I don't understand why.

Can you help me with what to do?
Many thanks for your help!