[SOLVED] Yet another bridge issue

dshawth

New Member
Sep 16, 2020
I am in the process of troubleshooting a single node in my cluster before expanding the configuration.

Each node has 4 NICs:
- enp5s0 is the primary NIC where the webui is attached (working properly)
- enp10s0 is an XG card for a separate SAN (working properly)
- enp8s0 and enp9s0 are a pair that I eventually want bonded and used for VM traffic, leaving enp5s0 for cluster sync and the webui.

My current config is below. For now, enp9s0 is unused.

Code:
auto lo
iface lo inet loopback

iface enp5s0 inet manual

iface enp8s0 inet manual

iface enp9s0 inet manual

auto enp10s0
iface enp10s0 inet static
        address 172.16.4.1/29
        mtu 9000

auto vmbr0
iface vmbr0 inet static
        address 172.16.0.11/22
        bridge-ports enp5s0
        bridge-stp on
        bridge-fd 0

auto vmbr1
iface vmbr1 inet static
        address 172.16.1.1/22
        gateway 172.16.0.1
        bridge-ports enp8s0
        bridge-stp on
        bridge-fd 0
When I connect a VM to vmbr0, DHCP works.
When I connect the same VM to vmbr1, DHCP does not work.

I can ping both interfaces from external hosts.

I have tried with and without bridge-stp; since both bridges are on the same segment, I believe enabling bridge-stp is the safer option.
I have tried the gateway on both bridges; moving it has no effect.

Thanks in advance for pointing out my silly mistakes!
 

dshawth

New Member
Sep 16, 2020
Small update:

I can ping from vmbr0 like ping -I vmbr0 172.16.0.1 but not from vmbr1 like ping -I vmbr1 172.16.0.1.

Started looking into routing issues on the Proxmox node, ip route output is below:

Code:
default via 172.16.0.1 dev vmbr0 onlink
172.16.0.0/22 dev vmbr0 proto kernel scope link src 172.16.0.11
172.16.0.0/22 dev vmbr1 proto kernel scope link src 172.16.1.11
172.16.4.0/29 dev enp10s0 proto kernel scope link src 172.16.4.1
 

wigor

New Member
Dec 5, 2019
I think you don't want/need an IP on the second bridge, vmbr1. And if you do want an IP, it should probably not be in the same subnet.
 

dshawth

New Member
Sep 16, 2020
When attempting ping -I vmbr1 172.16.0.1, arp seems to be the problem.

Code:
tcpdump -ennqti vmbr1 \( arp or icmp \)
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vmbr1, link-type EN10MB (Ethernet), capture size 262144 bytes
68:1c:a2:13:2c:85 > ff:ff:ff:ff:ff:ff, ARP, length 42: Request who-has 172.16.0.1 tell 172.16.1.11, length 28
68:1c:a2:13:2c:85 > ff:ff:ff:ff:ff:ff, ARP, length 42: Request who-has 172.16.0.1 tell 172.16.1.11, length 28
68:1c:a2:13:2c:85 > ff:ff:ff:ff:ff:ff, ARP, length 42: Request who-has 172.16.0.1 tell 172.16.1.11, length 28
No responses received.
 

wigor

New Member
Dec 5, 2019
When it comes to ARP "problems" with multiple NICs on Linux, search for:

net.ipv4.conf.all.arp_ignore
net.ipv4.conf.all.arp_announce
net.ipv4.conf.all.arp_filter

I don't know whether this is still an issue with modern kernels, but I think it is.
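
For anyone landing here later: these can be made persistent in a sysctl drop-in. A sketch only; the file name below is arbitrary, and the right values depend on your topology:

```
# /etc/sysctl.d/arp.conf (hypothetical file name)
# Reply to ARP only if the target IP is configured on the receiving interface:
net.ipv4.conf.all.arp_ignore = 1
# Always use the outgoing interface's own address as the ARP source:
net.ipv4.conf.all.arp_announce = 2
# Answer ARP only when the route back to the sender uses that same interface:
net.ipv4.conf.all.arp_filter = 1
```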
 

dshawth

New Member
Sep 16, 2020
@wigor I actually do not want an address on that interface, but simply removing it does not help. It might be a routing issue, so I am trying a few things on that front. The downside is that I now have fewer ways to troubleshoot from the Proxmox node itself (I cannot ping from an interface that does not have an IP).
 

dshawth

New Member
Sep 16, 2020
@wigor I have those set currently in /etc/sysctl.d/local.conf as follows:

Code:
net.ipv4.conf.all.arp_ignore=1
net.ipv4.conf.all.arp_announce=2
 

wigor

New Member
Dec 5, 2019
As I understand it, you want the VMs to use the bridge, so you should ping from the VMs.
Proxmox itself doesn't "use" the bridge, in my opinion.
Regarding the sysctl settings: you will have to investigate yourself; it was only a hint. I do not know the right values for your situation.
 

dshawth

New Member
Sep 16, 2020
After another day of no success, I made a point-to-point connection from enp8s0 to the same port on another server. Since that did not work either, while the same setup is already working for the SAN on the Intel XG NIC, I started to suspect the NIC itself.

I plugged in a USB NIC and found it to immediately work.

Investigating further: I am using a dual-port NIC with the Realtek 8168 chipset in each server, and Proxmox is applying the r8169 driver instead of the r8168 driver.

I chose these cards because I had success with them in the past, but now that I have 8 of them, of course there is an issue.

More to follow as the saga continues.
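
In case it helps anyone with the same cards, the driver binding can be checked and the out-of-tree module tried instead. A sketch, assuming a Debian-based Proxmox host with the non-free repository enabled:

```shell
# Show which kernel driver is bound to each Ethernet device
lspci -nnk | grep -A3 -i ethernet

# The in-kernel r8169 driver claims most Realtek chips, including the 8168.
# The out-of-tree r8168-dkms package can be tried in its place:
apt install r8168-dkms
# Keep r8169 from grabbing the card at boot:
echo "blacklist r8169" > /etc/modprobe.d/blacklist-r8169.conf
update-initramfs -u   # then reboot
```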
 

jtracy

Member
Aug 30, 2018
Why are you using 2 different VMBR interfaces if they are on the same broadcast domain?

You mentioned that you want to bond them, but your configuration does not have any bonding set up.

If all you want is to have both interfaces (enp5s0 and enp8s0) on the same network, then set up vmbr0 like the following and remove vmbr1:

Code:
auto vmbr0
iface vmbr0 inet static
        address 172.16.0.11/22        
        gateway 172.16.0.1
        bridge-ports enp5s0  enp8s0
        bridge-stp on
        bridge-fd 0
 

spirit

Famous Member
Apr 2, 2010
jtracy said:
Why are you using 2 different VMBR interfaces if they are on the same broadcast domain?

You mentioned that you want to bond them, but your configuration does not have any bonding set up.

If all you want is to have both interfaces (enp5s0 and enp8s0) on the same network, then set up vmbr0 like the following and remove vmbr1:

Code:
auto vmbr0
iface vmbr0 inet static
        address 172.16.0.11/22       
        gateway 172.16.0.1
        bridge-ports enp5s0  enp8s0
        bridge-stp on
        bridge-fd 0
You'll have a loop in your network in this case. You need to use a bond here.
 

spirit

Famous Member
Apr 2, 2010
dshawth said:
Small update:

I can ping from vmbr0 like ping -I vmbr0 172.16.0.1 but not from vmbr1 like ping -I vmbr1 172.16.0.1.
This is normal, as 172.16.0.1 is in the vmbr0 network.

You can't ping it directly from vmbr1 (or maybe you could with forwarding/routing enabled, in which case vmbr1 would be routed internally through vmbr0).
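
For completeness: if routing between the two bridges on the node were actually wanted, forwarding would need to be enabled, e.g. via a sysctl drop-in (file name arbitrary; whether you want the node routing at all is a design decision):

```
# /etc/sysctl.d/forward.conf (hypothetical file name)
net.ipv4.ip_forward = 1
```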
 

jtracy

Member
Aug 30, 2018
spirit said:
You'll have a loop in your network in this case. You need to use a bond here.
For some reason I was picturing the ports as connected to different machines.

But I still don't understand why he would want two bridges on the same subnet.



@dshawth Have you set up the switch that the systems are connected to to enable link aggregation on the ports for these systems?

You are also going to have to set up a bond interface on the Proxmox host that the bridge will use as its interface.

Here is an example of an LACP bond with your interfaces.
Code:
auto bond0
iface bond0 inet manual
    bond-slaves enp5s0  enp8s0
    bond-miimon 100
    bond-mode 802.3ad
    bond-xmit-hash-policy layer2+3
    bond-min-links 1
Then your vmbr0 will look like this. Most configurations I have seen for virtual environments turn off STP (I think because it takes so long for the port to enter the forwarding state). Not sure whether STP is a requirement on your side or not.
Code:
auto vmbr0
iface vmbr0 inet static
        address 172.16.0.11/22        
        gateway 172.16.0.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
 

vikozo

Active Member
May 4, 2014
jtracy said:
But I still don't understand why he would want two bridges on the same subnet.
You would use the bond in the same subnet but on two different switches.
The switch ports will have to be configured as bond ports too.
One switch is connected normally, the other with UPC, and you are safe this way.
 

jtracy

Member
Aug 30, 2018
vikozo said:
You would use the bond in the same subnet but on two different switches.
The switch ports will have to be configured as bond ports too.
One switch is connected normally, the other with UPC, and you are safe this way.
I think you said the same thing I did, but it still doesn't address the need for two bridges. You can't have the same bond as an interface in multiple bridges unless you are using a VLAN on the bond, which he is not doing.
 

dshawth

New Member
Sep 16, 2020
Please consider this solved. The physical cards were the issue. While they work on Proxmox for something like a packet capture, they do not work for bonds or a second vmbr. The config below worked just fine with Intel cards:

Code:
auto lo
iface lo inet loopback

auto enp5s0
iface enp5s0 inet static
        address 172.16.0.11/23
        gateway 172.16.0.1

iface enp6s0f0 inet manual

iface enp6s0f1 inet manual

auto enp7s0
iface enp7s0 inet static
        address 172.16.4.1/29
        mtu 9000

auto bond0
iface bond0 inet manual
        bond-slaves enp6s0f0 enp6s0f1
        bond-miimon 100
        bond-mode 802.3ad

auto vmbr0
iface vmbr0 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
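
For reference, with ifupdown2 installed (check whether your Proxmox version ships it), a change like this can be applied and verified without a reboot; the commands are a sketch:

```shell
# Apply /etc/network/interfaces changes live (requires ifupdown2)
ifreload -a
# Verify the bond negotiated LACP with the switch: look for
# "Bonding Mode: IEEE 802.3ad Dynamic link aggregation" and, per slave,
# "MII Status: up"
cat /proc/net/bonding/bond0
```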
 
