Second vmbr and NIC defeats the first in terms of WebUI, SSH

MarioG

I have a host with 4 NICs. The first NIC was assigned as nic0 and attached to vmbr0, with a CIDR of 192.168.50.40/24. I can access the Web UI and SSH into the host fine.
I then configured nic1 with vmbrWS with a CIDR of 10.10.10.40/24 and no default gateway. I want the host to access an NFS share on the 10.10.10.x network.
nic0 is physically connected to the 192.168.50.x switch, and nic1 is physically connected to the 10.10.10.x switch.
Once I apply the settings, I lose the webUI and SSH access to 192.168.50.40, but can then access them through 10.10.10.40.
But I want my webUI and cluster qdevice on the 192.168.50.x network (which is for servers).
I read that there were ARP settings for a multi-homed host and tried changing some of them, without luck.

So, how can I have the web UI, SSH, and cluster devices all on the 192.168.50.x network (vmbr0 and nic0), but still have the Proxmox host access an NFS share on the 10.10.10.x network (vmbrWS, nic1, CIDR 10.10.10.43/24 with no default gateway)?

Note: When I had the Linux bridges like vmbrWS set up with no CIDR, it worked great for VMs. I could have a VM with 2 network cards, each tied to a different network, and the VM could communicate with both networks. It's just the Proxmox host talking to more than one network that gives me a problem.

Thanks in advance.
 
Why do you need to give the Proxmox host an IP address on more than one virtual bridge? Can you show the actual /etc/network/interfaces (in CODE-tags) that you want and that is giving you problems? EDIT: Are you adjusting the /etc/hosts file as well for the multiple IP addresses of the Proxmox host?
 
Thank you for your response.

I want a second IP address because the second virtual bridge is connected to a second physical NIC and network:
#1 I want the webui and ssh access on 192.168.50.43 (a physical switch and network card)
#2 I want to map an NFS share from the ProxMox host that is hosted on the 10.10.10.x network (also a physical switch and NIC), so I can do backups (simple ones, not PBS yet) to that device.

For background, I am trying to phase out VMware, where I have vmkernel devices on both networks, and the ESXi host can access both networks (192.168.50.x and 10.10.10.x), as can VMs mapped to virtual switches linked to physical NICs. VMware is set up to accomplish #1 and #2.

This is the /etc/network/interfaces file that works fine for #1 (nic1 and vmbrWS aren't used by VMs yet), but provides no NFS access (breaks #2):

Code:
auto lo
iface lo inet loopback

iface nic0 inet manual

iface nic1 inet manual

iface nic2 inet manual

iface nic3 inet manual

iface nic4 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.50.43/24
        gateway 192.168.50.1
        bridge-ports nic0
        bridge-stp off
        bridge-fd 0

auto vmbrWS
iface vmbrWS inet manual
        bridge-ports nic1
        bridge-stp off
        bridge-fd 0

source /etc/network/interfaces.d/*

When I add an IP address to vmbrWS, I can no longer access the web ui or SSH on the 192.168.50.43 address (breaking #1). I can however see the NFS server on 10.10.10.x. I can access the webUI and SSH from 10.10.10.x machines, but I don't want that. The /etc/network/interfaces file looks like this:

Code:
auto lo
iface lo inet loopback

iface nic0 inet manual

iface nic1 inet manual

iface nic2 inet manual

iface nic3 inet manual

iface nic4 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.50.43/24
        gateway 192.168.50.1
        bridge-ports nic0
        bridge-stp off
        bridge-fd 0

auto vmbrWS
iface vmbrWS inet static
        address 10.10.10.43/24
        bridge-ports nic1
        bridge-stp off
        bridge-fd 0

source /etc/network/interfaces.d/*

I understand that the second virtual switch network should not have a gateway, and it doesn't. I've read that ARP routing needs to be corrected in a scenario like this, but the settings I found don't seem to correct the issue.
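
For reference, the ARP settings usually cited for multi-homed hosts look roughly like this; this is only a sketch (the file path is made up), and they target ARP behaviour on shared segments, so they may not address this particular symptom:

Code:
# /etc/sysctl.d/90-multihome.conf  (hypothetical path)
# reply to ARP only when the target address lives on the receiving interface
net.ipv4.conf.all.arp_ignore = 1
# when ARPing, prefer a source address belonging to the outgoing interface
net.ipv4.conf.all.arp_announce = 2
# optionally, make each NIC answer ARP independently
net.ipv4.conf.all.arp_filter = 1

# apply without rebooting:
# sysctl --system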

I am on Proxmox 9.0.11 with a new install. I want to make sure all the networking is correct before I add my second node and disaster recovery host machines, which will all need a similar networking setup (all have 4 NICs and connections to physical switches). I will also be adding a qdevice that I want on that same 192.168.50.x network, but obviously it won't need 10.10.10.x.

Here is an example of adding NFS storage and seeing the NFS server on 10.10.10.x when I add the 10.10.10.43 IP to the second virtual bridge, vmbrWS. Without it, the Export list is not populated.
[Screenshot: Datacenter -> Storage -> Add: NFS dialog, with the Export list populated by the NFS server on 10.10.10.x]
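
For what it's worth, the CLI equivalent of that dialog would be roughly the following; the storage ID and export path are made-up placeholders, and the NAS address 10.10.10.24 comes from later in the thread:

Code:
# ask the NAS what it exports (this is what the GUI does to fill the Export dropdown)
pvesm scan nfs 10.10.10.24

# add it as backup storage; "synology-backup" and the export path are examples only
pvesm add nfs synology-backup --server 10.10.10.24 --export /volume1/backups --content backup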
So, I want my Proxmox host to be able to access 2 networks. I have no problem doing this from VMs that are linked to each virtual switch (and have an appropriate IP address).

Maybe I should ask a simpler question: with Proxmox 9.0.11 and 2 network cards on 2 physical networks, can I give the Proxmox host access to both networks without an external router/firewall? Any advice is appreciated.
 
Just to add a comment: I did not change the hosts file. I am not referring to anything by name yet, only IP address. Once I get the IP addresses working, I will add certificates as needed and define host names (for things like each ProxMox host and the NFS server). Currently, the hosts file lists 192.168.50.43 for the host name (fqdn and the short version).
 
I am not using such a config variant, so I can only theorize that it looks OK.
But I am using VLANs everywhere and never assign an IP to a bridge; I use a subinterface every time.

Anyway, PVE can access multiple networks without a firewall/router.
For NFS access you don't even need a bridge, if the physical card is only for one subnet.
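
A minimal sketch of that bridge-less variant (assuming nic1 carries only host traffic, no VMs, and no gateway on that network):

Code:
auto nic1
iface nic1 inet static
        address 10.10.10.43/24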
 
Thank you for your response czechsys. The physical card, nic1, is only for the 10.10.10.x network, physically connected to a switch dedicated to 10.10.10.x.

I also tried just assigning the IP address to nic1 and no bridge. While this restored the web UI and SSH on 192.168.50.43, I still couldn't get a list of the NFS shares (like I could in the attached image), and I couldn't create VMs that connect to that network.

I agree that ProxMox can access multiple networks without the fw/router, as I was able to create VMs where the VMs could access both vmbr0 and vmbrWS as long as they have 2 NICs in their hardware config. The problem starts when I try Datacenter - Storage - Add NFS for a Synology NAS at 10.10.10.24. Without the IP address on the vmbrWS bridge, the VMs could access the same NFS device that the ProxMox host can't.

I'll say that while I'm familiar with VLANs on switches, I have never used one via Proxmox, and my first test went badly: I removed the vmbrWS bridge and added a Linux VLAN named vlanWS, linked it (vlan-raw-device) to nic1, gave it address 10.10.10.43/24 and a vlan-id of 1. When I applied the networking changes, I lost Web UI and SSH access via both networks (192.168.50.x and 10.10.10.x). I had to resort to a remote IP KVM device to regain control of the host and undo the changes.
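
For comparison, a Linux VLAN stanza would normally look like the sketch below. The VLAN tag 10 is purely an assumption; this only works if the switch port is a trunk carrying that tag, and VLAN 1 is usually the untagged/native VLAN, which may be why tagging it cut off access entirely:

Code:
auto vlanWS
iface vlanWS inet static
        address 10.10.10.43/24
        vlan-id 10
        vlan-raw-device nic1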

So I still have a problem where I can't achieve my goals:
1. Connect to 192.168.50.43 for Web UI, ssh (vmbr0 and nic0 w/connection to physical 192.168.50.x switch)
2. Allow VMs to also access that same network, via vmbr0.
3. Allow VMs to access the 10.10.10.x network (vmbrWS and nic1 w/connection to physical 10.10.10.x switch)
4. Allow the ProxMox host to connect to an NFS share on a Synology located at 10.10.10.24 (I had thought with a vmbrWS IP).

I can get any of the above working fine, except I can't get all 4 working correctly, and #4 is where things break.

I would even be willing to dedicate 2 NICs to the 10.10.10.x network, one for VMs and one for the host, but I assume that as soon as I gave that new bridge an IP address, I would be back in the same boat. I only need 3 actual NICs for 3 networks, and I have 5 NICs.

Again, any input is appreciated.
 
@MarioG is the machine you are using to access Proxmox on a different subnet? I feel I'm having a very similar issue to yours. I can access the GUI if my machine is on the same subnet, but as soon as I move to the other I lose everything: GUI, ping, etc... But as soon as I remove that second vmbr/NIC I can reach it all from that different subnet??? No firewall rules in place on Proxmox or on my new network (allow all) to cause issues. I can see other non-Proxmox machines fine between the two subnets. I also have another Proxmox machine with the same issue!
 
Thank you, @budney76. You may be on to something.

Your post made me discover I was wrong: The 192.168.50.43 address didn't become inaccessible to all, it only became inaccessible from the 10.10.10.x network.

First, I can connect to the 192.168.50.43 address (vmbr0/nic0) from any device on the 192.168.50.x network, regardless of the vmbrWS/nic1 configuration. I can also connect to it from devices on the 10.10.10.x network, because the physical Watchguard firewall routes between the networks.

Here is an example tracert from a 10.10.10.x device:
Code:
C:\Users\Mario>tracert 192.168.50.43

Tracing route to 192.168.50.43 over a maximum of 30 hops

  1    <1 ms    <1 ms    <1 ms  10.10.10.1
  2    <1 ms    <1 ms    <1 ms  192.168.50.43

10.10.10.1 is my physical Watchguard firewall. And from a 10.10.10.x device, I can ping the IP address:
Code:
C:\Users\Mario>ping 192.168.50.43

Pinging 192.168.50.43 with 32 bytes of data:
Reply from 192.168.50.43: bytes=32 time<1ms TTL=64
Reply from 192.168.50.43: bytes=32 time<1ms TTL=64

And without an IP/CIDR on vmbrWS, I get this from the 10.10.10.x device:
Code:
C:\Users\Mario>curl https://pvedr1.hiddendomain.com:8006
<!DOCTYPE html>
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1, user-scalable=no">
    <title>pvedr1 - Proxmox Virtual Environment</title>

But when I add the IP/CIDR to vmbrWS, I can't curl it anymore:
Code:
C:\Users\Mario>curl https://pvedr1.hiddendomain.com:8006
curl: (35) Recv failure: Connection was reset

So, the only difference between the 2 curl commands above is that the first was run without the IP/CIDR on vmbrWS, and the second a minute later, after adding the IP/CIDR to vmbrWS.

But an ESXi host on the 192.168.50.x network can be accessed:
Code:
C:\Users\Mario>curl -k https://esx7b.hiddendomain.com
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

<html lang="en">
<head>
    <meta http-equiv="content-type" content="text/html; charset=utf8">
    <meta http-equiv="refresh" content="0;URL='/ui'"/>
</head>
</html>

To summarize my misunderstanding: I thought that with IP/CIDR on vmbrWS that the WebUI on 192.168.50.43 became inaccessible to all when it was actually only inaccessible to devices on 10.10.10.x.
 
Very interesting! Looks like the same issue to me, ping and tracert look similar too. Hopefully someone will chime in who might know a fix, this is over my head.
 
I may have a fix, at least for me. I found this in my Watchguard firewall logs:
2025-10-25 04:17:28 Deny 10.10.10.123 192.168.50.43 8006/tcp 54333 8006 Trusted Firebox tcp invalid connection state 40 128 (Internal Policy) proc_id="firewall" rc="101" msg_id="3000-0148" tcp_info="offset 5 A 4171839985 win 65280"

Note this didn't appear when the IP/CIDR was not on vmbrWS. So I believe this has to do with how the reply traffic gets routed:

With an IP/CIDR of 10.10.10.43/24 on vmbrWS, the Proxmox Linux kernel saw that 10.10.10.123 was directly reachable through vmbrWS, so it sent its replies out vmbrWS, even though the original packet came in through the 192.168.50.x network. The Watchguard firewall doesn't like seeing this asymmetric traffic and denied the response, breaking my WebUI and SSH access. In other words, it's not that I couldn't reach 192.168.50.43; it's that its response was dropped by the firewall.
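
You can see the kernel make that choice on the host itself (diagnostic only; exact output varies by kernel):

Code:
root@pvedr1:~# ip route get 10.10.10.123
10.10.10.123 dev vmbrWS src 10.10.10.43 uid 0
    cache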

So, I changed the CIDR for vmbrWS to a smaller subnet (a longer netmask), which excludes my 10.10.10.123 workstation but still allows access to the lower 10.10.10.x IPs, like some servers and the NAS.

So with an IP/CIDR of 10.10.10.43/26, I finally have everything working. Note that with /26 instead of /24, the accessible range becomes 10.10.10.0 - 10.10.10.63.

I have read about ways to stop the kernel from sending replies out a shorter path instead of back the way the request came in, things like arp_ignore, arp_filter, and setting up custom routes, but I honestly couldn't get those to work, and I've already spent a day on this issue.
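
In case it helps someone else, the custom-route approach would presumably be source-based (policy) routing along these lines. This is an untested sketch, table 50 is an arbitrary number, and the idea is simply that anything answered from 192.168.50.43 always leaves via vmbr0 and its gateway instead of taking the directly connected vmbrWS route:

Code:
auto vmbr0
iface vmbr0 inet static
        address 192.168.50.43/24
        gateway 192.168.50.1
        bridge-ports nic0
        bridge-stp off
        bridge-fd 0
        # routing table 50: how to reach things when replying from 192.168.50.43
        post-up   ip route add 192.168.50.0/24 dev vmbr0 src 192.168.50.43 table 50
        post-up   ip route add default via 192.168.50.1 dev vmbr0 table 50
        # any packet sourced from 192.168.50.43 consults table 50 first
        post-up   ip rule add from 192.168.50.43 table 50
        post-down ip rule del from 192.168.50.43 table 50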

Since ping and tracert use the stateless ICMP protocol, which doesn't use sessions the way HTTP and SSH do, they weren't subject to the firewall's packet rejection.
 
@budney76, Just curious, what is the IP/CIDR you're assigning to the additional Linux Bridge in ProxMox, and what is the IP address of the device that can no longer see the ProxMox host on the original network?

Note that even though I allowed 'any' traffic between networks, it appears that my firewall was blocking the return traffic on a more global 'SYN' setting outside the policies I had defined. Can you check firewall logs to see if that's the actual cause of the rejection?
 
The subnet in question is 10.2.4.0/24 this is where the additional bridge is being added, it's also where my computer is. The main Proxmox bridge is on subnet 10.4.4.0/24. I didn't find anything being blocked in the logs.
 
Sorry if this is all redundant with what I've already said, or what you already know. Just taking a shot in the dark.

I found that changing my IP/CIDR on the Linux Bridge from 10.10.10.43/24 to 10.10.10.43/26 fixed my issue. Since my workstation was 10.10.10.123, with the original /24 subnet on the bridge I think Proxmox was routing the reply traffic through the wrong NIC. With the /26 subnet on the Linux Bridge, it no longer considers that IP directly reachable, and returns traffic through the NIC the request arrived on (vmbr0, as it should).

If my workstation were in the lower range (10.10.10.0 to 10.10.10.63), then the /26 subnet mask would not have worked. So, to clarify, my understanding of my workaround (not even a fix) is this:
The IP/CIDR on your second Linux bridge has to be one that excludes the workstation IP that you're trying to access from.
My vmbr0 IP/CIDR was 192.168.50.43/24 (/24 spans 192.168.50.0 to 192.168.50.255)
My vmbrWS IP/CIDR is 10.10.10.43/26 (/26 is smaller, spanning only 10.10.10.0 to 10.10.10.63; there is a quick range check below this list)
Since my workstation is 10.10.10.123, I fall outside the direct-access that vmbrWS has, so packets that came in on vmbr0 go out vmbr0 as intended, and not vmbrWS.
The subnet mask is very specific to my setup, where my workstations get DHCP above 10.10.10.100 and servers and devices with static IPs are below 10.10.10.100.
But if I change my IP/CIDR on vmbrWS back to the /24 (from the now /26), the access breaks again.
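
To make the /26 boundary concrete, here is one way to check it (ipcalc is an assumption on my part; it's a small optional Debian package, and the output is trimmed):

Code:
root@pvedr1:~# ipcalc 10.10.10.43/26
Address:   10.10.10.43
Network:   10.10.10.0/26
HostMin:   10.10.10.1
HostMax:   10.10.10.62
Broadcast: 10.10.10.63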

I apologize if this is all just redundant ranting, and I hope it is somewhat helpful. I think this shortcut routing of response packets out a different NIC than the one the request packets came in on is the key issue.