PVE 8.4, VLAN and trunk bond

Saskha

New Member
Apr 9, 2025
Hello Team,

I'm new to Proxmox, so sorry for yet another interface bonding thread.
I have a Grandstream GWN7813P L2/L3 switch that supports static and LACP LAG.

I have a server with one HIC (Host Interface Card) with 4 x 1Gb ports. I need to bond these 4 interfaces into a logical 4Gb link with LACP.
My config:

auto lo
iface lo inet loopback

auto ens4f0
iface ens4f0 inet manual

iface enx3a68dd6e1e87 inet manual

auto ens4f1
iface ens4f1 inet manual

auto ens4f2
iface ens4f2 inet manual

auto ens4f3
iface ens4f3 inet manual

auto bond0
iface bond0 inet manual
     bond-slaves ens4f0 ens4f1 ens4f2 ens4f3
     bond-miimon 100
     bond-mode 802.3ad
     bond-xmit-hash-policy layer2+3

iface bond0.136 inet manual

auto vmbr0
iface vmbr0 inet manual
     bridge-ports bond0
     bridge-stp off
     bridge-fd 0
     bridge-vlan-aware yes
     bridge-vids 2-4094

auto vmbr0v136
iface vmbr0v136 inet static
     address 10.1.1.4/24
     gateway 10.1.1.1
     bridge_ports bond0.136
     bridge_stp off

source /etc/network/interfaces.d/*
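
For comparison, the vlan-aware bridge examples in the PVE documentation put the host's VLAN 136 address on a VLAN interface of the bridge itself (vmbr0.136) rather than on a second bridge built over bond0.136. A minimal sketch of that variant, assuming the management IP should stay in VLAN 136:

Code:
auto vmbr0.136
iface vmbr0.136 inet static
     address 10.1.1.4/24
     gateway 10.1.1.1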

On the switch I have VLAN 136. I added the 4 interfaces connected to the HIC to VLAN 136 in trunk mode. At this point I can reach the Proxmox server at 10.1.1.4.

When I create a LAG with LACP on the Grandstream switch, Proxmox becomes unavailable: no ping to 10.1.1.4 and no access.

I tried examples from the PVE manual and forum threads, but everything breaks as soon as I create the LACP LAG on the switch.

Can you please help? Where is my mistake?
Thank you!
 
hi, the proxmox cfg looks ok, i recommend looking for the problem on the switch.
you can also try other bonding modes that do not require switch support (no LACP cfg on the switch).
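
A quick way to confirm on the Proxmox side whether the LACP negotiation actually completed (a generic check with the standard Linux bonding driver, not a thread-specific fix):

Code:
cat /proc/net/bonding/bond0      # should show "IEEE 802.3ad", the partner's MAC and matching aggregator IDs per slave
ip -d link show bond0            # bond and slave state as the kernel sees it
journalctl -k | grep -i bond0    # kernel messages about slaves joining or leaving the aggregator

If the partner MAC stays all zeros or the slaves end up in different aggregators, the LAG on the switch is not negotiating with the server ports.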
 
Subject: PVE 8.4 LXC Network Failure in Tagged VLAN on LACP Bond (ARP Issue) - Host Cannot Communicate via bondX.<VLAN_ID>

Hello Proxmox Community,

I'm facing a networking issue with LXC containers on several nodes after updating them from Proxmox VE 8.1.x to 8.4.x. The problem seems specific to tagged VLAN traffic over an LACP bond.

Environment:

Proxmox VE: 8.4.x (Problem occurred after update from 8.1.x where this setup worked)
Multiple nodes affected.
Network Setup:
2x Physical NICs configured in an LACP Bond (bond1, mode 802.3ad, MTU 9000).
Linux Bridge vmbr1 built over bond1, configured with bridge-vlan-aware yes.
Multiple tagged VLANs are used for VMs and LXCs on vmbr1.
Switch side: Corresponding LAG is configured with LACP, trunk mode, relevant VLANs allowed, MTU 9000.
Problem Description:

LXC containers configured with static IPs and attached to vmbr1 with a specific VLAN Tag (e.g., VLAN 2040) cannot reach their gateway.
Pinging the gateway from within these LXCs fails with Destination Host Unreachable, indicating an ARP resolution failure.
Crucially: KVM VMs configured on the same host, same bridge (vmbr1), and using the same problematic VLAN Tag (e.g., 2040) work correctly and can reach the gateway.
The issue started consistently after updating the nodes to PVE 8.4.
Host Network Configuration (/etc/network/interfaces):
(Showing relevant structure and the specific VLAN sub-interface/bridge used for testing)

Code:
# --- Bonding ---
auto bond1
iface bond1 inet manual
     bond-slaves <phys_nic1> <phys_nic2>
     bond-miimon 100
     bond-mode 802.3ad
     bond-xmit-hash-policy layer2+3
     mtu 9000

# --- VLAN Sub-interface Example (for the problematic VLAN) ---
iface bond1.<VLAN_ID> inet manual # e.g., iface bond1.2040 inet manual
     mtu 9000

# --- VLAN-aware Bridge ---
auto vmbr1
iface vmbr1 inet manual
     bridge-ports bond1
     bridge-stp off
     bridge-fd 0
     bridge-vlan-aware yes
     mtu 9000

# --- Attempted Workaround Bridge (also failed) ---
auto vmbr<VLAN_ID> # e.g., auto vmbr2040
iface vmbr<VLAN_ID> inet manual # e.g., iface vmbr2040 inet manual
     bridge-ports bond1.<VLAN_ID> # e.g., bridge-ports bond1.2040
     bridge-stp off
     bridge-fd 0
     mtu 9000

# (Other VLAN sub-interfaces and host management bridges/IPs omitted for brevity)

Troubleshooting Steps Taken:

Confirmed gateway is up and configured correctly for the relevant VLAN.
Confirmed switch LAG configuration (LACP active, trunk, specific problematic VLAN allowed on the LAG interface, MTU 9000).
Disabled PVE firewall for LXC network devices -> No change.
Restarted LXC containers -> No change.
Checked cat /sys/class/net/vmbr1/bridge/vlan_filtering -> Shows 1.
Workaround Attempt: Created a traditional bridge (vmbr<VLAN_ID>) over the specific VLAN sub-interface (bond1.<VLAN_ID>). Configured LXC to use this bridge with no VLAN tag. -> This also failed with the same ARP failure.
Host Connectivity Test (Focusing on the problematic VLAN, e.g., 2040):
Added temporary IP to bond1.<VLAN_ID> on the PVE host.
ping -I bond1.<VLAN_ID> <Gateway_IP> -> FAILED (100% packet loss).
Added temporary IP to the workaround bridge vmbr<VLAN_ID> on the PVE host.
ping -I vmbr<VLAN_ID> <Gateway_IP> -> FAILED (100% packet loss).
ping -I vmbr<VLAN_ID> <LXC_IP_in_same_VLAN> -> WORKED.
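A further generic check would be to watch whether the tagged ARP frames actually leave and return on the bond; interface names below follow the placeholders used above:

Code:
tcpdump -eni bond1 vlan 2040 and arp          # ARP in the problematic VLAN as seen on the bond
tcpdump -eni <phys_nic1> vlan 2040 and arp    # the same on one physical slave
# Requests leaving with no replies coming back would point upstream of the host.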
Conclusion:

The fundamental issue appears to be that the Proxmox VE host itself cannot communicate externally via its LACP bonded interface (bond1) within specific tagged VLANs after the update to PVE 8.4. This failure occurs even when testing directly via the bond1.<VLAN_ID> sub-interface. Since the host has no outbound L2/L3 connectivity in that VLAN via the bond, LXC containers using that path also fail. The fact that KVM VMs do work suggests a difference in how veth (LXC) vs tap (KVM) interfaces interact with the bond/VLAN stack in PVE 8.4, or that the underlying bond1.<VLAN_ID> failure is intermittent or conditional in a way that affects LXCs more readily.
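
Since the veth-vs-tap difference is central here, comparing per-port VLAN membership on the vlan-aware bridge may also help; port names such as veth100i0 and tap101i0 are only illustrative:

Code:
bridge vlan show                      # VLAN membership of every vmbr1 port: bond1, veth* (LXC), tap* (KVM)
bridge fdb show br vmbr1 | grep 2040  # MAC addresses learned in VLAN 2040 and the port they sit on
cat /proc/net/bonding/bond1           # LACP aggregator and per-slave state after the upgrade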

Question:

Is anyone aware of regressions or known issues in Proxmox VE 8.4.x (or associated kernel / ifupdown2 / networking components) regarding tagged VLANs over LACP bonds, particularly ones that might affect LXC containers differently from KVM VMs or break host-level communication via VLAN sub-interfaces on bonds? Any advice on further diagnosing why the host fails to communicate via bond1.<VLAN_ID> would be greatly appreciated.

Thank you!
 

VLAN issue update in PVE 8.4 – not only containers, but also VMs affected


This is a follow-up to my previous post regarding network issues with VLAN-tagged bridges.


Context:


The issue affects both LXC containers and KVM virtual machines.
Even though IP addresses are assigned, the network is not reachable (e.g. gateway unreachable).


What I did:


  • Installed the exact same PVE 8.4 ISO on a regular home PC.
  • Connected it to the same switch as my other production nodes.
  • Set up identical network configuration:
    • bond1 with VLAN-tagged bridge (vmbr1)
    • VLAN tag (2040)
    • Same Intel X520 NIC
  • Tried both with and without bridge-vlan-aware yes — no change.

✅ Results:


  • On the home PC, both LXC and VM network work fine.
  • On the SuperMicro / HPE servers with the same config, there is no network access from VMs or containers.
  • /etc/network/interfaces and bridge settings are identical.

Conclusion:


It appears that newer versions of Proxmox/kernel introduced a change in behavior related to VLANs or bonding.
This may be related to how certain hardware handles VLAN tags, especially in bonding mode.
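
One way to test the hardware-handling theory would be to compare driver, firmware and VLAN offload settings between the working home PC and the failing servers; <phys_nic> below is a placeholder and the offload toggle is only a temporary experiment:

Code:
ethtool -i <phys_nic>                     # driver (ixgbe for the X520) and firmware version
ethtool -k <phys_nic> | grep -i vlan      # rx/tx VLAN offload and rx-vlan-filter state
ethtool -K <phys_nic> rx-vlan-filter off  # temporary test only; revert if it changes nothing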


Any input or ideas are appreciated. I can provide full configs (interfaces, bridge vlan show, etc.) if needed.