SDN configuration isn't working for one host in datacenter

Jun 3, 2025
Hello, apologies if I miss anything obvious, haven't had to dig this far into things yet and am still finding my way around Proxmox.
I have a datacenter with three hosts. SDN is configured with a really basic setup, and no firewalls are active anywhere. Two hosts work fine and all VLANs are accessible to guests. The third does not. That host is otherwise entirely functional: I can migrate VMs on and off it, access works (out of band from the rest of the NICs as well as in-band), I can update it, etc.
VMs on it are unable to communicate with anything, though, not even other VMs on the same host in the same VLAN. I have Windows and Linux guests and have tried all flavors of NICs. If I assign the VM a NIC on the out-of-band management network (not handled by SDN), it communicates as expected. I've confirmed spanning tree isn't blocking. Only VLANs 116 and 120 are affected; everything else seems to be working. (I haven't tried building additional VLANs, as I don't think that would prove anything we can't already see from the configurations.)
All of that makes me think it's a switching issue, but I have checked the switches and see nothing wrong or different from the other interfaces. I've also stared at this for way too long, so my hope is that I'm missing something stupid and just need another set of eyes on it. I've attached the config from our switches; they are Dell S5248F, running VLT. The attachment shows the config for the ports in question (e1/1/5, po105) as well as for a working host (e1/1/3, po103). Each host has two NICs, one connected to the same port number on each switch (e1/1/5 for the broken host), and port-channels 105 and 103 aggregate them.

Let me know if I missed anything or if there are any questions I can answer. Appreciate the help.

Contents of /etc/pve/sdn on the host in question are below (this is identical on working hosts):
JSON:
# cat pve-ipam-state.json
{
    "zones":{
        "zone1":{
            "subnets":{
                "10.128.20.0/23":{
                    "ips":{
                        "10.128.20.10":{
                            "gateway":1
                        }
                    }
                },
                "10.128.16.0/23":{
                    "ips":{
                        "10.128.16.10":{
                            "gateway":1
                        }
                    }
                }
            }
        }
    }
}

Code:
# cat subnets.cfg
subnet: zone1-10.128.16.0-23
        vnet vnet116
        gateway 10.128.16.10
subnet: zone1-10.128.20.0-23
        vnet vnet120
        gateway 10.128.20.10

Code:
# cat vnets.cfg
vnet: vnet116
        zone zone1
        alias Applications/Guests
        tag 116

vnet: vnet120
        zone zone1
        alias Network Service
        tag 120

vnet: vnet180
        zone zone1
        alias vMotion/vSAN
        tag 180

Code:
# cat zones.cfg
vlan: zone1
        bridge sdnbr0
        ipam pve

Interfaces on the host in question:
Code:
# ip link show up
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno8303: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr0 state UP mode DEFAULT group default qlen 1000
    link/ether c4:cb:e1:a0:bd:24 brd ff:ff:ff:ff:ff:ff
    altname enp99s0f0
6: ens3f0np0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 00:62:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
    altname enp100s0f0np0
7: ens3f1np1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 00:62:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff permaddr 00:62:0b:ca:59:41
    altname enp100s0f1np1
9: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue master sdnbr0 state UP mode DEFAULT group default qlen 1000
    link/ether 00:62:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
10: bond0.116@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:62:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
11: bond0.120@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:62:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
12: bond0.180@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:62:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
13: bond0.182@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:62:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
14: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether c4:cb:e1:a0:bd:24 brd ff:ff:ff:ff:ff:ff
15: sdnbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:62:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
16: sdnbr0.116@sdnbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vnet116 state UP mode DEFAULT group default qlen 1000
    link/ether 00:62:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
17: vnet116: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:62:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
    alias Applications/Guests
18: sdnbr0.120@sdnbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vnet120 state UP mode DEFAULT group default qlen 1000
    link/ether 00:62:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
19: vnet120: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:62:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
    alias Network Service
20: sdnbr0.180@sdnbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vnet180 state UP mode DEFAULT group default qlen 1000
    link/ether 00:62:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
21: vnet180: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:62:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
    alias vMotion/vSAN
27: tap102i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vnet120 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 12:94:30:f7:9e:b6 brd ff:ff:ff:ff:ff:ff
30: tap106i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vnet116 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 6e:64:ea:7b:aa:1b brd ff:ff:ff:ff:ff:ff
31: tap102i1: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vnet116 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 12:aa:c2:fc:28:f8 brd ff:ff:ff:ff:ff:ff
39: tap101i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vnet116 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether f2:6e:ef:cb:ca:3c brd ff:ff:ff:ff:ff:ff

/etc/network/interfaces from the host in question:
Code:
# cat /etc/network/interfaces
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!
auto lo
iface lo inet loopback
iface eno8303 inet manual
iface eno8403 inet manual
iface eno12399np0 inet manual
iface eno12409np1 inet manual
iface idrac inet manual
auto ens3f0np0
iface ens3f0np0 inet manual
auto ens3f1np1
iface ens3f1np1 inet manual
auto bond0
iface bond0 inet manual
        bond-slaves ens3f0np0 ens3f1np1
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2
auto bond0.116
iface bond0.116 inet manual
auto bond0.120
iface bond0.120 inet manual
auto bond0.180
iface bond0.180 inet manual
auto bond0.182
iface bond0.182 inet static
        address 10.128.80.35/27
auto sdnbr0
iface sdnbr0 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
auto vmbr0
iface vmbr0 inet static
        address 10.128.4.34/25
        gateway 10.128.4.10
        bridge-ports eno8303
        bridge-stp off
        bridge-fd 0
source /etc/network/interfaces.d/*
And the same from a working host:
Code:
# cat /etc/network/interfaces
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!
auto lo
iface lo inet loopback
iface eno8303 inet manual
iface eno8403 inet manual
#Gigabit ethernet 02
iface eno12399np0 inet manual
#Lower 10/25G NIC 01
iface eno12409np1 inet manual
#Lower 10/25G NIC 01
iface idrac inet manual
auto ens3f0np0
iface ens3f0np0 inet manual
        mtu 1532
auto ens3f1np1
iface ens3f1np1 inet manual
        mtu 1532
auto bond0
iface bond0 inet manual
        bond-slaves ens3f0np0 ens3f1np1
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2
        mtu 1532
auto bond0.116
iface bond0.116 inet manual
auto bond0.120
iface bond0.120 inet manual
auto bond0.180
iface bond0.180 inet manual
auto bond0.182
iface bond0.182 inet static
        address 10.128.80.33/27
auto vmbr0
iface vmbr0 inet static
        address 10.128.4.32/25
        gateway 10.128.4.10
        bridge-ports eno8303
        bridge-stp off
        bridge-fd 0
        mtu 1500
auto sdnbr0
iface sdnbr0 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
source /etc/network/interfaces.d/*
 


I've been having exactly the same issue for a few months now. It happens every so often.
I'm sure that host switching is not the cause. Some VMs work, some don't. It is not OS-dependent, as it happens to Windows and Linux machines alike. Sometimes it happens after a guest reboot.
On that host, guests just stop receiving packets after some time. Migrating the guest to another host solves the issue for as long as the guest stays on either of the other two hosts.

I did everything I could. Even restored the whole SDN and rebuilt everything. I don't know.
 
For me this is consistent, or at least I haven't seen it not happen since I noticed it. This is a new cluster of brand-new Dell R7625 machines. The only thing I can think of at this point is that something is mis-cabled, but I can't find any evidence of that on either the switches or the hosts. I did notice that the VMs' MAC addresses show up in the switch's MAC address table. No ACLs are in use.
 
Created a new bridge containing the VLAN I was trying to bridge to using SDN. Moved the VM to it, everything works fine. Definitely has something to do with SDN, just not sure how to troubleshoot it.
 
I will delete and rebuild the node to see whether something human-made (by me, of course) caused the issue. In my case the setup is identical on all nodes.

I'm pretty sure that @spirit might know something about this. Man, if you are in the mood and with some free time to shine a light on this, you're welcome.
 
In cases like this, the cause is often a VLAN interface defined directly on the same physical interface (or bond) that the bridge of the VLAN zone uses as its port.

For example, from OP's config:

There is VLAN 116 defined on bond0:

Code:
auto bond0.116
iface bond0.116 inet manual

But there's also a vnet with tag 116 on the zone using sdnbr0:
Code:
# cat zones.cfg
vlan: zone1
        bridge sdnbr0
        ipam pve

# cat vnets.cfg
vnet: vnet116
        zone zone1
        alias Applications/Guests
        tag 116

This causes two VLAN interfaces to be created:
Code:
10: bond0.116@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
[...]
16: sdnbr0.116@sdnbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vnet116 state UP mode DEFAULT group default qlen 1000
    link/ether 00:62:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
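
One way to spot this on a live host is to look for the same VLAN ID showing up as a sub-interface of two different parents in the output of ip link show. Below is a minimal Python sketch of that check, not an official tool. The function name duplicate_vlan_ids is made up for illustration, and it only recognizes names of the form parent.N@parent. It does not verify that one parent is actually a port of the other, so treat a hit as a hint to look closer rather than as proof.

```python
import re
from collections import defaultdict

# Matches the first line of an `ip link show` entry for a VLAN
# sub-interface, e.g. "10: bond0.116@bond0: <...>".
VLAN_LINK = re.compile(
    r"^\d+:\s+(?P<parent>[^.@\s]+)\.(?P<vid>\d+)@(?P=parent):",
    re.MULTILINE,
)


def duplicate_vlan_ids(ip_link_output: str) -> dict:
    """Return {vlan_id: [parents]} for IDs defined on more than one parent."""
    parents = defaultdict(list)
    for match in VLAN_LINK.finditer(ip_link_output):
        parents[int(match.group("vid"))].append(match.group("parent"))
    return {vid: sorted(names) for vid, names in parents.items() if len(names) > 1}


# Shortened lines taken from the output posted above.
sample = """\
10: bond0.116@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
13: bond0.182@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
16: sdnbr0.116@sdnbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vnet116 state UP
"""

print(duplicate_vlan_ids(sample))  # {116: ['bond0', 'sdnbr0']}
```

VLAN 182 is not reported because it exists only on bond0 and has no matching vnet on sdnbr0.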


This can have weird side effects where the bond0.116 interface essentially "blackholes" the traffic for VLAN 116 on the bridge sdnbr0. If you need an IP address in that VLAN, configure it on sdnbr0.116 instead of on bond0.116. If the stanza carries no configuration, as is the case in OP's example (inet manual, no address), it doesn't serve a purpose and can be omitted altogether.
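
The same overlap can be checked from the configuration files rather than from the running state. The following is a rough Python sketch, assuming the file layouts shown earlier in this thread (/etc/network/interfaces plus zones.cfg and vnets.cfg from /etc/pve/sdn). All function names are hypothetical, and the parsing is deliberately simplistic: it ignores sourced files and any options not shown in the thread.

```python
import re


def parse_sections(text):
    """Parse 'type: id' headers followed by indented 'key value' lines."""
    sections, current = {}, None
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        if not line[0].isspace():
            kind, _, name = line.partition(":")
            current = sections.setdefault(name.strip(), {"type": kind.strip()})
        elif current is not None:
            key, _, value = line.strip().partition(" ")
            current[key] = value.strip()
    return sections


def bridge_ports(interfaces_text):
    """Map each bridge in an interfaces file to the set of its ports."""
    ports, current = {}, None
    for line in interfaces_text.splitlines():
        words = line.split()
        if not words:
            continue
        if words[0] == "iface":
            current = words[1]
        elif words[0] == "bridge-ports" and current:
            ports[current] = set(words[1:])
    return ports


def vlan_subinterfaces(interfaces_text):
    """Return {(parent, vlan_id)} for every 'iface parent.N' stanza."""
    pattern = re.compile(r"^iface\s+([^.\s]+)\.(\d+)\s", re.MULTILINE)
    return {(m.group(1), int(m.group(2))) for m in pattern.finditer(interfaces_text)}


def shadowed_vnets(interfaces_text, zones_text, vnets_text):
    """List (vnet, interface) pairs where a bridge port carries the vnet's tag."""
    ports = bridge_ports(interfaces_text)
    subifs = vlan_subinterfaces(interfaces_text)
    zones = parse_sections(zones_text)
    hits = []
    for name, vnet in parse_sections(vnets_text).items():
        zone = zones.get(vnet.get("zone"), {})
        if zone.get("type") != "vlan" or "tag" not in vnet:
            continue
        for port in ports.get(zone.get("bridge"), ()):
            if (port, int(vnet["tag"])) in subifs:
                hits.append((name, f"{port}.{vnet['tag']}"))
    return sorted(hits)


interfaces = """
auto bond0.116
iface bond0.116 inet manual
auto bond0.182
iface bond0.182 inet static
        address 10.128.80.35/27
auto sdnbr0
iface sdnbr0 inet manual
        bridge-ports bond0
        bridge-vlan-aware yes
"""
zones = """
vlan: zone1
        bridge sdnbr0
        ipam pve
"""
vnets = """
vnet: vnet116
        zone zone1
        tag 116
"""

print(shadowed_vnets(interfaces, zones, vnets))  # [('vnet116', 'bond0.116')]
```

On a real node you would pass in the contents of the actual files instead of the inline samples. Keep in mind that the working host's file posted above contains the same bond0.116 stanzas, so a hit points at a risky configuration and does not by itself prove that traffic is being dropped.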