SOLUTION: Updating the firmware to 14.32.xxxx fixed the issue.
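For anyone landing here with the same card, a quick way to see whether a node is still below the firmware line that fixed it for me. This is only a sketch: `fw_at_least` and `nic_fw_version` are made-up helper names, the comparison looks at MAJOR.MINOR only (I am not assuming a specific 14.32 build), and the interface name is the one from my config.

```shell
#!/bin/sh
# Succeeds (exit 0) if version $1 is at least the MAJOR.MINOR given in $2.
# Assumes both arguments contain at least MAJOR.MINOR; any build number
# after the second dot is ignored.
fw_at_least() {
    cur_major=${1%%.*}
    rest=${1#*.}; cur_minor=${rest%%.*}
    min_major=${2%%.*}
    rest=${2#*.}; min_minor=${rest%%.*}
    if [ "$cur_major" -ne "$min_major" ]; then
        [ "$cur_major" -gt "$min_major" ]
        return
    fi
    [ "$cur_minor" -ge "$min_minor" ]
}

# Hypothetical helper: read the firmware version the driver reports.
# `ethtool -i` prints a "firmware-version:" line; the first token after
# the label is the version (a vendor ID may follow in parentheses).
nic_fw_version() {
    ethtool -i "$1" | awk '/^firmware-version:/ {print $2}'
}

# Demo with the version quoted in this post; on a real host one would use
# fw="$(nic_fw_version eno2np1)" instead.
fw="14.14.2320"
if fw_at_least "$fw" "14.32"; then
    echo "$fw is at 14.32 or newer"
else
    echo "$fw is older than 14.32"
fi
```

The demo deliberately uses a literal string so it runs anywhere; swapping in `nic_fw_version` needs ethtool installed and the real interface name.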
Hello,
We have a very strange issue that we have not been able to solve ourselves for days now. One of our Dell PowerEdge R640 servers has a ConnectX-4 Lx dual-port SFP28 card (MCX422A-ACAA ConnectX-4 Lx EN Dual Port SFP28; 25GbE for Dell rack NDC) with firmware 14.14.2320.
We're currently using one port (10G SFP+) with an untagged/tagged VLAN setup (see the config below).
When running this on the 5.13 kernel, everything works as expected. The VMs use the vmbr10/20/21 bridges, which are assigned to the "vlanned bonds" that run on top of bond0. I know this setup is antique, but it also works fine on 5 other nodes (with different NICs, though).
I even tried disabling rx-vlan-filter and tx/rx-vlan-offload with pre-up ethtool (commented out in config below). This did not help either. The host itself on bond0 (untagged) works fine, but all VMs on any bridge other than vmbr0 do not work.
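One thing worth checking with the pre-up lines is whether the settings actually stick, since `ethtool -K` can leave a feature unchanged when the driver marks it `[fixed]`. A rough sketch of how to list the VLAN features that are still on; `vlan_features_on` is a made-up helper name, and the three input lines in the demo are a hypothetical sample, not output from my host.

```shell
#!/bin/sh
# Read `ethtool -k` output on stdin and print the VLAN-related features
# that are still reported as "on" (including "on [fixed]").
vlan_features_on() {
    awk -F': *' '
        /^[ \t]*(rx-vlan-filter|rx-vlan-offload|tx-vlan-offload):/ && $2 ~ /^on/ {
            gsub(/^[ \t]+/, "", $1)   # drop any leading indentation
            print $1
        }'
}

# Hypothetical sample input so the sketch runs without the hardware.
# On a real host: ethtool -k eno2np1 | vlan_features_on
printf '%s\n' \
    'rx-vlan-offload: off' \
    'tx-vlan-offload: on' \
    'rx-vlan-filter: on [fixed]' | vlan_features_on
```

If a feature still shows up here after the pre-up lines ran, the ethtool call did not take effect for it.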
With tcpdump I can see the DHCP request going out (20:47:39.497927 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:50:56:01:15:07 (oui Unknown), length 300). I can also see our firewall receiving it and sending a DHCP offer back, but that offer never shows up in tcpdump on the host, so it seems it either never reaches the NIC or is filtered out. Again, with the older 5.13 kernel this setup works fine.
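To narrow down where the offer disappears, the same capture can be run on each layer of the stack (physical port, bond, VLAN interface, bridge). A small sketch that just prints the tcpdump commands; `dhcp_capture_cmd` is a made-up helper, VLAN 20 is picked as an example from the config below, and the filter is a common pcap idiom for matching DHCP both untagged and inside an 802.1Q tag.

```shell
#!/bin/sh
# pcap filter: DHCP untagged, or DHCP inside an 802.1Q VLAN tag.
# The untagged part must come first because "vlan" shifts the decode
# offsets for everything that follows it in the expression.
DHCP_FILTER='(udp port 67 or udp port 68) or (vlan and (udp port 67 or udp port 68))'

# Print the tcpdump command for one interface.
# -e shows link-level headers (including the VLAN tag), -n skips name lookups.
dhcp_capture_cmd() {
    printf "tcpdump -e -n -i %s '%s'\n" "$1" "$DHCP_FILTER"
}

# One command per layer, bottom to top.
for ifc in eno2np1 bond0 bond0.20 vmbr20; do
    dhcp_capture_cmd "$ifc"
done
```

Running these in parallel while a VM requests a lease shows the first layer where the offer is missing. In my case that would already have been the physical port.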
Does anyone have a clue? I also tried a firmware update using mlxup (it can't find an update), and none of the Dell DUP packages (for Win/RHEL) seem to work on the iDRAC update page. The Lifecycle Controller does not show me any update either. I'm not sure whether it's a firmware issue, but any help here is appreciated!
/etc/network/interfaces

auto lo
iface lo inet loopback

auto eno2np1
iface eno2np1 inet manual # this is the ConnectX4-LX using mlx5 driver
    pre-up ethtool -K $IFACE rx-vlan-filter off
    pre-up ethtool -K $IFACE rx-vlan-offload off
    pre-up ethtool -K $IFACE tx-vlan-offload off
    bond-master bond0

iface enp101s0 inet manual #This is also a Mellanox but ConnectX3 and this works with both kernels with mlx4 driver
    #bond-master bond0

auto bond0
iface bond0 inet manual
    bond-slaves eno2np1
    bond-miimon 100
    bond-mode 802.3ad
    bond-xmit-hash-policy layer3+4
    bond-downdelay 200
    bond-updelay 200

auto bond0.11
iface bond0.11 inet manual
    vlan_raw_device bond0
    post-up bond0

auto bond0.20
iface bond0.20 inet manual
    vlan_raw_device bond0
    post-up bond0

auto bond0.21
iface bond0.21 inet manual
    vlan_raw_device bond0
    post-up bond0

auto bond0.22
iface bond0.22 inet manual
    vlan_raw_device bond0
    post-up bond0

auto bond0.30
iface bond0.30 inet manual
    vlan_raw_device bond0
    post-up bond0

auto bond0.40
iface bond0.40 inet manual
    vlan_raw_device bond0
    post-up bond0

auto bond0.50
iface bond0.50 inet manual
    vlan_raw_device bond0
    post-up bond0

auto bond0.60
iface bond0.60 inet manual
    vlan_raw_device bond0
    post-up bond0

auto bond0.70
iface bond0.70 inet manual
    vlan_raw_device bond0
    post-up bond0

auto vmbr0
iface vmbr0 inet static
    address 10.10.10.36/24
    gateway 10.10.10.1
    bridge-ports bond0
    bridge-stp off
    bridge-fd 0

auto vmbr11
iface vmbr11 inet manual
    bridge-ports bond0.11
    bridge-stp off
    bridge-fd 0

auto vmbr20
iface vmbr20 inet manual
    bridge-ports bond0.20
    bridge-stp off
    bridge-fd 0

auto vmbr21
iface vmbr21 inet manual
    bridge-ports bond0.21
    bridge-stp off
    bridge-fd 0

auto vmbr22
iface vmbr22 inet manual
    bridge-ports bond0.22
    bridge-stp off
    bridge-fd 0

auto vmbr30
iface vmbr30 inet manual
    bridge-ports bond0.30
    bridge-stp off
    bridge-fd 0

auto vmbr40
iface vmbr40 inet manual
    bridge-ports bond0.40
    bridge-stp off
    bridge-fd 0

auto vmbr50
iface vmbr50 inet manual
    bridge-ports bond0.50
    bridge-stp off
    bridge-fd 0

auto vmbr60
iface vmbr60 inet manual
    bridge-ports bond0.60
    bridge-stp off
    bridge-fd 0

auto vmbr70
iface vmbr70 inet manual
    bridge-ports bond0.70
    bridge-stp off
    bridge-fd 0