Solved: PVE has no network after upgrade from 6.4 to version 7.2.1 (network card: Mellanox MT27710)

lemlepmx

New Member
May 13, 2021
Hello,

We upgraded 5 nodes (one cluster) this weekend; only one machine has this type of NIC.
The NIC firmware has been upgraded to the latest version.

Device #1:
----------

Device Type: ConnectX4LX
Part Number: MCX4121A-ACA_Ax
Description: ConnectX-4 Lx EN network interface card; 25GbE dual-port SFP28; PCIe3.0 x8; ROHS R6
PSID: MT_2420110034
PCI Device Name: 0000:01:00.0
Base MAC: 043f72d4563c
Versions:         Current        Available
   FW             14.32.1010     14.32.1010
   PXE            3.6.0502       3.6.0502
   UEFI           14.25.0017     14.25.0017

Status: Up to date

We can see the ingress traffic with tcpdump (vmbr10 is the mgmt/corosync interface, 172.16.2.240 is the host address):


tcpdump -nnpi vmbr10
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on vmbr10, link-type EN10MB (Ethernet), snapshot length 262144 bytes
01:35:29.869330 ARP, Request who-has 172.16.2.244 tell 172.16.2.240, length 28
01:35:29.869339 ARP, Request who-has 172.16.2.241 tell 172.16.2.240, length 28

The card is connected with DAC cables.

If I assign the IP address directly to the NIC instead of the bridge interface (ip addr delete ip dev vmbr10 / ip link set vmbr10 down / brctl delbr vmbr10 -> ip addr add to nic), IP communication recovers, but the Proxmox processes cannot run correctly in that case (that's expected, since the bridge interface is missing).
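For reference, the workaround above can be written out as a short script. This is only a sketch: the address 172.16.2.240/24 and the port name enp1s0f1np1 are the values that appear elsewhere in this thread, `ip link delete ... type bridge` is used in place of `brctl delbr`, and the `run` helper and `DRY_RUN` variable are hypothetical additions so that the commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: move the management IP from the bridge to the physical port.
# run() and DRY_RUN are hypothetical helpers, not part of Proxmox.
BRIDGE=vmbr10
PORT=enp1s0f1np1
ADDR=172.16.2.240/24

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

run ip addr del "$ADDR" dev "$BRIDGE"
run ip link set "$BRIDGE" down
# Same effect as: brctl delbr vmbr10
run ip link delete "$BRIDGE" type bridge
run ip addr add "$ADDR" dev "$PORT"
run ip link set "$PORT" up
```

Run it from a local or KVM console rather than over SSH, since the host drops off the network between the steps.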
ifupdown2 is already installed and ifupdown has been removed.
We tried booting an older kernel version such as 5.4, but the situation is the same.
Any suggestions?
 
Yes, the interface name was changed. ip addr shows exactly the name, IP address, etc. needed for working IP connectivity.
 
That's strange. I have a lot of ConnectX-4 Lx cards in production and I don't have any problems.

Do you use a VLAN-aware bridge?
Yes, the VLAN function is necessary. I tried limiting the VLAN ID range to 2-512, but that did not solve it.
I also tried without vlan-aware, to run this node as a generic cluster member, but that did not help either.
auto vmbr10
iface vmbr10 inet static
        address 172.16.2.240/24
        bridge-ports enp1s0f1np1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-512

From the configuration side, the bridge comes up as expected:

Jun 20 02:31:57 rs500 kernel: [ 10.211483] vmbr10: port 1(enp1s0f1np1) entered blocking state
Jun 20 02:31:57 rs500 kernel: [ 10.211486] vmbr10: port 1(enp1s0f1np1) entered disabled state
Jun 20 02:31:59 rs500 kernel: [ 12.119903] vmbr10: port 1(enp1s0f1np1) entered blocking state
Jun 20 02:31:59 rs500 kernel: [ 12.119908] vmbr10: port 1(enp1s0f1np1) entered forwarding state

My general question (and sorry, this is a gap in my knowledge): what could block communication between the bridge interface and the other parts of the IP stack inside the same kernel?
Is there any component that I forgot to check?
iptables has a default ACCEPT policy and no other rules.
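A few places are commonly worth checking when a bridge port reaches the forwarding state but the host's own IP on the bridge stays unreachable. The following is a hedged, read-only checklist rather than a diagnosis: the function name `check_bridge_path` is made up for this sketch, the interface names are the ones from the config above, and the `net.bridge.*` sysctls only exist when the `br_netfilter` module is loaded.

```shell
#!/bin/sh
# Sketch: read-only checks for a bridge whose host IP is unreachable.
# check_bridge_path is a hypothetical helper name.
check_bridge_path() {
    bridge=$1
    port=$2

    # Bridge details, including the vlan_filtering flag.
    ip -d link show "$bridge"

    # VLAN membership of the bridge itself and of its port. On a
    # VLAN-aware bridge, untagged host traffic relies on the PVID
    # entries of both.
    bridge vlan show dev "$bridge"
    bridge vlan show dev "$port"

    # MAC addresses learned on the bridge.
    bridge fdb show br "$bridge"

    # With br_netfilter loaded, bridged frames also traverse iptables.
    sysctl net.bridge.bridge-nf-call-iptables 2>/dev/null

    # Layer-2 and nftables rules are separate from the iptables policy.
    ebtables -L 2>/dev/null
    nft list ruleset 2>/dev/null

    # Reminder: compare the physical port's capture with the bridge's.
    echo "next: tcpdump -nnpi $port arp"
}
```

Usage on this host would be `check_bridge_path vmbr10 enp1s0f1np1`, run as root.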
 
Could you please share some details of a working setup with this card, such as kernel version, ethtool settings, or anything else that may help identify the root cause? BRu
 
Postscript, if anybody is interested.
Now it's solved. Long story. After two days of software troubleshooting via KVM, I couldn't find a solution. Then the network card was replaced with a dual-port 10G card from another vendor, and the situation was the same. So I focused on the network configuration again and reinstalled some of the networking-related packages such as bridge-utils, ifenslave, etc. Next I removed the VLAN config from vmbr10, restarted networking, and voilà, it worked again. After that, all the networking configuration and features I had removed were switched back on, and it is still working.
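The sequence that ended up working can be sketched roughly as follows. Several details are assumptions: `bridge-utils` and `ifenslave` are presumably the Debian packages meant above, `ifreload -a` is the ifupdown2 way to apply /etc/network/interfaces (the post only says "restart networking"), 172.16.2.241 is one of the peer addresses from the tcpdump output, and `run`/`DRY_RUN` are hypothetical helpers that only print the commands unless `DRY_RUN=0` is set. Editing the VLAN lines is left as a manual step.

```shell
#!/bin/sh
# Sketch of the recovery steps; run() and DRY_RUN are hypothetical helpers.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# 1. Reinstall the networking helper packages (assumed package names).
run apt-get install --reinstall bridge-utils ifenslave

# 2. Manually remove bridge-vlan-aware and bridge-vids from the vmbr10
#    stanza in /etc/network/interfaces, then reload the configuration.
run ifreload -a

# 3. Check connectivity to another cluster node before going further.
run ping -c 3 172.16.2.241

# 4. Restore the VLAN lines in the stanza, reload, and inspect the result.
run ifreload -a
run bridge vlan show dev vmbr10
```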
 
