Proxmox cluster with Mellanox ConnectX-4 Lx network cards:
Everything worked fine under kernel 5.11. After upgrading to the newest kernel, 5.13, there are massive networking problems: it is impossible to dump SFP info with ethtool, there are bit errors, the locally installed Ceph instance has massive problems, and so on. After reverting to the 5.11 kernel, everything seems to be fine again.
# ethtool -i enp1s0f1np1
driver: mlx5_core
version: 5.13.19-4-pve
firmware-version: 14.29.1016 (MT_2420110004)
expansion-rom-version:
bus-info: 0000:01:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
# pveversion -v
proxmox-ve: 7.1-1 (running kernel: 5.13.19-4-pve)
pve-manager: 7.1-10 (running version: 7.1-10/6ddebafe)
pve-kernel-helper: 7.1-12
pve-kernel-5.13: 7.1-7
pve-kernel-5.11: 7.0-10
pve-kernel-5.13.19-4-pve: 5.13.19-9
pve-kernel-5.13.19-3-pve: 5.13.19-7
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.11.22-1-pve: 5.11.22-2
ceph: 16.2.7
ceph-fuse: 16.2.7
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.1
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-6
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-3
libpve-guest-common-perl: 4.1-1
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.1-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-2
proxmox-backup-client: 2.1.5-1
proxmox-backup-file-restore: 2.1.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-6
pve-cluster: 7.1-3
pve-container: 4.1-4
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-5
pve-ha-manager: 3.3-3
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.1-2
pve-xtermjs: 4.16.0-1
qemu-server: 7.1-4
smartmontools: 7.2-1
spiceterm: 3.2-2
swtpm: 0.7.0~rc1+2
vncterm: 1.7-1
zfsutils-linux: 2.1.2-pve1
A quick look at the driver module for this card in the different kernel versions:
lib/modules/5.11.22-7-pve/kernel/drivers/net/ethernet/mellanox/mlx5/core# ls -la
-rw-r--r-- 1 root root 2309448 Nov 7 21:46 mlx5_core.ko
lib/modules/5.13.19-4-pve/kernel/drivers/net/ethernet/mellanox/mlx5/core# ls -la
-rw-r--r-- 1 root root 2060120 Feb 7 11:01 mlx5_core.ko
tmp/kernel-ausgepackt/5.15.19-2/lib/modules/5.15.19-2-pve/kernel/drivers/net/ethernet/mellanox/mlx5/core# ls -la
-rw-r--r-- 1 root root 2522672 Feb 8 11:19 mlx5_core.ko
Native Ubuntu kernel:
tmp/kernel-ausgepackt/ubuntu-5.13.0-30/lib/modules/5.13.0-30-generic/kernel/drivers/net/ethernet/mellanox/mlx5/core# ls -la
-rw-r--r-- 1 root root 2652337 Feb 4 17:40 mlx5_core.ko
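File size alone depends on build options (debug info, enabled features), so it may not say much about whether the driver code itself differs. A rough sketch of a more telling comparison, assuming `modinfo` is installed: print size, `srcversion` and `vermagic` for each module file. The helper name `mod_summary` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print size, srcversion, vermagic and path for each .ko file.
# The modinfo fields fall back to "n/a" when modinfo is missing or cannot read the file.
mod_summary() {
    for f in "$@"; do
        # File size in bytes (tr strips padding some wc implementations add)
        size=$(wc -c < "$f" | tr -d ' ')
        src=$(modinfo -F srcversion "$f" 2>/dev/null) || src="n/a"
        ver=$(modinfo -F vermagic "$f" 2>/dev/null) || ver="n/a"
        # Tab-separated: size, srcversion, vermagic, path
        printf '%s\t%s\t%s\t%s\n' "$size" "${src:-n/a}" "${ver:-n/a}" "$f"
    done
}
```

It could be called with the module paths listed above, e.g. `mod_summary lib/modules/5.11.22-7-pve/kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko lib/modules/5.13.19-4-pve/kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko`.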
To me that looks a little strange. Does anyone else have similar problems with the newest kernel and Mellanox cards? Any explanations?