kernel 5.13.19-4-pve breaks e1000e networking

Hello all.

I'm facing a strange problem.
I have a cluster of 9 Proxmox 7.1-12 servers on identical HW (old Asus servers). Yesterday I rebooted the last 3 nodes I added recently, and only one came back online.
The other two rebooted but failed to bring up networking.

If I connect "locally" (IPMI KVM, actually) and issue:
# ip link set enp2s0 up
# ip link set enp3s0 up
# systemctl restart networking
the network comes up (so it seems the config files are OK). But restarting the networking service is not enough: the two "ip link set" lines are needed.
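
For reference, this is roughly what the relevant /etc/network/interfaces stanza looks like on these nodes (interface names aside, the bridge name and addresses below are just an example layout, not copied from the affected hosts):
Code:
auto lo
iface lo inet loopback

iface enp2s0 inet manual

iface enp3s0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.0.2.10/24
        gateway 192.0.2.1
        bridge-ports enp2s0
        bridge-stp off
        bridge-fd 0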

I tried booting various kernels and the last working one seems to be 5.13.19-2-pve. I didn't notice anything suspicious in the release notes... Is it a regression?

Any hint appreciated.

Tks.
 
Are you using the latest firmware for those NICs?
It could be that changes in the driver require a newer firmware version.
 
I'm using the Proxmox-supplied packages, keeping the system up-to-date.
IIUC Intel e1000e NICs never required firmware packages: they're quite old and stable cards, and I can find neither firmware nor an updater on Intel's site, only older drivers, with a note that the newest drivers are already included in the mainline kernel. So I wouldn't even know how and what to flash...
 
Even with the latest 5.15 kernel it does not work :(
I've attached both the working (-ok) and non-working (-bad) versions of the requested files.
Hope it's easily fixable :)
Tks.
 

Attachments

  • dmesg-bad.txt
  • dmesg-ok.txt
  • journal-bad.txt
  • journal-ok.txt
The issue is not the NIC, but rather the disk /dev/sdc:
Code:
Apr 19 10:24:44 virt9 systemd-udevd[673]: sdc: Spawned process '/usr/bin/systemd-run /sbin/lvm pvscan --cache 8:32' [793] is taking longer than 58s to complete
Apr 19 10:24:44 virt9 systemd-udevd[629]: sdc: Worker [673] processing SEQNUM=3684 is taking a long time
Apr 19 10:25:41 virt9 systemd[1]: ifupdown2-pre.service: Main process exited, code=exited, status=1/FAILURE
Apr 19 10:25:41 virt9 systemd[1]: ifupdown2-pre.service: Failed with result 'exit-code'.
Apr 19 10:25:41 virt9 systemd[1]: Failed to start Helper to synchronize boot up for ifupdown.
Apr 19 10:25:41 virt9 systemd[1]: Dependency failed for Network initialization.

This causes ifupdown2-pre.service to fail, and the networking service fails with it because of the dependency.
Check the disk for any issues.
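
To see exactly what the pre-service waits for and which dependency pulled networking down, something like this should show it (plain systemd/journalctl commands, nothing Proxmox-specific):
Code:
systemctl cat ifupdown2-pre.service
systemctl list-dependencies --reverse ifupdown2-pre.service
journalctl -b -u ifupdown2-pre.service -u networking.service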
 
sdc is part of a multipath device (an old CX3-80 connected via FC, two 4Gbps fibers):
Code:
root@virt9:~# multipath -ll
mp_CX3_dr (360060160c0251c001aa0979f088bec11) dm-7 DGC,RAID 5
size=39T features='1 queue_if_no_path' hwhandler='1 emc' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| `- 4:0:0:0 sdb 8:16 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  `- 5:0:0:0 sdc 8:32 active ready running
mp_MD3800i (3600a0980005854ba000003ae54f0a0d7) dm-8 DELL,MD38xxi
size=33T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw
|-+- policy='service-time 0' prio=14 status=active
| |- 6:0:0:0 sdg 8:96 active ready running
| `- 9:0:0:0 sde 8:64 active ready running
`-+- policy='service-time 0' prio=9 status=enabled
  |- 7:0:0:0 sdf 8:80 active ready running
  `- 8:0:0:0 sdd 8:48 active ready running
I'm quite sure it's unrelated, but I'm still trying to understand why LVM uses sd[bc] instead of /dev/mapper/mp_CX3_dr even though I explicitly blacklisted /dev/[hsz]d.* in /etc/lvm/lvm.conf:
Code:
 global_filter = [ "a|/dev/disk/by-path/pci-0000\:00\:1f.2-ata-.*-part[0-9]*|", "r|/dev/[shz]d.*|", "r|/dev/mapper/pve-.*|", "r|/dev/mapper/.*-vm--[0-9]+--disk--[0-9]+|" ]
Obviously if the network doesn't start, mp_MD3800i can't start since it's iSCSI.
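
In case it's useful, this is how I double-check which filter LVM actually applies and whether the raw paths still get opened (lvmconfig ships with the standard lvm2 package; the grep pattern is just my device names):
Code:
# print the global_filter LVM actually sees after merging the config files
lvmconfig devices/global_filter
# verbose scan to check whether the raw sd* paths still get opened or are rejected
pvs -vvvv 2>&1 | grep -E '/dev/(sd[bc]|mapper/mp_CX3_dr)'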

Forgot to add that even on the working kernel I get:
Code:
root@virt9:~# pvs
WARNING: Device mismatch detected for CX3_DR/vm-112-disk-0 which is accessing /dev/sdb instead of /dev/mapper/mp_CX3_dr.
WARNING: Device mismatch detected for CX3_DR/vm-150-disk-0 which is accessing /dev/sdb instead of /dev/mapper/mp_CX3_dr.
PV                     VG          Fmt  Attr PSize    PFree
/dev/mapper/mp_CX3_dr  CX3_DR      lvm2 a--   <39.41t 39.25t
/dev/mapper/mp_MD3800i Databox2_r6 lvm2 a--    32.74t 16.19t
/dev/sda3              pve         lvm2 a--  <297.59g 15.99g
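And to see which underlying device the LVs in that VG actually sit on (plain lvs, nothing custom):
Code:
lvs -o +devices CX3_DR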
 
The network service depends on udev to make sure the devices are actually initialized before trying to start the network.
But if a storage device keeps it from finishing, then services that depend on it will fail.

You could check whether your QLogic HBA has newer firmware available; maybe it doesn't work as well with the changes in newer kernel versions.
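
If it helps, the firmware version the loaded driver reports can be read without rebooting; this assumes the HBA uses the in-kernel qla2xxx driver:
Code:
dmesg | grep -i qla2xxx
cat /sys/class/fc_host/host*/symbolic_name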
 
Looked around for a bit, but it seems I can't find firmware for those cards :( Maybe they're too old... Tomorrow I'll look again and will possibly try replacing 'em with newer ones (sigh... having to reconfigure mappings from the Clariion interface... blech! :( ).
 
Still no updated fw, but the problem is not that the device does not work: it's that udevd launches pvscan on the raw device of the passive path, which doesn't respond (it's passive!).
That's also the reason I blacklisted all disk devices ("r|/dev/.d.*|") in the global_filter line of /etc/lvm/lvm.conf. Why does systemd run "/sbin/lvm pvscan --cache 8:32", forcing it to access a device known not to answer (8:32 is /dev/sdc)? If I run the same command manually, I immediately get "/dev/sdc excluded by filters: device is rejected by filter config", as expected.
If only I could increase the timeout: after about 40 seconds the multipath device is initialized and pvscan can detect the volume using the /dev/mapper/ device... But the real fix would be LVM honoring the filters during the udev-triggered scan...
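Maybe I could buy some time with a drop-in like this (untested, and it assumes ifupdown2-pre.service simply runs 'udevadm settle'; check with 'systemctl cat ifupdown2-pre.service' before copying):
Code:
systemctl edit ifupdown2-pre.service
# then in the drop-in:
#   [Service]
#   ExecStart=
#   ExecStart=/bin/udevadm settle --timeout=300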
 
Urgh. Missed that. 1) and 2) should probably be swapped...
Given that changing the FW would not remove the cause of the problem (LVM scanning an unresponsive device), what else can I do? I'm definitely out of ideas :(
 
For now I'd suggest running the older kernel version, even though some (security) bug fixes might be missing.
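
One way to keep booting the known-good kernel by default on a GRUB-based install (take the exact menu entry string from your own grub.cfg; the one below is only an example):
Code:
grep "menuentry '" /boot/grub/grub.cfg | cut -d"'" -f2
# then set it in /etc/default/grub, e.g.:
#   GRUB_DEFAULT="Advanced options for Proxmox VE GNU/Linux>Proxmox VE GNU/Linux, with Linux 5.13.19-2-pve"
update-grub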
Could you set it up so that the disk is accessible via both links, rather than having the second one just as a backup?
 
Nope. It's quite an old storage system (nearly 20yo!) and the two controllers work in active/passive mode. The alternative would be to sacrifice the multipath...

Tks anyway. Hope I'll be able to replace the power-hungry CX3 with a more energy-efficient system soon.
 
I'll see if I can find anything in the kernel commit log which would explain this sudden change in behavior.
 
