Hi,
I’m running Proxmox VE 8.4.1 and I’ve been experiencing repeated network loss with kernel 6.8.12-11-pve. After running a large Docker backup job from a VM, the network on the host suddenly becomes unreachable (ping: “host is down”) and I have to physically reboot the server.
This only happens with kernel 6.8.12-11-pve. When I switch back to kernel 6.5.13-6-pve, the problem disappears entirely.
I’m using an Intel NIC with the e1000e driver. Here is the output of lspci -nn | grep -i eth:
Bash:
00:19.0 Ethernet controller [0200]: Intel Corporation 82579LM Gigabit Network Connection [8086:1502] (rev 04)
During the failure, dmesg shows repeated messages like:
Bash:
e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
TDH, TDT, next_to_use, next_to_clean...
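To give a rough idea of how often the hang is reported, the messages can be counted from the kernel log. This is only a minimal sketch: the function name `count_hangs` is made up for illustration, and it assumes the message text matches the line quoted above.

```shell
#!/bin/sh
# Count e1000e "Hardware Unit Hang" reports in kernel log text read from stdin.
# count_hangs is a hypothetical helper name, not part of any tool.
count_hangs() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are no matches, so "|| true" keeps the exit status clean.
    grep -c 'Detected Hardware Unit Hang' || true
}

# Usage on the host (reads the kernel ring buffer):
#   dmesg | count_hangs
```

Since the host has to be rebooted after a failure, the ring buffer from the failed boot is gone by then. Reading the previous boot with `journalctl -k -b -1 | count_hangs` should work if persistent journaling is enabled.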
I also disabled GRO, GSO, and TSO offloading, with no effect:
Bash:
/sbin/ethtool -K eno1 gro off gso off tso off
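To double-check that the offloads are really off, the current feature state can be read back with `ethtool -k` (lowercase). A small sketch, assuming the usual feature names printed by ethtool; `show_offloads` is just an illustrative helper name:

```shell
#!/bin/sh
# Filter `ethtool -k` output (read from stdin) down to the three offloads
# toggled above. show_offloads is a hypothetical helper name.
show_offloads() {
    grep -E '^[[:space:]]*(tcp-segmentation-offload|generic-segmentation-offload|generic-receive-offload):'
}

# Usage on the host:
#   /sbin/ethtool -k eno1 | show_offloads
```

Settings changed with `ethtool -K` do not survive a reboot, so they have to be reapplied after each boot before testing again.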
This issue started appearing shortly after recent kernel upgrades.
Here’s the output of pveversion -v (taken after booting back into 6.5.13-6-pve):
Bash:
proxmox-ve: 8.4.0 (running kernel: 6.5.13-6-pve)
pve-manager: 8.4.1/2a5fa54a8503f96d
...
proxmox-kernel-6.8.12-11-pve-signed: 6.8.12-11
proxmox-kernel-6.8.12-1-pve-signed: 6.8.12-1
proxmox-kernel-6.5.13-6-pve-signed: 6.5.13-6
...
Could this be a regression in the e1000e driver in the 6.8 kernel series?
Is there a recommended workaround or patch? I can stick to 6.5.13-6-pve for now, but I’d like to know if this is being tracked.
Thanks in advance!