Network failure with kernel 6.8.12-11-pve and Intel NIC (e1000e): host unreachable after backup job

heliospeed

New Member
Jul 10, 2025
Hi,

I’m running Proxmox VE 8.4.1 and have been experiencing repeated network loss with kernel 6.8.12-11-pve. After a large Docker backup job runs from a VM, the host’s network suddenly becomes unreachable (ping reports “host is down”) and I have to power-cycle the server.

This only happens with kernel 6.8.12-11-pve. When I switch back to kernel 6.5.13-6-pve, the problem disappears entirely.

I’m using an Intel NIC with the e1000e driver. Here is the output of lspci -nn | grep -i eth:
Bash:
00:19.0 Ethernet controller [0200]: Intel Corporation 82579LM Gigabit Network Connection [8086:1502] (rev 04)
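For anyone comparing setups, the PCI address from lspci can be mapped to the kernel’s interface name through sysfs; a quick check, assuming a standard sysfs layout:
Bash:
ls /sys/bus/pci/devices/0000:00:19.0/net
# prints the interface name bound to this NIC; in my case it is eno1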

During the failure, dmesg shows repeated messages like:
Bash:
e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
  TDH, TDT, next_to_use, next_to_clean...

I also disabled GRO/GSO/TSO offloading, with no effect:
Bash:
/sbin/ethtool -K eno1 gro off gso off tso off
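To persist these settings across reboots, one option is to re-apply them from ifupdown whenever the interface comes up. A minimal sketch, assuming eno1 is the affected port:
Code:
# /etc/network/interfaces (excerpt)
iface eno1 inet manual
        # re-apply offload settings each time the interface is brought up
        post-up /sbin/ethtool -K eno1 gro off gso off tso off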

This issue started appearing shortly after recent kernel upgrades.

Here’s the output of pveversion -v:
Bash:
proxmox-ve: 8.4.0 (running kernel: 6.5.13-6-pve)
pve-manager: 8.4.1/2a5fa54a8503f96d
...
proxmox-kernel-6.8.12-11-pve-signed: 6.8.12-11
proxmox-kernel-6.8.12-1-pve-signed: 6.8.12-1
proxmox-kernel-6.5.13-6-pve-signed: 6.5.13-6
...

Could this be a regression in the e1000e driver in the 6.8 kernel series?

Is there a recommended workaround or patch? I can stick to 6.5.13-6-pve for now, but I’d like to know if this is being tracked.

Thanks in advance!
 
Hello heliospeed,

This also started happening on my home lab, without a backup run involved.

Code:
Jul 09 21:08:07 pve kernel: e1000e 0000:00:19.0 enp0s25: Detected Hardware Unit Hang:
                                         TDH                  <5e>
                                         TDT                  <e4>
                                         next_to_use          <e4>
                                         next_to_clean        <5d>
                                       buffer_info[next_to_clean]:
                                         time_stamp           <1002f2c10>
                                         next_to_watch        <5e>
                                         jiffies              <1002f4480>
                                         next_to_watch.status <0>
                                       MAC Status             <40080083>
                                       PHY Status             <796d>
                                       PHY 1000BASE-T Status  <3800>
                                       PHY Extended Status    <3000>
                                       PCI Status             <10>

After this event, the network went offline and all Samba-hosted VM drives were disconnected. I was unable to reach the PVE web UI, SSH in, or otherwise reach the server at all until a reboot.

I have a spare network card in the machine, so I have now unplugged and disabled the onboard card in question to avoid any further crashes. I will keep you posted on whether this resolves it.
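For anyone switching ports the same way, it mostly comes down to pointing vmbr0 at the replacement NIC in /etc/network/interfaces. A minimal sketch; enp1s0 and the addresses are placeholders, check your actual port name with ip link:
Code:
auto vmbr0
iface vmbr0 inet static
        address 192.168.1.10/24      # placeholder address
        gateway 192.168.1.1          # placeholder gateway
        bridge-ports enp1s0          # was enp0s25, the hanging e1000e port
        bridge-stp off
        bridge-fd 0
Then ifreload -a (with ifupdown2) or a reboot applies the change.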
 
Hi,

Same issue here; I’ve run into exactly the same behavior recently.
After this happened, the server became completely unreachable (no SSH, no Proxmox web UI). I had to plug in a keyboard and monitor directly to the machine to force a reboot.

For now, I’ve switched back to an older kernel (6.5.13-6-pve), and everything seems stable again — no more network hangs or crashes so far.

I’ll continue monitoring, but it definitely looks like the issue is related to the newer kernel.

I also restarted my backup jobs over NFS, which are launched from a VM running on this same Proxmox host, and there was no crash this time.

Thanks for sharing your case — very helpful.
 
I do have a PBS instance installed with a mount to a second NAS I use for personal files and backups. I think I may wait 24-48 hours before I try to reinstate backups.
 
I have the same issue, without any backups running, on a Lenovo ThinkCentre M702Q with Proxmox VE 8.4.

The network card stops working after a few hours (anywhere between 4 and 24) and Proxmox becomes unusable and unreachable :(

Code:
kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
                                         TDH                  <d5>
                                         TDT                  <e0>
                                         next_to_use          <e0>
                                         next_to_clean        <d4>
                                       buffer_info[next_to_clean]:
                                         time_stamp           <1013443e2>
                                         next_to_watch        <d5>
                                         jiffies              <10134e500>
                                         next_to_watch.status <0>
                                       MAC Status             <40080083>
                                       PHY Status             <796d>
                                       PHY 1000BASE-T Status  <3800>
                                       PHY Extended Status    <3000>
                                       PCI Status             <10>

Code:
modinfo e1000e | grep version
srcversion:     F65CFC0DF4BFD42B230512D
vermagic:       6.8.12-9-pve SMP preempt mod_unload modversions
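To double-check which driver and firmware the running kernel has actually bound to the port (eno1 in the log above), something like:
Bash:
uname -r          # kernel currently running
ethtool -i eno1   # driver, version and firmware-version of the NIC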

Any idea how to solve this? It is pretty critical to have the network interface running.
 
Two options: boot into the previous kernel, or use another network card in the device. I disabled my Intel e1000e network card and used the 4-port card in my PCI slot instead.
 
Hi everyone,

I’d like to share a recent issue I encountered, in case it helps others.

Yesterday, I experienced a complete system freeze while copying files. I had previously attempted to boot into an older kernel (6.5.13-6-pve) using the following command:

Bash:
sudo grub-reboot "1>4" && sudo systemctl reboot
The "1>4" referred to the GRUB entry: Advanced options > kernel 6.5.13-6-pve. I believed this would force a one-time boot into the selected kernel, but after reboot, I realized the system was still running the latest kernel (6.8.12-11-pve). Most likely, a recent Proxmox upgrade had pulled in the new kernel, and I didn’t verify which one was actually running.

To ensure I was booting into the correct version and prevent further issues, I took the following steps:
  1. Checked the available kernels:
Bash:
proxmox-boot-tool kernel list
  2. Pinned the desired kernel (to make it persistent):
Bash:
proxmox-boot-tool kernel pin 6.5.13-6-pve
⚠️ This command tries to run update-grub, which failed on my system due to a missing binary. I resolved it with:
Bash:
ln -s /usr/sbin/grub-mkconfig /sbin/update-grub

  3. Another minor issue:
The script /etc/grub.d/000_proxmox_boot_header was failing because it couldn’t find proxmox-boot-tool. I just used the full path (/usr/sbin/proxmox-boot-tool) to avoid that error.
  4. Rebooted and verified the kernel:
Bash:
uname -r

✅ Now correctly shows 6.5.13-6-pve.
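For later reference, once a fixed kernel is available the pin can presumably be reverted with:
Bash:
proxmox-boot-tool kernel unpin
proxmox-boot-tool refresh   # re-sync the boot entries if needed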



I’ve since re-run all my backup scripts from a VM, and the Proxmox host is still stable and fully accessible. Everything seems back to normal.

Hope this helps someone who might run into similar behavior after a kernel upgrade!