Network failure with kernel 6.8.12-11-pve and Intel NIC (e1000e): host unreachable after backup job

heliospeed

New Member
Jul 10, 2025
Hi,

I’m running Proxmox VE 8.4.1 and have been experiencing repeated network loss with kernel 6.8.12-11-pve. After a large Docker backup job runs from a VM, the host’s network suddenly becomes unreachable (ping reports “host is down”) and I have to power-cycle the server.

This only happens with kernel 6.8.12-11-pve. When I switch back to kernel 6.5.13-6-pve, the problem disappears entirely.

I’m using an Intel NIC with the e1000e driver. Here is the output of lspci -nn | grep -i eth:
Bash:
00:19.0 Ethernet controller [0200]: Intel Corporation 82579LM Gigabit Network Connection [8086:1502] (rev 04)
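For anyone comparing setups, the PCI address from lspci can be mapped to the kernel’s interface name through sysfs; a quick check, assuming a standard sysfs layout:
Bash:
ls /sys/bus/pci/devices/0000:00:19.0/net
# prints the interface name bound to this NIC; in my case it is eno1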

During the failure, dmesg shows repeated messages like:
Bash:
e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
  TDH, TDT, next_to_use, next_to_clean...

I also disabled GRO/GSO/TSO offloading, with no effect:
Bash:
/sbin/ethtool -K eno1 gro off gso off tso off
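To persist these settings across reboots, one option is to re-apply them from ifupdown whenever the interface comes up. A minimal sketch, assuming eno1 is the affected port:
Code:
# /etc/network/interfaces (excerpt)
iface eno1 inet manual
        # re-apply offload settings each time the interface is brought up
        post-up /sbin/ethtool -K eno1 gro off gso off tso off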

This issue started appearing shortly after recent kernel upgrades.

Here’s the output of pveversion -v:
Bash:
proxmox-ve: 8.4.0 (running kernel: 6.5.13-6-pve)
pve-manager: 8.4.1/2a5fa54a8503f96d
...
proxmox-kernel-6.8.12-11-pve-signed: 6.8.12-11
proxmox-kernel-6.8.12-1-pve-signed: 6.8.12-1
proxmox-kernel-6.5.13-6-pve-signed: 6.5.13-6
...

Could this be a regression in the e1000e driver in the 6.8 kernel series?

Is there a recommended workaround or patch? I can stick to 6.5.13-6-pve for now, but I’d like to know if this is being tracked.

Thanks in advance!
 
Hello heliospeed,

This also started happening on my home lab, without a backup run involved.

Code:
Jul 09 21:08:07 pve kernel: e1000e 0000:00:19.0 enp0s25: Detected Hardware Unit Hang:
                                         TDH                  <5e>
                                         TDT                  <e4>
                                         next_to_use          <e4>
                                         next_to_clean        <5d>
                                       buffer_info[next_to_clean]:
                                         time_stamp           <1002f2c10>
                                         next_to_watch        <5e>
                                         jiffies              <1002f4480>
                                         next_to_watch.status <0>
                                       MAC Status             <40080083>
                                       PHY Status             <796d>
                                       PHY 1000BASE-T Status  <3800>
                                       PHY Extended Status    <3000>
                                       PCI Status             <10>

After this event, the network went offline and all Samba-hosted VM drives were disconnected. I was unable to reach the PVE web UI, SSH in, or otherwise reach the server at all until a reboot.

I have a spare network card in the machine, so I have now unplugged and disabled the onboard card in question to avoid any further crashes. I will keep you posted on whether this resolves it.
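For anyone switching ports the same way, it mostly comes down to pointing vmbr0 at the replacement NIC in /etc/network/interfaces. A minimal sketch; enp1s0 and the addresses are placeholders, check your actual port name with ip link:
Code:
auto vmbr0
iface vmbr0 inet static
        address 192.168.1.10/24      # placeholder address
        gateway 192.168.1.1          # placeholder gateway
        bridge-ports enp1s0          # was enp0s25, the hanging e1000e port
        bridge-stp off
        bridge-fd 0
Then ifreload -a (with ifupdown2) or a reboot applies the change.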
 
Hi,

Same issue here; I’ve run into exactly the same behavior recently.
After this happened, the server became completely unreachable (no SSH, no Proxmox web UI). I had to plug in a keyboard and monitor directly to the machine to force a reboot.

For now, I’ve switched back to an older kernel (6.5.13-6-pve), and everything seems stable again — no more network hangs or crashes so far.

I’ll continue monitoring, but it definitely looks like the issue is related to the newer kernel.

I also restarted my backup jobs over NFS, which are launched from a VM running on this same Proxmox host, and there was no crash this time.

Thanks for sharing your case — very helpful.
 
I do have a PBS instance installed with a mount to a second NAS I use for personal files and backups. I think I may wait 24-48 hours before I try to reinstate backups.
 
I have the same issue, without any backups running, on a Lenovo ThinkCentre M702Q with Proxmox VE 8.4.

The network card stops working after a few hours (anywhere between 4 and 24) and Proxmox becomes unusable and unreachable :(

Code:
kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
                                         TDH                  <d5>
                                         TDT                  <e0>
                                         next_to_use          <e0>
                                         next_to_clean        <d4>
                                       buffer_info[next_to_clean]:
                                         time_stamp           <1013443e2>
                                         next_to_watch        <d5>
                                         jiffies              <10134e500>
                                         next_to_watch.status <0>
                                       MAC Status             <40080083>
                                       PHY Status             <796d>
                                       PHY 1000BASE-T Status  <3800>
                                       PHY Extended Status    <3000>
                                       PCI Status             <10>

Code:
modinfo e1000e | grep version
srcversion:     F65CFC0DF4BFD42B230512D
vermagic:       6.8.12-9-pve SMP preempt mod_unload modversions
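To double-check which driver and firmware the running kernel has actually bound to the port (eno1 in the log above), something like:
Bash:
uname -r          # kernel currently running
ethtool -i eno1   # driver, version and firmware-version of the NIC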

Any idea how to solve this? It is pretty critical to have the network interface running.
 
Two options: boot into the previous kernel, or use another network card in the device. I disabled my Intel e1000e network card and used the 4-port card in my PCI slot instead.
 
Hi everyone,

I’d like to share a recent issue I encountered, in case it helps others.

Yesterday, I experienced a complete system freeze while copying files. I had previously attempted to boot into an older kernel (6.5.13-6-pve) using the following command:

Bash:
sudo grub-reboot "1>4" && sudo systemctl reboot
The "1>4" referred to the GRUB entry: Advanced options > kernel 6.5.13-6-pve. I believed this would force a one-time boot into the selected kernel, but after reboot, I realized the system was still running the latest kernel (6.8.12-11-pve). Most likely, a recent Proxmox upgrade had pulled in the new kernel, and I didn’t verify which one was actually running.

To ensure I was booting into the correct version and prevent further issues, I took the following steps:
  1. Checked the available kernels:
Bash:
proxmox-boot-tool kernel list
  2. Pinned the desired kernel (to make it persistent):
Bash:
proxmox-boot-tool kernel pin 6.5.13-6-pve
⚠️ This command tries to run update-grub, which failed on my system due to a missing binary. I resolved it with:
Bash:
ln -s /usr/sbin/grub-mkconfig /sbin/update-grub

  3. Another minor issue:
The script /etc/grub.d/000_proxmox_boot_header was failing because it couldn’t find proxmox-boot-tool. I just used the full path (/usr/sbin/proxmox-boot-tool) to avoid that error.
  4. Rebooted and verified the kernel:
Bash:
uname -r

✅ Now correctly shows 6.5.13-6-pve.
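For later reference, once a fixed kernel is available the pin can presumably be reverted with:
Bash:
proxmox-boot-tool kernel unpin
proxmox-boot-tool refresh   # re-sync the boot entries if needed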



I’ve since re-run all my backup scripts from a VM, and the Proxmox host is still stable and fully accessible. Everything seems back to normal.

Hope this helps someone who might run into similar behavior after a kernel upgrade!