e1000e eno1: Detected Hardware Unit Hang:

Started getting this error recently. The whole machine crashes, and all containers become non-functional and unreachable.
I second this. After updating to Proxmox 8.4.1, my server started going offline. The host still appears to function to some degree, but its network goes down. I applied the following and will post back to confirm whether it resolves the network hang.

root@proxmox:~# ethtool -K eno1 tx off rx off
Actual changes:
tx-checksum-ip-generic: off
tx-tcp-segmentation: off [not requested]
tx-tcp6-segmentation: off [not requested]
rx-checksum: off
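
To double-check which offloads are still active after a change like this, the lowercase `ethtool -k` query can be filtered. A minimal sketch: the helper name `list_enabled_offloads` is made up for illustration, and `eno1` is just the interface name used in this thread.

```shell
#!/bin/sh
# Hypothetical helper: reads `ethtool -k <iface>` output on stdin and prints
# the name of every feature whose state starts with "on".
list_enabled_offloads() {
    awk -F': ' '$2 ~ /^on/ { sub(/^[ \t]+/, "", $1); print $1 }'
}

# On the host (not run here):
#   ethtool -k eno1 | list_enabled_offloads
```

If the features you switched off no longer appear in the list, the change took effect for the running system.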
 
Same for me. I suddenly found I could not access my Proxmox host via the network or locally. Luckily I was home and could restart it manually, but now I'm worried that this might happen when I'm not at home.

Code:
May 26 06:40:46 proxmox kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
                                  TDH                  <c1>
                                  TDT                  <26>
                                  next_to_use          <26>
                                  next_to_clean        <c0>
                                buffer_info[next_to_clean]:
                                  time_stamp           <191cc38c1>
                                  next_to_watch        <c1>
                                  jiffies              <191cc5a00>
                                  next_to_watch.status <0>
                                MAC Status             <80083>
                                PHY Status             <796d>
                                PHY 1000BASE-T Status  <3800>
                                PHY Extended Status    <3000>
                                PCI Status             <10>

Edit: 6 hours later it happened again, so I installed the workaround via the link to the community script. Will see whether this works better.

Just a thought, and sorry for being new: will this be fixed in a coming update? Will I need to remove this script in the future?

Edit 2: almost 24 hours later, no issues yet. It is still running and all seems OK. No errors in the logs either.
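
For anyone else watching their logs after applying a workaround, a quick check is to count the hang reports in the kernel log. A minimal sketch: `count_unit_hangs` is a made-up helper name, and the match string is copied from the log lines posted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: counts e1000e hang reports in kernel log text on stdin.
count_unit_hangs() {
    # grep -c prints 0 and exits non-zero when nothing matches; `|| true`
    # keeps the exit status clean when used under `set -e`.
    grep -c 'Detected Hardware Unit Hang' || true
}

# On the host (not run here), for the current boot:
#   journalctl -k -b | count_unit_hangs
```

A count of 0 since the last reboot suggests the workaround is holding, though it does not prove the bug is gone.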
 
I am closely following this bug in the Proxmox Bugzilla because it affects my PBS servers, but that report is specific to kernel version 6.8.12-9-pve and above. I am also in charge of Proxmox VE 7.4-20 nodes still running kernel 5.15.158-2, which also seem to be affected by the NIC bug.

Would anyone know which 5.15 kernel version is not affected, so I could revert to it, as I have done with my PBS? I think it would be version 5.15.152-1-pve, but I am not completely sure.
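
To see quickly whether a node is at or above the 6.8 release named in the Bugzilla report, the running kernel can be compared with GNU `sort -V`. A sketch only: `kernel_at_least` is a hypothetical helper, and the 6.8.12-9-pve threshold is taken from this thread, not from a confirmed list of affected versions. It says nothing about the 5.15 series.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when kernel release $1 is at or above $2,
# using GNU sort's version ordering (sort -V).
kernel_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Threshold is an assumption based on the report mentioned in this thread.
if kernel_at_least "$(uname -r)" "6.8.12-9-pve"; then
    echo "running kernel is at or above 6.8.12-9-pve"
else
    echo "running kernel is below 6.8.12-9-pve"
fi
```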
 
I updated to kernel version 6.14.5-1-bpo12-pve today and had my NIC hang for the first time. I can't remember which version of 6.8 I was on prior to the update.
 
Reverting back to version 6.8.12-8-pve should solve the issue.
I just disabled all of the offloading as mentioned in the Bugzilla link you provided. I wanted to give the newer kernel a shot because of the `VFIO_MAP_DMA failed: Invalid argument` errors I've been getting on VMs with PCIe passthrough.
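
If you go the rollback route instead, Proxmox can pin an installed kernel with `proxmox-boot-tool`. A sketch only: the `run` and `pin_kernel` helpers are hypothetical, and the block just prints the commands by default (set `DRY_RUN=0` to actually execute them). It assumes 6.8.12-8-pve is still installed on the node.

```shell
#!/bin/sh
# Dry-run wrapper (hypothetical): prints each command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Pin a kernel that is already installed so it is booted by default.
pin_kernel() {
    run proxmox-boot-tool kernel list      # confirm the version is present
    run proxmox-boot-tool kernel pin "$1"  # make it the default boot entry
}

pin_kernel "6.8.12-8-pve"
# Later, once a fixed kernel is out:  proxmox-boot-tool kernel unpin
```

A reboot is needed before the pinned kernel is actually running.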
 
Just had this error today. My Proxmox runs on my old gaming PC parts (an ASUS motherboard with an onboard Ethernet port), not a NUC. I just ran an apt update (I'm not a Proxmox subscriber, so I used the no-subscription repos), so we'll see. It's on 6.8.12-11-pve now.
 
I hit the issue a couple of times today (after about a week without problems) on a Lenovo M920x Tiny with PVE `v8.4-1` and kernel `6.8.12-11-pve`. The NIC is an `Intel Ethernet I219-LM`.

```
[ 4458.456540] e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
TDH <27>
TDT <9b>
next_to_use <9b>
next_to_clean <26>
buffer_info[next_to_clean]:
time_stamp <1003f33fb>
next_to_watch <27>
jiffies <1003f7500>
next_to_watch.status <0>
MAC Status <80083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
```

Btw, unplugging the UTP cable and plugging it back in gets the NIC working again.
 
Experienced this bullshit today...

Code:
Jun 13 22:39:42 pve-h kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
                                TDH                  <31>
                                TDT                  <61>
                                next_to_use          <61>
                                next_to_clean        <30>
                              buffer_info[next_to_clean]:
                                time_stamp           <110bb8e9a>
                                next_to_watch        <31>
                                jiffies              <110bbcf40>
                                next_to_watch.status <0>
                              MAC Status             <80083>
                              PHY Status             <796d>
                              PHY 1000BASE-T Status  <7800>
                              PHY Extended Status    <3000>
                              PCI Status             <10>
                              
# uname -a
Linux pve-h 6.8.12-11-pve #1 SMP PREEMPT_DYNAMIC PMX 6.8.12-11 (2025-05-22T09:39Z) x86_64 GNU/Linux
 
Hello all! This started for me about 3 days ago which is right around when I started upgrading all my nodes. I have opted for a temporary fix while I wait to see the results: `ethtool -K eno1 tso off gso off gro off sg off`. An annoying problem but I'm glad to see that I'm not alone.
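
Note that `ethtool -K` settings are runtime-only and are lost on reboot. One common way to reapply them is a `post-up` line in the interface's stanza in `/etc/network/interfaces`. A minimal sketch that only prints such a line: `post_up_line` is a hypothetical helper, and the `/usr/sbin/ethtool` path is an assumption (check with `command -v ethtool`).

```shell
#!/bin/sh
# Hypothetical helper: prints a post-up line for the matching `iface` stanza
# in /etc/network/interfaces. The ethtool path is an assumption.
post_up_line() {
    printf '\tpost-up /usr/sbin/ethtool -K %s tso off gso off gro off sg off\n' "$1"
}

post_up_line eno1
```

Paste the printed line under the `iface eno1 ...` stanza by hand and review it before reloading the network; the helper deliberately does not edit the file.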

Side note for anyone running a "mesh" cluster... your cluster network does not care about the state of your nodes to the rest of the network. Egg on my face trying to explain how my HA 3-node cluster was unresponsive and still reporting "healthy"... will be adding a backup NIC to all nodes ASAP.
 
Same here: :mad:
Jun 18 05:53:14 pve1 kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
TDH <ca>
TDT <e4>
next_to_use <e4>
next_to_clean <c9>
buffer_info[next_to_clean]:
time_stamp <106e21ded>
next_to_watch <ca>
jiffies <106e23000>
next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3c00>
PHY Extended Status <3000>
PCI Status <10>

I'm trying this workaround now and will post my results in the next few days:
https://gist.github.com/crypt0rr/60aaabd4a5c29a256b4f276122765237
 
I think the only real solution is to roll back your kernel, though I don't know to which version. I tried turning a few things off:
Bash:
ethtool -K eno1 tso off gso off gro off sg off
... but this didn't change anything. Had another failure yesterday and oddly, my "active-backup" bond didn't detect the failure either. The only way I can reliably prevent this from dying is a round-robin bond with a second NIC (or just permanently change the NIC).

Hoping to see a new kernel fix soon - my server's reputation for reliability has taken a beating these past couple weeks!
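
One possible explanation for the bond not failing over, not confirmed anywhere in this thread: a bond that only watches link state will not switch if the hung NIC still reports link up. The `MAC Status` values in the logs posted here can be decoded to check that. A sketch under a stated assumption: the bit layout (0x01 full duplex, 0x02 link up, 0x40 100 Mb/s, 0x80 1000 Mb/s) is taken from the e1000e driver's STATUS register defines, and `decode_mac_status` is a made-up helper.

```shell
#!/bin/sh
# Hypothetical decoder for the "MAC Status" value in the hang report.
# Assumed bit layout: 0x01 full duplex, 0x02 link up,
# 0x40 100 Mb/s, 0x80 1000 Mb/s.
decode_mac_status() {
    status=$((0x$1))
    if [ $((status & 0x02)) -ne 0 ]; then link=up; else link=down; fi
    if [ $((status & 0x01)) -ne 0 ]; then duplex=full; else duplex=half; fi
    if [ $((status & 0x80)) -ne 0 ]; then speed=1000
    elif [ $((status & 0x40)) -ne 0 ]; then speed=100
    else speed=10; fi
    echo "link=$link duplex=$duplex speed=$speed"
}

decode_mac_status 80083
decode_mac_status 40080083
```

Both values seen in this thread decode to link up, which would fit a link-state monitor seeing nothing wrong while traffic is stalled.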
 
I have been experiencing the same issue recently, starting after the update. Any news?
 
Update on the workaround I posted above (https://gist.github.com/crypt0rr/60aaabd4a5c29a256b4f276122765237): so far, no more crashes or hardware hangs. :)
 
Y’all wanna hear a funny story?


This happened to me for the first time ever—as I was boarding a plane to Europe for a two-week vacation!


I’ve got Pi-hole running on my server, which also handles DHCP. So every time Proxmox VE (PVE) would drop off the network, everything else—pet cameras, feeders, thermostat, Ring, etc.—would appear offline. Because of the timing, I genuinely thought we were being targeted and robbed. Totally my fault for doing an update right before leaving...


While I was gone, I kept having my pet sitters and friends restart the router whenever they were over, just so I could try to diagnose remotely what the heck was going on. Rebooting the router helped, so I knew it was a network-related issue. But there’s only so much you can do over VPN from across the world, and I really didn’t feel like doing “work” while I was supposed to be relaxing.


Eventually, I just accepted the situation and decided to deal with it when I got back.


Well—I’m back now and finally figured it out. Big thanks to everyone here for sharing your experiences and documenting your fixes. I’ll be trying the solution mentioned above. (running kernel: 6.8.12-11-pve)


P.S. In the future, if I were to upgrade the physical NIC, what’s the consensus on a good one?
Dell Precision 7810 using the onboard NIC right now

root@prox-precision:~# lspci -nn | grep Ethernet
00:19.0 Ethernet controller [0200]: Intel Corporation Ethernet Connection I217-LM [8086:153a] (rev 05)
 
I experienced problems with the e1000 driver and the onboard NIC on my ancient motherboard. I gave up using it for a VLAN trunk and resorted to the other built-in NIC (Atheros).
I eventually disabled both onboard NICs and purchased a secondhand Intel i350 (2 port version). If you are happy with gigabit networking, these and the 4 port versions are plentiful on ebay. Many are supposedly fake, but I reasoned if they are working 5+ years later, even the purported fakes should be OK. Not a single issue or driver warning and I got the unexpected bonus of an extra thermal sensor!
 
Still having this issue. I thought a backup job was causing the hang, but it only affects my 2nd node. I'm going to try `ethtool -K eno1 gso off tso off rxvlan off txvlan off gro off tx off rx off sg off` and see if that helps. The issue still persists on Proxmox Virtual Environment 8.4.5 :( - RH
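
Since the interface is not named `eno1` on every node, it can help to look up which interfaces actually use the e1000e driver before applying the command. A minimal sketch: `list_e1000e_ifaces` and the `SYSFS_ROOT` override are hypothetical, and the loop only prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: lists network interfaces whose kernel driver is e1000e.
# SYSFS_ROOT exists only so the function can be pointed at a test directory.
list_e1000e_ifaces() {
    root="${SYSFS_ROOT:-/sys}"
    for dev in "$root"/class/net/*; do
        drv=$(readlink "$dev/device/driver" 2>/dev/null) || continue
        [ "$(basename "$drv")" = "e1000e" ] && basename "$dev"
    done
    return 0
}

# Print (rather than run) the ethtool command for each match.
for ifc in $(list_e1000e_ifaces); do
    echo "ethtool -K $ifc gso off tso off rxvlan off txvlan off gro off tx off rx off sg off"
done
```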
 
Disabling TX/RX checksum offload as posted earlier in the thread (`ethtool -K eno1 tx off rx off`) resolved the issue on my end.