e1000e eno1: Detected Hardware Unit Hang:

I just started looking into this issue and found this thread. I'm now testing the TSO workaround.
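For anyone else landing here, this is roughly what the TSO workaround looks like as a minimal sketch. The thread does not spell out the exact command, so treat the `tso off` form below as the commonly cited variant. `eno1` is taken from the log line in the title, and the `disable_tso` function and `RUN` dry-run switch are made up for illustration.

```shell
#!/bin/sh
# Sketch only: "eno1" comes from the log line in the thread title; the RUN
# dry-run switch and the function name are assumptions for illustration.

disable_tso() {
    # With RUN unset this only prints the command. Call with RUN= (empty)
    # as root to actually execute it.
    ${RUN-echo} ethtool -K "$1" tso off
}

disable_tso "${IFACE:-eno1}"
```

Settings made with `ethtool -K` do not survive a reboot, so the command has to be reapplied at boot if it helps. The current state can be checked with `ethtool -k eno1 | grep tcp-segmentation-offload`.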

I have three Dell Precision 3431s that use the Q246 C246 chipset. The first one I updated from PVE 8 to 9 started exhibiting this issue...more often when under network load like doing file transfers, but sometimes when it wasn't significant. The other two have not shown the problem...yet.

All are BIOS 1.32.0 (the latest when first deployed). 1.36.0 is out now, but details on fixes are pretty terse.
 
There is no "Q246" chipset, only a "C246". And as mentioned, the original bug should only still affect older chipsets/systems (see the list in my previous comment). So if the issue persists and was not fixed by a BIOS update, contact Dell. They should also be aware of this i219 erratum.
 

Oops, yes, that was a typo.
 
I have a few NUCs showing this behaviour. On two of them I have blacklisted the e1000e driver and use a different NIC, which seems to work fine.

What I have noticed, though, is that this only seems to happen on servers hosting VMs. I have one that only hosts LXCs and it seems to run forever. I migrated a VM onto it because I needed to perform some upgrades, and as soon as the VM had migrated across and started, the server crashed a few seconds or minutes later, with the notorious entry in the logs when I looked for a reason. By the time the server came back, the upgrades had already completed, so I migrated the VM back to the original server, which hosts LXCs and VMs with the module blacklisted. No issues whatsoever since.
 
I have just asked Claude.

The reply was:
Yes, almost certainly. Based on the Intel PRO/1000 PT Desktop Adapter identification, the chip under that heatsink is the Intel 82572GI controller, which uses the e1000e driver on Linux.
To confirm, if you have this card installed in a Linux system, you can run:

```
lspci | grep -i ethernet
```

Honestly, unless you have high throughput requirements on the card, I have seen no obvious impact from the workaround and would just apply it.
 
See Erratum Nr. 7: https://www.intel.com/content/dam/d...2571eb-82572ei-gbe-controller-spec-update.pdf
Just because it uses the "e1000e" driver does not necessarily mean that it is affected by this bug.
You have to check the Intel documentation for the specific controller. The problem might manifest itself slightly differently on those older chipsets. From what I saw, most people here saw it with i218/i219.
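To see which controller model is actually behind the e1000e driver before digging through the Intel spec updates, something like the following sketch may help. It pairs each Ethernet controller line from `lspci -nnk` with its "Kernel driver in use" line. The `nic_drivers` helper name is an assumption for illustration, and the parsing assumes the usual `lspci -k` layout, where each device header starts with the bus address and the driver line is indented below it.

```shell
#!/bin/sh
# Sketch only: nic_drivers is a hypothetical helper. It reads `lspci -nnk`
# output on stdin and prints "<controller line> => <driver>" per Ethernet NIC.

nic_drivers() {
    awk '
        /^[0-9a-fA-F]/ { dev = "" }              # a new device header starts
        /Ethernet controller/ { dev = $0 }       # remember the Ethernet device
        /Kernel driver in use:/ && dev != "" { print dev " => " $NF }
    '
}

# Only query the hardware if lspci is installed.
if command -v lspci >/dev/null 2>&1; then
    lspci -nnk | nic_drivers
fi
```

The model name in the controller line (for example an I218/I219 variant versus an 82571/82572 part) is what to look up in the matching Intel specification update.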
 