Intel NIC 'hangs' since last update

DerekG

Active Member
Mar 30, 2021
Hi all,

Can anyone advise on the error below?

I'm running the following version of Proxmox:

Kernel Version - Linux 5.4.119-1-pve #1 SMP PVE 5.4.119-1 (Tue, 01 Jun 2021 15:32:00 +0200)

PVE Manager Version - pve-manager/6.4-8/185e14db

And it seems like since the last update my Intel NIC hangs with the following errors:

Jun 14 12:43:13 pve-1 kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
TDH <2>
TDT <45>
next_to_use <45>
next_to_clean <1>
buffer_info[next_to_clean]:
time_stamp <101dbc441>
next_to_watch <2>
jiffies <101dbc5a8>
next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <7800>
PHY Extended Status <3000>
PCI Status <10>
Jun 14 12:43:15 pve-1 kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
TDH <2>
TDT <45>
next_to_use <45>
next_to_clean <1>
buffer_info[next_to_clean]:
time_stamp <101dbc441>
next_to_watch <2>
jiffies <101dbc7a1>
next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <7800>
PHY Extended Status <3000>
PCI Status <10>

I don't remember this occurring before, and the hang seems to depend on network load, but I can't find the cause of the error.
The NIC does reconnect after maybe 5-10 seconds, but in the meantime, whatever the guests are working on fails.
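
One way to check whether the hangs really track network load is to count them per minute and compare against traffic graphs. Below is a minimal Python sketch that assumes the log lines look exactly like the ones quoted above (for example as saved from `journalctl -k`). The function name and the regular expression are my own illustration, not part of any Proxmox tooling.

```python
import re
from collections import Counter

# Matches only the first line of each hang report, e.g.
# "Jun 14 12:43:13 pve-1 kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:"
# The layout is assumed from the log excerpt in this post.
HANG_RE = re.compile(
    r"^(?P<minute>\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+\S+\s+kernel:\s+"
    r"e1000e\s+\S+\s+(?P<iface>\S+):\s+Detected Hardware Unit Hang"
)

def count_hangs_per_minute(lines):
    """Return a Counter keyed by (interface, 'Mon DD HH:MM')."""
    counts = Counter()
    for line in lines:
        match = HANG_RE.match(line)
        if match:
            counts[(match.group("iface"), match.group("minute"))] += 1
    return counts

# The two hang reports from the excerpt above, plus one detail line
# that the pattern is expected to ignore.
sample = [
    "Jun 14 12:43:13 pve-1 kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:",
    "TDH <2>",
    "Jun 14 12:43:15 pve-1 kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:",
]

for (iface, minute), hits in sorted(count_hangs_per_minute(sample).items()):
    print(iface, minute, hits)
```

Minutes with many hits can then be lined up against the host's network graphs to see whether they coincide with heavy transfers.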

It's possible the NIC is faulty, but I can't swap it to test, as it's built into an SFF mini-PC.

Any advice would be much appreciated.

All the best.

Derek
 
Hi all,

Just to update this post.

I switched to another (USB) NIC last night (after changing cables and switch ports), and have had zero dropped connections since the change. So unfortunately it looks like my Intel NIC might be the root cause of my problems.

I'll investigate further but all signs are now pointing to a defective NIC.

All the best

Derek
 
Hm - there are a few reports about problematic Intel (mostly e1000) NICs - e.g. https://forum.proxmox.com/threads/e1000-driver-hang.58284/

You could try disabling the hardware offloading features of the NIC with ethtool (this should also be explained in that thread).
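
For reference, a rough sketch of what that could look like, written as a small Python helper that builds the `ethtool -K` call. The interface name `eno1` comes from the log above. Which offloads to turn off (TSO, GSO and GRO here) is an assumption on my part, so check the linked thread for what worked for others. The helper names are made up for illustration, and it only prints the command unless `dry_run=False` is passed.

```python
import shlex
import subprocess

def offload_off_command(iface, features=("tso", "gso", "gro")):
    """Build an `ethtool -K <iface> <feature> off ...` argument list.

    The default feature list is an assumption, not a confirmed fix.
    """
    cmd = ["ethtool", "-K", iface]
    for feature in features:
        cmd += [feature, "off"]
    return cmd

def apply(cmd, dry_run=True):
    """Print the command, or run it (needs root) when dry_run is False."""
    if dry_run:
        print(shlex.join(cmd))
        return None
    return subprocess.run(cmd, check=True)

cmd = offload_off_command("eno1")
apply(cmd)  # dry run: only prints the ethtool command line
```

Settings applied this way do not survive a reboot. If they help, the same ethtool command can be added as a `post-up` line for the interface in /etc/network/interfaces.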

I hope this helps!

Thanks a bunch for pointing me to that thread. It's the exact issue I'm experiencing, although I haven't had time to test the solutions yet.

All the best

Derek Gilzean
 