e1000 driver hang

tssge

Quote:
Just to report back here... Since adding the "ethtool -K eno1 tso off gso off" to postup (about a week ago), I haven't had any further occurrences of the "Detected Hardware Unit Hang" issue... So it looks like only "tso off gso off" are required and not all the other parameters
Yes. For example, disabling VLAN offload is required only if VLANs are used. The same logic applies to the other features as well: if you never receive UDP traffic, a UDP offload won't trigger the bug in your card.

But then again if you're not using VLANs, there's very little sense in keeping the offload on anyways.
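
For anyone wanting to try the same fix, here is a minimal sketch of how the command from the quote could be turned into a post-up line. It assumes the NIC is `eno1` (as in the quote) and that the line goes under the matching `iface` stanza in /etc/network/interfaces; the helper name `offload_cmd` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build the ethtool command that turns off the
# given offload features on one interface.
# usage: offload_cmd IFACE FEATURE...
offload_cmd() {
    iface=$1
    shift
    cmd="ethtool -K $iface"
    for feature in "$@"; do
        cmd="$cmd $feature off"
    done
    printf '%s\n' "$cmd"
}

# Print (not apply) the line one would place under the eno1 stanza
# in /etc/network/interfaces (assumed location).
printf 'post-up %s\n' "$(offload_cmd eno1 tso gso)"
```

Adding more features is just a matter of passing more names, e.g. `offload_cmd eno1 tso gso rxvlan txvlan` if VLANs are in use.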
 

Hyacin

Absolute weirdness for me. I have three identical NUC10i3FNs running identical versions of PVE. Setting tso off gso off rxvlan off txvlan off fixed one of the two that were acting up (I use VLAN aware). On the other one, which is still acting up, I've now extended it to rxvlan off txvlan off gso off gro off tso off tx off rx off. And on the third, I haven't had any issues at all with everything left on!!

So, FYI to everyone coming across this in the future: it's not a one-size-fits-all silver bullet. Either turn it all off at the start if you want to nip it in the bud as quickly as possible (and suffer the full weight of doing all that without hardware assist), or go bit by bit until it stops!
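
A rough sketch of the bit-by-bit approach, for illustration only. The three steps reuse the feature combinations mentioned in this thread, but how they are grouped into steps is an assumption, and `features_for_step`, `IFACE` and `STEP` are made-up names. The script only prints the command so it can be reviewed before running it.

```shell
#!/bin/sh
# Feature sets to disable, from least to most aggressive.
# The grouping into steps is an assumption, not a tested recipe.
features_for_step() {
    case $1 in
        1) echo "tso gso" ;;
        2) echo "tso gso rxvlan txvlan" ;;
        3) echo "tso gso rxvlan txvlan gro tx rx" ;;
        *) echo "unknown step: $1" >&2; return 1 ;;
    esac
}

# Placeholders: override with e.g. IFACE=eno1 STEP=2 sh script.sh
IFACE=${IFACE:-eno1}
STEP=${STEP:-1}

# Build the argument list "<feature> off ..." for the chosen step.
args=""
for f in $(features_for_step "$STEP"); do
    args="$args $f off"
done

# Print the command instead of running it.
echo "ethtool -K $IFACE$args"
```

Start at step 1, watch for hangs for a while, and only move to the next step if they come back.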

The most shocking thing to me though is that my 3rd box doesn't seem to need any of it to be turned off and it's plugging along just fine without a single hiccup.
 

tssge

Hyacin said:
The most shocking thing to me though is that my 3rd box doesn't seem to need any of it to be turned off and it's plugging along just fine without a single hiccup.
It seems that a certain kind of traffic triggers this issue. Once it's triggered, it'll continue to bother you until you restart the box. I have no idea what causes it specifically, but I am pretty sure that it takes a particular kind of packet to trigger it.

Some of my machines seem to stay on with no issues for quite some time, but eventually all of them develop this issue at some point.
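
Since the hang can show up long after boot, it may help to check the kernel log now and then. A minimal sketch, assuming the log contains the "Detected Hardware Unit Hang" text quoted earlier in the thread; `count_hangs` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: count lines on stdin that contain the hang message.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function's exit status clean in that case.
count_hangs() {
    grep -c 'Detected Hardware Unit Hang' || true
}

# Example use (needs permission to read the kernel log):
#   dmesg | count_hangs
```

A count that stays at 0 over a few weeks of normal traffic is a reasonable hint that the current offload settings are enough for that box.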
 

fireon

So strange. Same problem here, only on one interface of a dual NIC... after years with Proxmox on there. Maybe the BIOS update from Supermicro will fix this.
 

Hyacin

I went out and bought a few $15 Realtek-chip USB-C to GigE NICs, partly because I'd seen (I believe in this thread) that the performance of the onboard NIC takes a major hit with the offloading disabled, and partly because I'd like to keep my iSCSI traffic on its own link (very thankful I had an additional reason, lol):

Code:
root@NUC10i3FNH-3:/# ip link set vmbr0.10 down
root@NUC10i3FNH-3:/# scp -oBindAddress=172.24.0.14 testfile rob@172.24.0.55: # Onboard Intel NIC (with stuff disabled for stability)
rob@172.24.0.55's password:
testfile                                                                                                                                  100% 5000MB 110.2MB/s   00:45
root@NUC10i3FNH-3:/# ip link set vmbr0.10 up
root@NUC10i3FNH-3:/# ip link set vmbr1.10 down
root@NUC10i3FNH-3:/# scp -oBindAddress=172.24.0.12 testfile rob@172.24.0.55: # USB-C NIC
rob@172.24.0.55's password:
testfile                                                                                                                                  100% 5000MB 110.8MB/s   00:45
root@NUC10i3FNH-3:/#
Apparently the 10th gen i3 doesn't break a sweat doing the functions that were formerly offloaded to the NIC hardware.
o_O

Oh, and in the reverse direction -

Code:
root@NUC10i3FNH-3:/# scp -oBindAddress=172.24.0.12 rob@172.24.0.55:testfile . # USB-C NIC
rob@172.24.0.55's password:
testfile                                                                                                                                  100% 5000MB 108.0MB/s   00:46
root@NUC10i3FNH-3:/# ip link set vmbr1.10 up
root@NUC10i3FNH-3:/# ip link set vmbr0.10 down
root@NUC10i3FNH-3:/# scp -oBindAddress=172.24.0.14 rob@172.24.0.55:testfile . # Onboard Intel NIC (with stuff disabled for stability)
rob@172.24.0.55's password:
testfile                                                                                                                                  100% 5000MB 108.6MB/s   00:46
root@NUC10i3FNH-3:/#
 

mlrtime

What are the current recommended settings, and how are you all applying them? I currently have this in crontab:

@reboot /usr/sbin/ethtool -K vmbr0 gso off gro off tso off >> /tmp/ethtool.fix 2>&1
@reboot /usr/sbin/ethtool -K eno1 gso off gro off tso off >> /tmp/ethtool.fix 2>&1
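
One way to confirm after a reboot that the crontab lines actually took effect is to look at what `ethtool -k` reports. The sketch below assumes the long feature names that `ethtool -k` prints for tso, gso and gro; `still_on` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `ethtool -k IFACE` output on stdin and print
# the names of the three features from the crontab lines that are still on.
still_on() {
    grep -E '^[[:space:]]*(tcp-segmentation-offload|generic-segmentation-offload|generic-receive-offload): on' \
        | sed -e 's/^[[:space:]]*//' -e 's/:.*//'
}

# Example use:
#   ethtool -k eno1 | still_on
```

Empty output means all three are off; anything printed is a feature the @reboot job did not manage to disable.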
 
