[SOLVED] Intel NIC e1000e hardware unit hang

^ Out of curiosity, neponn, can you post your Hardware Unit Hang error from today so we can see whether it differs from the earlier one?

I can't find any documentation for the MAC or PHY status values, but with enough samples they might lead us down the right path. I'm wondering whether even a single bit in the MAC or PHY status could indicate which settings need to change.
Sorry for the delay. Here is the second bout of error messages; it looks the same to me:

Code:
Oct 15 05:39:44 proxmox kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
                                  TDH                  <ee>
                                  TDT                  <cd>
                                  next_to_use          <cd>
                                  next_to_clean        <ed>
                                buffer_info[next_to_clean]:
                                  time_stamp           <107b87fd0>
                                  next_to_watch        <ee>
                                  jiffies              <107b88b00>
                                  next_to_watch.status <0>
                                MAC Status             <40080083>
                                PHY Status             <796d>
                                PHY 1000BASE-T Status  <3c00>
                                PHY Extended Status    <3000>
                                PCI Status             <10>
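
To make reports like the one above easier to compare between hangs, here is a minimal Python sketch that parses the `name <value>` fields and lists which bits are set in the MAC status. It assumes the bracketed values are hexadecimal (they contain digits such as `ee` and `cd`), and the names `parse_hang_report` and `summarize` are made up for this example. It deliberately assigns no meaning to individual bits, and it does not convert jiffies to seconds because the kernel tick rate is not given in the thread.

```python
import re

# Fields copied from the report posted above (timestamp/prefix line omitted).
HANG_REPORT = """\
  TDH                  <ee>
  TDT                  <cd>
  next_to_use          <cd>
  next_to_clean        <ed>
buffer_info[next_to_clean]:
  time_stamp           <107b87fd0>
  next_to_watch        <ee>
  jiffies              <107b88b00>
  next_to_watch.status <0>
MAC Status             <40080083>
PHY Status             <796d>
PHY 1000BASE-T Status  <3c00>
PHY Extended Status    <3000>
PCI Status             <10>
"""

# Matches "some name   <hexvalue>"; lines without <...> are skipped.
FIELD_RE = re.compile(r"^\s*(?P<name>[^<]+?)\s*<(?P<value>[0-9a-fA-F]+)>\s*$")


def parse_hang_report(text):
    """Return a dict mapping field name -> integer value (parsed as hex)."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            fields[match.group("name")] = int(match.group("value"), 16)
    return fields


def summarize(fields):
    """Derive a few numbers that are handy when diffing two reports."""
    mac = fields["MAC Status"]
    return {
        # Ticks between the buffer's time stamp and the moment of the report.
        "jiffies_since_time_stamp": fields["jiffies"] - fields["time_stamp"],
        # Whether the transmit head and tail indices differ in this report.
        "head_differs_from_tail": fields["TDH"] != fields["TDT"],
        # Positions of the bits that are set in the MAC status word.
        "mac_status_bits_set": [b for b in range(32) if (mac >> b) & 1],
    }


if __name__ == "__main__":
    for key, value in summarize(parse_hang_report(HANG_REPORT)).items():
        print(key, value)
```

Running the same functions over each new report and diffing the resulting dicts would show quickly whether any status bit changes from one hang to the next.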

Great to hear that you have had success in changing to the e1000e driver. I found that using tso off gso off gro off (still with virtio drivers) gave me a few more days of stability (and no hangs). But then I chickened out and switched to the Realtek RTL8153 based USB adapter I mentioned above. It performs just as well as the internal NIC, and - so far - no hangs.
 
Humph. Another hang. Once again, not much network activity at the time. So tso off isn't enough. Now trying tso off gso off gro off. USB ethernet adapter on the way (Axagon ADE-SR).
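
For anyone wanting to try the same offload settings, here is a small Python sketch that builds the `ethtool -K` command for them. The interface name `eno1` is taken from the log earlier in the thread; the helper name `offload_command` and the `--apply` switch are made up for this example. By default it only prints the command; it runs `ethtool` (root required) only when `--apply` is passed. Settings made this way do not survive a reboot unless you reapply them from your network configuration.

```python
import subprocess
import sys


def offload_command(interface, features=("tso", "gso", "gro")):
    """Build the ethtool argument list that switches the given offloads off."""
    cmd = ["ethtool", "-K", interface]
    for feature in features:
        cmd += [feature, "off"]
    return cmd


if __name__ == "__main__":
    cmd = offload_command("eno1")  # interface name as seen in the log above
    if "--apply" in sys.argv:
        # Actually change the NIC settings; needs root privileges.
        subprocess.run(cmd, check=True)
    else:
        # Dry run: just show what would be executed.
        print(" ".join(cmd))
```

`ethtool -k eno1` (lower-case k) shows the current offload state, which is useful for confirming the change took effect.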

Out of curiosity - has anyone found a way to reset the hung e1000 controller without rebooting the system? Wondering whether a supervisor script could be developed that detected a hang and took action to reinstate the controller? Given how infrequently hangs occur, this could be workable...
I managed to reset the hung controller by just unplugging the ethernet cable and replugging it after a few seconds. The network comes back up after another few seconds.
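
As a starting point for the supervisor idea, here is a rough Python sketch that watches kernel log lines for the hang message and bounces the interface in software. Big assumption: nobody in this thread has confirmed that an `ip link` down/up cycle clears the hang the way physically replugging the cable does, so treat the reset step as untested. The function names, the `--watch` switch, and the 60 second cooldown are all made up for this example.

```python
import subprocess
import sys
import time

HANG_MARKER = "Detected Hardware Unit Hang"


def is_hang_line(line, driver="e1000e"):
    """True if a kernel log line is the driver's hang report header."""
    return driver in line and HANG_MARKER in line


def interface_from_line(line):
    """Pull the interface name (e.g. 'eno1') out of the hang header line."""
    head = line.split(": " + HANG_MARKER)[0]
    return head.split()[-1]


def reset_commands(interface):
    """Commands for a software link bounce (ASSUMED, untested, substitute
    for unplugging and replugging the cable)."""
    return [
        ["ip", "link", "set", "dev", interface, "down"],
        ["ip", "link", "set", "dev", interface, "up"],
    ]


def watch(lines, cooldown=60.0, pause=5.0):
    """Read log lines and bounce the interface when a hang is reported."""
    last_reset = 0.0
    for line in lines:
        if not is_hang_line(line):
            continue
        now = time.monotonic()
        if last_reset and now - last_reset < cooldown:
            continue  # the driver repeats the message; avoid reset loops
        last_reset = now
        down, up = reset_commands(interface_from_line(line))
        subprocess.run(down, check=True)
        time.sleep(pause)  # mimic the "few seconds" of the manual replug
        subprocess.run(up, check=True)


if __name__ == "__main__":
    if "--watch" in sys.argv:
        # Intended use (as root): journalctl -kf | python3 supervisor.py --watch
        watch(sys.stdin)
    else:
        print("pass --watch and pipe kernel log lines on stdin")
```

If the software bounce turns out not to help, the detection half is still usable on its own, for example to send an alert instead of resetting.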

It just happened again today; the previous hang was 5 days ago. Going to try a few of the workarounds suggested here and observe.