e1000 driver hang

Code:
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!

auto lo
iface lo inet loopback

iface enp0s25 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.***.***.***/24
        gateway 192.***.***.*
        bridge-ports enp0s25
        bridge-stp off
        bridge-fd 0

source /etc/network/interfaces.d/*

So in my case, is it enp0s25 or vmbr0? And where do I place the line?

Thank you for the help. I hope to get Proxmox highly available again.
 
> So in my case it is enp0s25 or vmbr0? And where to place the line? (full post with config quoted above)
Correct, your iface is enp0s25.

Here is what my interfaces file looks like:
----------------------------------------------------------------------------------------------------------------------------



auto lo
iface lo inet loopback

iface eno1 inet manual
post-up ethtool -K eno1 gso off tso off rxvlan off txvlan off gro off tx off rx off sg off

auto vmbr0
iface vmbr0 inet static
address 192.168.0.2/24
gateway 192.168.0.1
bridge-ports eno1
bridge-stp off
bridge-fd 0


source /etc/network/interfaces.d/*

----------------------------------------------------------------------------------------------------------------------------

Hope this helps you.
I am also new to Proxmox; I've only been running it for a few months.
But I have been using VMware ESXi since it was called GSX in the early 2000s.

Have a great day!
 
"Same" problem here with kernel Proxmox VE GNU/Linux, with Linux 6.8.12-9-pve: after the monthly update the network became unavailable, with the syslog looping on:
Code:
2025-04-01T04:19:44.099279+02:00 kernel: [ 7197.924206] e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
2025-04-01T04:19:44.099300+02:00 kernel: [ 7197.924206]   TDH                  <ce>
2025-04-01T04:19:44.099303+02:00 kernel: [ 7197.924206]   TDT                  <e2>
2025-04-01T04:19:44.099304+02:00 kernel: [ 7197.924206]   next_to_use          <e2>
2025-04-01T04:19:44.099304+02:00 kernel: [ 7197.924206]   next_to_clean        <cd>
2025-04-01T04:19:44.099306+02:00 kernel: [ 7197.924206] buffer_info[next_to_clean]:
2025-04-01T04:19:44.099308+02:00 kernel: [ 7197.924206]   time_stamp           <1006936fb>
2025-04-01T04:19:44.099309+02:00 kernel: [ 7197.924206]   next_to_watch        <ce>
2025-04-01T04:19:44.099310+02:00 kernel: [ 7197.924206]   jiffies              <100694041>
2025-04-01T04:19:44.099311+02:00 kernel: [ 7197.924206]   next_to_watch.status <0>
2025-04-01T04:19:44.099312+02:00 kernel: [ 7197.924206] MAC Status             <40080083>
2025-04-01T04:19:44.099313+02:00 kernel: [ 7197.924206] PHY Status             <796d>
2025-04-01T04:19:44.099315+02:00 kernel: [ 7197.924206] PHY 1000BASE-T Status  <3800>
2025-04-01T04:19:44.099316+02:00 kernel: [ 7197.924206] PHY Extended Status    <3000>
2025-04-01T04:19:44.099317+02:00 kernel: [ 7197.924206] PCI Status             <10>

Rolled back to kernel Proxmox VE GNU/Linux, with Linux 6.8.12-8-pve (until the next crash) using:
Code:
# Set in /etc/default/grub
GRUB_DEFAULT="Advanced options for Proxmox VE GNU/Linux>Proxmox VE GNU/Linux, with Linux 6.8.12-8-pve"

# Update grub
PATH=$PATH:/usr/sbin
update-grub

# Reboot to apply
systemctl reboot
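After the reboot, a quick sanity check confirms the rollback took effect (a sketch; compare against whatever version you selected in GRUB_DEFAULT):

```shell
# Print the running kernel and compare it to the version you rolled back to.
booted="$(uname -r)"
if [ "$booted" = "6.8.12-8-pve" ]; then
  echo "rollback active"
else
  echo "still running $booted"
fi
```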
 
> "Same" Problem using kernel Proxmox VE GNU/Linux, with Linux 6.8.12-9-pve after monthly update [...] Back to kernel 6.8.12-8-pve (full post with logs quoted above)
That is a good quick fix; hopefully the next kernel update will fix the issue. I am on the fence about reverting the NIC settings, as the new settings have not caused any issues for me. We shall see. Thanks for posting and confirming that rolling back the kernel resolved your issue.

-thx
 
> "Same" Problem using kernel Proxmox VE GNU/Linux, with Linux 6.8.12-9-pve after monthly update [...] Back to kernel 6.8.12-8-pve (full post with logs quoted above)
Don't you need to run `proxmox-boot-tool refresh`?

From https://pve.proxmox.com/wiki/Host_Bootloader#sysboot_kernel_pin, run the command:

Code:
$ proxmox-boot-tool kernel pin 6.8.12-8-pve --next-boot
Set kernel '6.8.12-8-pve' in /etc/kernel/next-boot-pin.
Refresh the actual boot ESPs now? [yN] y
Running hook script 'proxmox-auto-removal'..
Running hook script 'zz-proxmox-boot'..
Re-executing '/etc/kernel/postinst.d/zz-proxmox-boot' in new private mount namespace..
Copying and configuring kernels on /dev/disk/by-uuid/EF83-8444
        Copying kernel and creating boot-entry for 5.15.158-2-pve
        Copying kernel and creating boot-entry for 6.8.12-8-pve
        Copying kernel and creating boot-entry for 6.8.12-9-pve
Copying and configuring kernels on /dev/disk/by-uuid/EFA0-85BF
        Copying kernel and creating boot-entry for 5.15.158-2-pve
        Copying kernel and creating boot-entry for 6.8.12-8-pve
        Copying kernel and creating boot

This will ensure the next boot is into 6.8.12-8-pve without pinning that kernel for the long term. You probably want this so that, when the next update comes along, you will install and use that kernel too. If you pin this kernel, you will need to unpin it when the next security update arrives.
 
> Don't you need to run `proxmox-boot-tool refresh`? [...] This will ensure the next boot is into 6.8.12-8-pve but without pinning this kernel for the long term. (full post quoted above)

Yes, but proxmox-boot-tool doesn't exist on all installations (like mine).
 
I have two Dell OptiPlex SFF boxes; on Sunday I updated both to the latest (6.8.12-9-pve), including all packages.
One started having the e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang: issue. If you ignore it for a couple of hours, eventually (I assume) the kernel kills the interface or it kills itself, which is super annoying: the interface is completely unresponsive and the box needs a hard reboot to come back.

The super weird part is that I have two boxes with the same interfaces, but only one has the issue. Both have Ethernet controller: Intel Corporation Ethernet Connection (7) I219-LM (rev 10) and the same driver and kernel modules (confirmed with lspci -v); the only differences are the "memory at" value and the IOMMU group.

On Monday I rolled back the apt packages I had updated and pinned the kernel to 6.8.12-3-pve, and I am still having issues. From manus' post I will try 6.8.12-8-pve.

If it helps, I can provide my lspci output and apt package list. Has anyone else found anything through troubleshooting, or does anyone know what could have caused it?

edit: updated formatting; I used markdown, which didn't translate

edit: there seems to be a bug report for this: https://bugzilla.proxmox.com/show_bug.cgi?id=6273 (found via this thread: https://forum.proxmox.com/threads/e1000e-eno1-detected-hardware-unit-hang.59928/post-759225)
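To compare notes on how often the NIC is actually hanging, a quick counter over the kernel log may help (a sketch: on the affected host you would pipe in `journalctl -k -b`; here two sample lines stand in for real log output):

```shell
# Count "Detected Hardware Unit Hang" lines on stdin.
count_hangs() {
  grep -c "Detected Hardware Unit Hang"
}

# Demo on two sample log lines; on a real host: journalctl -k -b | count_hangs
printf '%s\n' \
  "e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:" \
  "e1000e 0000:00:1f.6 eno1: Reset adapter unexpectedly" |
  count_hangs
# prints 1
```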
 
You could try the newer opt-in kernel; it seems the recent regression for the I219 cards doesn't affect it (see the linked bug report).
 
So 6.8.12-8-pve didn't fix the issue :(

Code:
[31245.931768] e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
                 TDH                  <81>
                 TDT                  <c0>
                 next_to_use          <c0>
                 next_to_clean        <80>
               buffer_info[next_to_clean]:
                 time_stamp           <101d8267a>
                 next_to_watch        <81>
                 jiffies              <101d831c3>
                 next_to_watch.status <0>
               MAC Status             <80083>
               PHY Status             <796d>
               PHY 1000BASE-T Status  <3800>
               PHY Extended Status    <3000>
               PCI Status             <10>
[31247.916805] e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
                 TDH                  <81>
                 TDT                  <c0>
                 next_to_use          <c0>
                 next_to_clean        <80>
               buffer_info[next_to_clean]:
                 time_stamp           <101d8267a>
                 next_to_watch        <81>
                 jiffies              <101d83984>
                 next_to_watch.status <0>
               MAC Status             <80083>
               PHY Status             <796d>
               PHY 1000BASE-T Status  <3800>
               PHY Extended Status    <3000>
               PCI Status             <10>
[31249.960960] e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
                 TDH                  <81>
                 TDT                  <c0>
                 next_to_use          <c0>
                 next_to_clean        <80>
               buffer_info[next_to_clean]:
                 time_stamp           <101d8267a>
                 next_to_watch        <81>
                 jiffies              <101d84180>
                 next_to_watch.status <0>
               MAC Status             <80083>
               PHY Status             <796d>
               PHY 1000BASE-T Status  <3800>
               PHY Extended Status    <3000>
               PCI Status             <10>
[31251.947341] e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
                 TDH                  <81>
                 TDT                  <c0>
                 next_to_use          <c0>
                 next_to_clean        <80>
               buffer_info[next_to_clean]:
                 time_stamp           <101d8267a>
                 next_to_watch        <81>
                 jiffies              <101d84942>
                 next_to_watch.status <0>
               MAC Status             <80083>
               PHY Status             <796d>
               PHY 1000BASE-T Status  <3800>
               PHY Extended Status    <3000>
               PCI Status             <10>
[31253.161613] e1000e 0000:00:1f.6 eno1: NETDEV WATCHDOG: CPU: 1: transmit queue 0 timed out 9433 ms
[31253.161626] e1000e 0000:00:1f.6 eno1: Reset adapter unexpectedly
[31253.251063] vmbr0: port 1(eno1) entered disabled state
[31256.877456] e1000e 0000:00:1f.6 eno1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None


I will try 6.11.11-2-pve from the linked bug report.
 
> So 6.8.12-8-pve didn't fix the issue :( [...] I will try 6.11.11-2-pve from the linked bug report. (full logs quoted above)
6.11.11-2 was installed around 6 hours ago and has already crashed. I am not sure whether the firmware is the cause, but the interface hang came much quicker than on previous kernel versions.
 
Sorry if the question is in the wrong place or not appropriate.

I've recently started experiencing this hang too, after upgrading from 7.x to 8.x with the latest kernel 6.8.12-9. I only notice the hang when Proxmox is running its backups to a NAS.

I found the e1000 bug Proxmox page and disabled offloading with:

ethtool -K eno1 gso off gro off tso off tx off rx off rxvlan off txvlan off sg off

But then when I tried a single restore, the network speed was barely anything... making me wonder whether by disabling offloading I've actually found another issue or introduced one.

The hardware is a NUC with 32GB of memory. I've also noticed these in the logs while running the restore of a single VM:

VM 101 qmp command failed - VM 101 qmp command 'query-proxmox-support' failed - unable to connect to VM 101 qmp socket - timeout after 51 retries.

I've reverted to having only a couple of things offloaded to see if it improves; this is my current ethtool -k:

Code:
ethtool -k eno1
Features for eno1:
rx-checksumming: on
tx-checksumming: on
    tx-checksum-ipv4: off [fixed]
    tx-checksum-ip-generic: on
    tx-checksum-ipv6: off [fixed]
    tx-checksum-fcoe-crc: off [fixed]
    tx-checksum-sctp: off [fixed]
scatter-gather: on
    tx-scatter-gather: on
    tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
    tx-tcp-segmentation: off
    tx-tcp-ecn-segmentation: off [fixed]
    tx-tcp-mangleid-segmentation: off
    tx-tcp6-segmentation: off
generic-segmentation-offload: off
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-gre-csum-segmentation: off [fixed]
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-udp_tnl-csum-segmentation: off [fixed]
tx-gso-partial: off [fixed]
tx-tunnel-remcsum-segmentation: off [fixed]
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off [fixed]
tx-gso-list: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: off [fixed]
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]
rx-gro-list: off
macsec-hw-offload: off [fixed]
rx-udp-gro-forwarding: off
hsr-tag-ins-offload: off [fixed]
hsr-tag-rm-offload: off [fixed]
hsr-fwd-offload: off [fixed]
hsr-dup-offload: off [fixed]

Any help or tips would be greatly appreciated.
 
The linked bug report now indicates that the opt-in kernel 6.11 also has the issue
I just added the latest 6.14.0-2-pve to it. :(

root@proxmoxnode:~# uname -a
Linux proxmoxnode 6.14.0-2-pve #1 SMP PREEMPT_DYNAMIC PMX 6.14.0-2 (2025-04-10T17:57Z) x86_64 GNU/Linux

root@proxmoxnode:~# lspci | grep net
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (17) I219-LM (rev 11)

Code:
Apr 12 04:10:33 proxmoxnode kernel: e1000e 0000:00:1f.6 eno2: Detected Hardware Unit Hang:
                 TDH                  <7b>
                 TDT                  <c5>
                 next_to_use          <c5>
                 next_to_clean        <7a>
               buffer_info[next_to_clean]:
                 time_stamp           <1007c2bbe>
                 next_to_watch        <7b>
                 jiffies              <1010ae300>
                 next_to_watch.status <0>
               MAC Status             <80083>
               PHY Status             <796d>
               PHY 1000BASE-T Status  <7800>
               PHY Extended Status    <3000>
               PCI Status             <10>
 
I was also plagued by the e1000e driver hang (card: I219-LM rev 10) since I updated my Proxmox 14 days ago. In my case, tso off gso off was the key. Thanks to all in this thread for pointing me to that solution. I have no performance issues, and for ~5 days no further hangs (before the workaround the issue occurred at least once every day).

In addition, I added a kind of watchdog so that I can at least reach the Proxmox server remotely again (my e1000e is the main network connection, so a hang is quite an issue if you are not in front of the machine). Here is the script, which I start via systemd:
Bash:
#!/bin/bash

error=0
myDate=""
logFile="/var/log/check_network.log"
message=""

while true
do
  # ping internal router to check network connection
  ret=$(/usr/bin/ping -c1 -W1 -q aaa.bbb.ccc.ddd >/dev/null; /usr/bin/echo $?)
  if [[ "0" != "$ret" ]]; then
    # one failure may be transient; wait 30s and try once more
    sleep 30
    ret=$(/usr/bin/ping -c1 -W1 -q aaa.bbb.ccc.ddd >/dev/null; /usr/bin/echo $?)
  fi
  if [[ "0" != "$ret" ]]; then
    # ping failed twice: remember the time of the first fault and
    # restart networking right away to recover from the NIC hang
    error=1
    if [[ "" == "$myDate" ]]; then
      myDate=$(/usr/bin/date +"%d.%m.%Y %H:%M")
    fi
    curDate=$(/usr/bin/date +"%d.%m.%Y %H:%M")
    echo "$curDate: Fault detected on $myDate! Restart Network!" >>$logFile
    /usr/bin/systemctl restart networking
    message="Network fail - $myDate"
  else
    # ping succeeded; once connectivity is back, report the outage
    if [[ "" != "$message" ]]; then
      # send the message via mail
      #/usr/local/bin/sendMessage.sh "$message"
      message=""
    fi
    error=0
    myDate=""
  fi
  /usr/bin/sleep 300
done

Just replace aaa.bbb.ccc.ddd with the IP address of, e.g., your internal router, which should always be reachable. The script checks every 5 minutes whether that device is reachable; if not, it retries once after a further 30 s, and if the ping fails again the networking service is restarted. A further improvement could be, e.g., to restart the VMs as well.
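The "start via systemd" part could look like this minimal unit (a sketch; the unit name and the script path /usr/local/bin/check_network.sh are assumptions, not from the post). Since the script is an endless loop, Type=simple with Restart=always fits:

```
# /etc/systemd/system/check-network.service (hypothetical name and path)
[Unit]
Description=Restart networking when the e1000e NIC hangs
After=network-online.target

[Service]
Type=simple
ExecStart=/usr/local/bin/check_network.sh
Restart=always

[Install]
WantedBy=multi-user.target
```

Enable it with `systemctl enable --now check-network.service`.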

Hope this helps someone during the time when the correct device configuration is still not found.

Cheers,
Joe
 
I also had this problem: my newly installed NUC i7 reported 'Detected Hardware Unit Hang' several times during the last 5-6 days.

NIC / Kernel
Code:
lspci -nn | grep Ethernet
00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (6) I219-V [8086:15be] (rev 30)

uname -a
Linux pve 6.8.12-9-pve #1 SMP PREEMPT_DYNAMIC PMX 6.8.12-9 (2025-03-16T19:18Z) x86_64 GNU/Linux

As MrJoe just wrote, I have now switched off only GSO and TSO in the /etc/network/interfaces file:
post-up ethtool -K eno1 gso off tso off
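For context, the line sits under the physical NIC's stanza in /etc/network/interfaces (a minimal sketch; eno1 matches this post, adjust the name for your NIC). The post-up command runs each time the interface comes up, so the setting survives reboots:

```
iface eno1 inet manual
        post-up ethtool -K eno1 gso off tso off
```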

ChatGPT spat out the following table as the 'difference' when comparing the two ethtool outputs.

NIC Offloading Feature Comparison

Code:
Feature                        Original (Default)   Partially Disabled
rx-checksumming                on                   on
tx-checksumming                on                   on
tx-checksum-ip-generic         on                   on
scatter-gather                 on                   on
tcp-segmentation-offload       on                   off
tx-tcp-segmentation            on                   off
tx-tcp6-segmentation           on                   off
generic-segmentation-offload   on                   off
generic-receive-offload        on                   on
rx-vlan-offload                on                   on
tx-vlan-offload                on                   on

These settings are now running and I will report back.
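Rather than asking ChatGPT to compare the two `ethtool -k` captures, a plain `diff` of the saved outputs shows exactly which features changed (a sketch; the file paths are examples, and two sample lines stand in for the full captures):

```shell
# Save `ethtool -k eno1` output before and after the post-up change, then diff.
# Two sample lines are used here in place of real captures.
printf 'tcp-segmentation-offload: on\n'  > /tmp/offloads-before.txt
printf 'tcp-segmentation-offload: off\n' > /tmp/offloads-after.txt
diff /tmp/offloads-before.txt /tmp/offloads-after.txt || true
```

Lines prefixed `<` show the old value, `>` the new one.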
 