Proxmox 6.8.12-9-pve kernel has introduced a problem with e1000e Driver and network connection lost after some hours

leandrosgf · May 1, 2025

tonarmshq474 said:
Yes, I just noticed that the issue has returned. But I don't have that version as a fallback, just .10, .9, and an older 5.x kernel. Will that be compatible?

You can try to pin the .8 version and see if the issue is still there. I was facing the same problem with only the .9 and .10 versions available, so I installed the .8 version manually and it's working fine (it seems to be compatible, but it needs to keep running for the next hours or days to confirm there are no compatibility issues).

By the way, even if there are no compatibility issues today, we have no guarantee that none will appear later, which is why we need someone to look into this ASAP.
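For anyone who wants to try the pin suggested above, here is a minimal sketch. It assumes a Proxmox VE 8 host that manages boot entries with `proxmox-boot-tool` and that the 6.8.12-8 kernel is still installed. The helper names `kernel_at_least` and `pin_fallback_kernel` are made up for illustration, and the 6.8.12-9 threshold is only what this thread reports, not a confirmed boundary.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when kernel release $1 is at or above $2.
# Relies on `sort -V` (version sort) being available.
kernel_at_least() {
    lowest=$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Hypothetical wrapper: pin a fallback kernel so it stays the boot default.
# Run as root on the Proxmox host; it is only defined here, not called.
pin_fallback_kernel() {
    proxmox-boot-tool kernel list       # show the installed kernels first
    proxmox-boot-tool kernel pin "$1"   # e.g. 6.8.12-8-pve
}

# Example (reboot afterwards):
#   if kernel_at_least "$(uname -r)" 6.8.12-9; then
#       pin_fallback_kernel 6.8.12-8-pve
#   fi
```

Once a fixed kernel shows up, `proxmox-boot-tool kernel unpin` should remove the pin again.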

Silas95 · May 2, 2025

Has anyone tried kernel 6.14, which is available in Proxmox 8.4 as an opt-in?

fabricionaweb · May 5, 2025

My NIC is an I219-LM in a ThinkCentre M920q, and I was facing the same issue. Rolling back to .8 seems to have fixed it.
That was a nightmare to debug lol

timnis · May 5, 2025

Silas95 said:
Has anyone tried kernel 6.14, which is available in Proxmox 8.4 as an opt-in?

I'm running Linux 6.14.0-2-pve at the moment, but it has the same problem as 6.8.12-10.

I have

Code:

00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (11) I219-LM
        DeviceName: Onboard Lan
        Subsystem: Hewlett-Packard Company Ethernet Connection (11) I219-LM
        Flags: bus master, fast devsel, latency 0, IRQ 125, IOMMU group 8
        Memory at e1200000 (32-bit, non-prefetchable) [size=128K]
        Capabilities: [c8] Power Management version 3
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Kernel driver in use: e1000e
        Kernel modules: e1000e

leandrosgf · May 5, 2025

timnis said:

I'm running Linux 6.14.0-2-pve at the moment, but it has the same problem as 6.8.12-10.

I have

Code:

00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (11) I219-LM
        DeviceName: Onboard Lan
        Subsystem: Hewlett-Packard Company Ethernet Connection (11) I219-LM
        Flags: bus master, fast devsel, latency 0, IRQ 125, IOMMU group 8
        Memory at e1200000 (32-bit, non-prefetchable) [size=128K]
        Capabilities: [c8] Power Management version 3
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Kernel driver in use: e1000e
        Kernel modules: e1000e

Yep, it seems to be something with all NICs using e1000e.
This is mine:

Code:

00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (17) I219-LM (rev 11)
        Subsystem: Dell Ethernet Connection (17) I219-LM
        Flags: bus master, fast devsel, latency 0, IRQ 124, IOMMU group 9
        Memory at 70500000 (32-bit, non-prefetchable) [size=128K]
        Capabilities: [c8] Power Management version 3
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Kernel driver in use: e1000e
        Kernel modules: e1000e

I have changed the subject of this topic to cover this better and maybe attract someone who knows what is happening and can suggest a fix or workaround for kernels 6.8.12-9 and above.

fabricionaweb · May 5, 2025

Just changing the kernel didn't solve it.

I found more threads, and now I'm trying ethtool to disable some of the offload features.

[SOLVED] Thread 'Intel NIC e1000e hardware unit hang'

Mar 5, 2022

This has been discussed many times, is there a solution/workaround yet, as the specified steps are not working for me?
I've the following in /etc/network/interfaces

Code:

auto lo
iface lo inet loopback

iface eno1 inet manual
        offload-gso off
        offload-gro off
        offload-tso off
        offload-rx off
        offload-tx off
        offload-rxvlan off
        offload-txvlan off
        offload-sg off
        offload-ufo off
        offload-lro off
auto vmbr0
iface vmbr0 inet static
        address xxx
        gateway xxx
        bridge-ports eno1
        bridge-stp off...

Honestly, I wasn't planning to use this NIC; it's so much trouble.

Markku · May 5, 2025

fabricionaweb said:
My NIC is an I219-LM in a ThinkCentre M920q, and I was facing the same issue. Rolling back to .8 seems to have fixed it.
That was a nightmare to debug lol

I have an M920q with the same onboard NIC, running 6.8.12-10, and haven't seen any issues yet (it has been running for 8-9 months, across a couple of PVE kernels).

I'm running Zabbix monitoring on that host+NIC so if there are problems with the network, I will surely see them.

fabricionaweb · May 6, 2025

Markku said:
I have an M920q with the same onboard NIC, running 6.8.12-10, and haven't seen any issues yet (it has been running for 8-9 months, across a couple of PVE kernels).

I'm running Zabbix monitoring on that host+NIC so if there are problems with the network, I will surely see them.

That's nice. We can see many reports here and in the other thread, so maybe it is something in the setup.
Here I have a bridge, and I pass the bridge down to a VM where I use it as WAN. For me it was most problematic when doing uploads.

After running `ethtool -K eno1 gso off gro off tso off tx off rx off rxvlan off txvlan off` to turn off (a lot of) features on kernel 6.8.12-8, it seems much more stable now. I pushed 100 GB over iperf and so far no disconnections \o/

I will now try with fewer features off (just `tso off gso off gro off`) and maybe retry with the newer kernel.
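When experimenting with different feature sets like this, it can help to switch the features off one at a time, so that a feature the NIC or kernel refuses to change does not hide the result for the others. A rough sketch follows; the function name `disable_offloads` and the `DRY_RUN` variable are made up for illustration, and only `ethtool -K <iface> <feature> off` is the real command.

```shell
#!/bin/sh
# Hypothetical helper: switch off each listed offload feature separately.
# With DRY_RUN set, it only prints the ethtool commands it would run.
disable_offloads() {
    iface=$1
    shift
    for feature in "$@"; do
        if [ -n "${DRY_RUN:-}" ]; then
            printf 'ethtool -K %s %s off\n' "$iface" "$feature"
        elif ! ethtool -K "$iface" "$feature" off; then
            printf 'could not change %s on %s\n' "$feature" "$iface" >&2
        fi
    done
}

# Dry run: prints one "ethtool -K eno1 <feature> off" line per feature.
DRY_RUN=1 disable_offloads eno1 tso gso gro
```

Without `DRY_RUN`, the same call applies the changes (as root) and reports any feature that could not be changed on stderr.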

timnis · May 8, 2025

fabricionaweb said:
I will now try with fewer features off (just `tso off gso off gro off`) and maybe retry with the newer kernel.

I tried it and it worked, or at least it has worked for 24 hours.

In the last 24 hours the connection has not dropped, and there are no problems in the system log either.

My kernel is Linux 6.14.0-2-pve and I used the following:

Code:

ethtool -K eno1 gso off gro off tso off

fabricionaweb · May 8, 2025

I had one issue here when disabling only those: after a while I noticed my link had dropped to 100 Mbps.
I'm testing with everything disabled now, because it could still be a faulty cable, for instance.

timnis · May 8, 2025

fabricionaweb said:
I had one issue here when disabling only those: after a while I noticed my link had dropped to 100 Mbps.
I'm testing with everything disabled now, because it could still be a faulty cable, for instance.

OK, for me it stayed on gigabit; I just checked from the MikroTik hAP ax^2.
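The negotiated speed can also be read on the Proxmox host itself, since plain `ethtool <iface>` prints a `Speed:` line. A small sketch, where `link_speed` is a made-up helper name that just extracts that line from the output:

```shell
#!/bin/sh
# Hypothetical filter: print the value of the "Speed:" line
# from `ethtool <iface>` output read on stdin.
link_speed() {
    sed -n 's/^[[:space:]]*Speed:[[:space:]]*//p'
}

# On the host:
#   ethtool eno1 | link_speed
```

A value of `100Mb/s` on a gigabit port would match the downgrade described above and may point at the cable or the switch port rather than the offload settings.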

leandrosgf · May 13, 2025

timnis said:
I tried it and it worked, or at least it has worked for 24 hours.
In the last 24 hours the connection has not dropped, and there are no problems in the system log either.

My kernel is Linux 6.14.0-2-pve and I used the following:

Code:

ethtool -K eno1 gso off gro off tso off

It seems promising. @timnis Do you know exactly what we are disabling with this? Is your system still running fine after some days? I will try it and update to the latest kernel to see what happens.

fabricionaweb · May 14, 2025

Sorry I just saw now that you tagged timnis, my bad.

leandrosgf said:
It seems promising. @timnis Do you know exactly what we are disabling with this? Is your system still running fine after some days? I will try it and update to the latest kernel to see what happens.

They are described on the man page https://linux.die.net/man/8/ethtool (-K).

rx on|off
Specifies whether RX checksumming should be enabled.
tx on|off
Specifies whether TX checksumming should be enabled.
sg on|off
Specifies whether scatter-gather should be enabled.
tso on|off
Specifies whether TCP segmentation offload should be enabled.
ufo on|off
Specifies whether UDP fragmentation offload should be enabled
gso on|off
Specifies whether generic segmentation offload should be enabled
gro on|off
Specifies whether generic receive offload should be enabled
lro on|off
Specifies whether large receive offload should be enabled
rxvlan on|off
Specifies whether RX VLAN acceleration should be enabled
txvlan on|off
Specifies whether TX VLAN acceleration should be enabled
ntuple on|off
Specifies whether Rx ntuple filters and actions should be enabled
rxhash on|off
Specifies whether receive hashing offload should be enabled

In my case, just disabling tso, ufo, gso, and gro did not work well, and I still got the hang logs:

May 14 12:32:02 proxmox kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:

But after disabling them all with `ethtool -K eno1 tso off ufo off gso off gro off rx off rxvlan off tx off txvlan off rxhash off`

it has worked much better for weeks. I added the post-up to /etc/network/interfaces:

Code:

iface eno1 inet manual
        post-up /sbin/ethtool -K eno1 tso off ufo off gso off gro off rx off rxvlan off tx off txvlan off rxhash off

timnis · May 16, 2025

For a while it looked like it worked, but no...

With the latest kernel and `ethtool -K eno1 gso off gro off tso off` it worked somehow. In syslog there was no error about eno1.
But I have a secondary PBS installed in a VM which pulls backups from the primary PBS (from another site) over an OpenZiti overlay network, and that worked really badly.

At the moment I'm back on kernel 6.8.12-8 and it works without any problem. When I have more time I will try to test again with ethtool using different parameters... Or hopefully a new kernel version fixes the problem.

MALEFX · May 21, 2025

news?

jacky0815 · May 23, 2025

Hello, I'm having the same problem. However, when I run ethtool and check afterwards, it doesn't seem to have applied the values. How can I check whether the values were applied correctly by ethtool?

Code:

root@pve1:~# ethtool -a ens2f0
Pause parameters for ens2f0:
Autonegotiate:  on
RX:             on
TX:             on
RX negotiated: on
TX negotiated: on

root@pve1:~# /sbin/ethtool -K ens2f0 tso off ufo off gso off gro off rx off rxvlan off tx off txvlan off rxhash off
root@pve1:~# ethtool -a ens2f0
Pause parameters for ens2f0:
Autonegotiate:  on
RX:             on
TX:             on
RX negotiated: on
TX negotiated: on

root@pve1:~#
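A note on the check above: `ethtool -a` prints the pause (flow control) parameters, which `-K` does not touch, so its output is expected to stay the same. The offload state is listed by the lowercase `ethtool -k`. A small sketch to narrow that output down to the features discussed in this thread; `show_offloads` is a made-up helper name, and the feature names are the ones `ethtool -k` normally prints:

```shell
#!/bin/sh
# Hypothetical filter: keep only the offload lines discussed in this thread.
# Reads `ethtool -k <iface>` output on stdin.
show_offloads() {
    grep -E '^(rx-checksumming|tx-checksumming|tcp-segmentation-offload|generic-segmentation-offload|generic-receive-offload|rx-vlan-offload|tx-vlan-offload|receive-hashing):'
}

# On the host:
#   ethtool -k ens2f0 | show_offloads
```

Each remaining line should end in `off` if the change was applied. Entries marked `[fixed]` cannot be changed on that NIC.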

Zaphood · May 24, 2025

Hi,

As these parameters don't survive a reboot, can somebody maybe provide an interfaces file that sets them? I am not enough of a Linux guy to do this on my own, and the results from AI are different every time I let it create such a file... not very reassuring that these will work ;-)

Thanks a lot
Frank

MarkusKo · May 24, 2025

@Zaphood here is an example

Code:

auto lo
iface lo inet loopback

auto enp0s25
iface enp0s25 inet manual
        post-up /usr/sbin/ethtool -K $IFACE gso off tso off gro off 2> /dev/null

auto vmbr0
iface vmbr0 inet static

look here or here

skello · May 25, 2025

I can confirm that going back to the 6.8.12-8 kernel only partially fixed the issue: the hardware hang on the NIC still appears to happen, but the kernel recovered from it on its own, at least so far. Let's see how it behaves going forward and whether I also need the ethtool fix.

UPDATE: Just downgrading the kernel didn't fix the issue. I had to use the ethtool workaround and add it to post-up. So far so good for 2 days.

From the system log:

Code:

May 25 13:04:47 pve kernel: e1000e 0000:00:19.0 enp0s25: Detected Hardware Unit Hang:
  TDH                  <93>
  TDT                  <c1>
  next_to_use          <c1>
  next_to_clean        <92>
buffer_info[next_to_clean]:
  time_stamp           <fffd886e>
  next_to_watch        <93>
  jiffies              <fffd91c0>
  next_to_watch.status <0>
MAC Status             <80083>
PHY Status             <796d>
PHY 1000BASE-T Status  <3800>
PHY Extended Status    <3000>
PCI Status             <10>
May 25 13:04:49 pve kernel: e1000e 0000:00:19.0 enp0s25: Detected Hardware Unit Hang:
  TDH                  <93>
  TDT                  <c1>
  next_to_use          <c1>
  next_to_clean        <92>
buffer_info[next_to_clean]:
  time_stamp           <fffd886e>
  next_to_watch        <93>
  jiffies              <fffd99c0>
  next_to_watch.status <0>
MAC Status             <80083>
PHY Status             <796d>
PHY 1000BASE-T Status  <3800>
PHY Extended Status    <3000>
PCI Status             <10>
May 25 13:04:51 pve kernel: e1000e 0000:00:19.0 enp0s25: Detected Hardware Unit Hang:
  TDH                  <93>
  TDT                  <c1>
  next_to_use          <c1>
  next_to_clean        <92>
buffer_info[next_to_clean]:
  time_stamp           <fffd886e>
  next_to_watch        <93>
  jiffies              <fffda180>
  next_to_watch.status <0>
MAC Status             <80083>
PHY Status             <796d>
PHY 1000BASE-T Status  <3800>
PHY Extended Status    <3000>
PCI Status             <10>
May 25 13:04:53 pve kernel: e1000e 0000:00:19.0 enp0s25: Detected Hardware Unit Hang:
  TDH                  <93>
  TDT                  <c1>
  next_to_use          <c1>
  next_to_clean        <92>
buffer_info[next_to_clean]:
  time_stamp           <fffd886e>
  next_to_watch        <93>
  jiffies              <fffda940>
  next_to_watch.status <0>
MAC Status             <80083>
PHY Status             <796d>
PHY 1000BASE-T Status  <3800>
PHY Extended Status    <3000>
PCI Status             <10>
May 25 13:04:55 pve kernel: e1000e 0000:00:19.0 enp0s25: Detected Hardware Unit Hang:
  TDH                  <93>
  TDT                  <c1>
  next_to_use          <c1>
  next_to_clean        <92>
buffer_info[next_to_clean]:
  time_stamp           <fffd886e>
  next_to_watch        <93>
  jiffies              <fffdb100>
  next_to_watch.status <0>
MAC Status             <80083>
PHY Status             <796d>
PHY 1000BASE-T Status  <3800>
PHY Extended Status    <3000>
PCI Status             <10>
May 25 13:04:56 pve kernel: e1000e 0000:00:19.0 enp0s25: NETDEV WATCHDOG: CPU: 1: transmit queue 0 timed out 9282 ms
May 25 13:04:56 pve kernel: e1000e 0000:00:19.0 enp0s25: Reset adapter unexpectedly
May 25 13:04:56 pve kernel: vmbr0: port 1(enp0s25) entered disabled state
May 25 13:05:00 pve kernel: e1000e 0000:00:19.0 enp0s25: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
May 25 13:05:00 pve kernel: vmbr0: port 1(enp0s25) entered blocking state
May 25 13:05:00 pve kernel: vmbr0: port 1(enp0s25) entered forwarding state
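To compare a workaround before and after, it may be easier to count these events than to read through the dumps. A minimal sketch, assuming the kernel log is read with `journalctl -k`; the helper names `count_hangs` and `count_resets` are made up, and the matched strings are taken from the log above:

```shell
#!/bin/sh
# Hypothetical helpers: count e1000e hang and reset messages in kernel
# log text read on stdin. `|| true` keeps the exit status at 0 when
# nothing matches (grep -c still prints 0 in that case).
count_hangs() {
    grep -c 'Detected Hardware Unit Hang' || true
}

count_resets() {
    grep -c 'Reset adapter unexpectedly' || true
}

# On the host, for the current boot:
#   journalctl -k -b | count_hangs
#   journalctl -k -b | count_resets
```

A hang count that stays at 0 over a few days of normal traffic would be a stronger sign that a setting helps than a single quiet afternoon.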

Zaphood · May 26, 2025

Well, would be great if somebody from the Proxmox Team would take the time to look into this... ?
