Proxmox Host loses connection on management nic, reboot fixes it

zehetal · Apr 2, 2026

We are running a simple Proxmox 8.4.x cluster with different hardware on the hosts. Since the last update we have had an issue with the management NIC on our primary Linux bridge, vmbr0.
This bridge is used for management and as the primary NIC for our VMs. After running without any problem for anywhere between several hours and several days, the NIC goes offline (not physically, as our network administrator argues). After rebooting the system, the NIC works again, usually for several days.
The host's system log contains this kernel message:

Apr 02 10:02:38 prox3 kernel: e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
TDH <9c>
TDT <d3>
next_to_use <d3>
next_to_clean <9b>
buffer_info[next_to_clean]:
time_stamp <113a03870>
next_to_watch <9c>
jiffies <113a30500>
next_to_watch.status <0>
MAC Status <80083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>

According to the vendor, Intel, the NIC seems to be physically OK.
Does anyone know of such an issue with Intel NICs on Proxmox 8.4.x?
All other network connections on the host (we have several) keep working. However, if we move vmbr0 to another NIC, a reboot no longer fixes the issue and the connection loss persists.

PaddraighNet · Apr 2, 2026

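That kernel message is an e1000e "Hardware Unit Hang": the transmit descriptor ring is stuck. TDH (0x9c) is the head pointer, where the hardware is working, and TDT (0xd3) is the tail, where the driver queues new packets. The head sitting behind the tail while next_to_watch.status is still 0 means the NIC has stopped consuming the queued descriptors. The driver detects this and tries to recover, but sometimes cannot without a full reset, which is why a reboot fixes it.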
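
To put a number on the TDH/TDT gap from the log, here is a minimal sketch that computes how many transmit descriptors are queued but unprocessed. The function name `tx_pending` is made up for illustration, and the ring size of 256 is an assumed default; `ethtool -g enp0s31f6` shows the real value on a given host.

```shell
# Number of TX descriptors queued but not yet processed by the NIC.
# Arguments: TDH and TDT as hex strings (as printed in the kernel log),
# plus an optional ring size (256 is an assumption, see `ethtool -g`).
tx_pending() {
    tdh=$((0x$1))
    tdt=$((0x$2))
    ring=${3:-256}
    # The ring wraps around, so take the difference modulo the ring size.
    echo $(( (tdt - tdh + ring) % ring ))
}

tx_pending 9c d3   # TDH/TDT from the log; prints 55
```

So at the moment of the hang, 55 descriptors were waiting behind the one the hardware never completed.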

This is a known issue with the Intel I219-V/LM series and has resurfaced with newer kernel versions. Two things worth trying:

First, disable Energy Efficient Ethernet (EEE) on that NIC - the I219 is notorious for going into a low-power state and not coming back cleanly:

ethtool --set-eee enp0s31f6 eee off

To make this persistent, add it as a post-up rule in /etc/network/interfaces under vmbr0:

post-up ethtool --set-eee enp0s31f6 eee off
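
After the next reboot or ifreload it is worth checking that the post-up rule actually took effect. A small sketch, assuming that `ethtool --show-eee` prints a line beginning with "EEE status:" (verify this against your ethtool version); the helper name `eee_is_off` is hypothetical.

```shell
# Reads `ethtool --show-eee <iface>` output on stdin and succeeds only if
# the status line reports EEE as disabled. The "EEE status:" line format
# is an assumption about ethtool's output, not something from this thread.
eee_is_off() {
    grep -q 'EEE status:[[:space:]]*disabled'
}

# Usage on the host, once the interface is up:
#   ethtool --show-eee enp0s31f6 | eee_is_off && echo "EEE is off"
```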

Second, if EEE is already off or that does not help, create /etc/modprobe.d/e1000e.conf with:

options e1000e SmartPowerDownEnable=0

Then run: update-initramfs -u -k all

To rule out interrupt coalescing as the cause, also try: ethtool -C enp0s31f6 rx-usecs 0 tx-usecs 0

A few useful commands to share the full picture if the problem continues: pveversion -v, uname -r, ethtool -i enp0s31f6 (shows driver and firmware versions), and dmesg | grep -E "e1000e|enp0s31f6" from around the time it hangs.
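
To gather all of that in one go, something like the following sketch could be used. It only wraps the four commands named above; the function name `collect_nic_diag` and the report layout are illustrative choices, and tools that are missing are skipped rather than treated as errors.

```shell
# Collects the diagnostics listed above into one report on stdout.
# Commands that are not installed or that fail are noted and skipped.
collect_nic_diag() {
    iface=$1
    for cmd in "pveversion -v" "uname -r" "ethtool -i $iface"; do
        echo "### $cmd"
        tool=${cmd%% *}
        if command -v "$tool" >/dev/null 2>&1; then
            # $cmd is intentionally unquoted so it splits into words.
            $cmd 2>&1 || echo "(command failed: $cmd)"
        else
            echo "(not installed: $tool)"
        fi
    done
    echo "### dmesg | grep -E \"e1000e|$iface\""
    dmesg 2>/dev/null | grep -E "e1000e|$iface" ||
        echo "(no matching kernel messages)"
}

# Usage on the host (as root, so dmesg is readable):
#   collect_nic_diag enp0s31f6 > nic-report.txt
```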

One thing does not quite fit: if moving vmbr0 to another NIC and rebooting still does not help, a single bad physical port is unlikely to be the whole story, and the driver or the traffic on the bridge may be involved. What NIC model is it?

zehetal · Apr 2, 2026

I will work through all of your suggestions. Since the error only shows up after several days, I will report back in a few days on whether they helped.

ce3rd · Apr 3, 2026

-> https://forum.proxmox.com/threads/e1000-driver-hang.58284/

zehetal · Apr 8, 2026

The issue keeps coming back.
Here is the additional information that was requested:

pveversion -v ->
proxmox-ve: 8.4.0 (running kernel: 6.8.12-20-pve)
pve-manager: 8.4.17 (running version: 8.4.17/c8c39014680186a7)
proxmox-kernel-helper: 8.1.4
proxmox-kernel-6.8.12-20-pve-signed: 6.8.12-20
proxmox-kernel-6.8: 6.8.12-20
proxmox-kernel-6.8.12-19-pve-signed: 6.8.12-19
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
proxmox-kernel-6.5.13-6-pve-signed: 6.5.13-6
proxmox-kernel-6.5: 6.5.13-6
proxmox-kernel-6.5.13-5-pve-signed: 6.5.13-5
proxmox-kernel-6.5.11-8-pve-signed: 6.5.11-8
ceph-fuse: 17.2.8-pve2
corosync: 3.1.9-pve1
criu: 3.17.1-2+deb12u2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libknet1: 1.30-pve2
libproxmox-acme-perl: 1.6.0
libproxmox-backup-qemu0: 1.5.2
libproxmox-rs-perl: 0.3.5
libpve-access-control: 8.2.2
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.1.2
libpve-cluster-perl: 8.1.2
libpve-common-perl: 8.3.7
libpve-guest-common-perl: 5.2.2
libpve-http-server-perl: 5.2.2
libpve-network-perl: 0.11.3
libpve-rs-perl: 0.9.4
libpve-storage-perl: 8.3.7
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-2
lxcfs: 6.0.0-pve2
novnc-pve: 1.6.0-2
proxmox-backup-client: 3.4.7-1
proxmox-backup-file-restore: 3.4.7-1
proxmox-backup-restore-image: 0.7.0
proxmox-firewall: 0.7.1
proxmox-kernel-helper: 8.1.4
proxmox-mail-forward: 0.3.3
proxmox-mini-journalreader: 1.5
proxmox-offline-mirror-helper: 0.6.8
proxmox-widget-toolkit: 4.3.16
pve-cluster: 8.1.2
pve-container: 5.3.3
pve-docs: 8.4.1
pve-edk2-firmware: 4.2025.02-4~bpo12+1
pve-esxi-import-tools: 0.7.4
pve-firewall: 5.1.2
pve-firmware: 3.16-3
pve-ha-manager: 4.0.7
pve-i18n: 3.4.5
pve-qemu-kvm: 9.2.0-7
pve-xtermjs: 5.5.0-2
qemu-server: 8.4.5
smartmontools: 7.3-pve1
spiceterm: 3.3.1
swtpm: 0.8.0+pve1
vncterm: 1.8.1
zfsutils-linux: 2.2.9-pve1

uname -r ->
6.8.12-20-pve

ethtool -i enp0s31f6 ->
driver: e1000e
version: 6.8.12-20-pve
firmware-version: 2.5-4
expansion-rom-version:
bus-info: 0000:00:1f.6
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

dmesg | grep -E "e1000e|enp0s31f6" ->
....
Apr 08 15:59:52 prox3 kernel: e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
Apr 08 15:59:54 prox3 kernel: e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
Apr 08 15:59:56 prox3 kernel: e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
Apr 08 15:59:58 prox3 kernel: e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
Apr 08 16:00:00 prox3 kernel: e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
...

The errors start at around 15:52:00 and repeat every two seconds. This coincides with a downsampling test of an InfluxDB instance running in a VM (Ubuntu 22.04 LTS with VirtIO drivers) on top of the Intel NIC.
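
To line the hangs up against the VM workload more precisely, the messages can be counted per minute. This is a rough sketch that assumes the default journal line format shown above (month, day, time, host); the function name `hangs_per_minute` is made up for illustration.

```shell
# Reads journal lines on stdin and prints "<count> <Mon DD HH:MM>" for
# every minute containing at least one "Detected Hardware Unit Hang",
# in order of first appearance.
hangs_per_minute() {
    awk '/Detected Hardware Unit Hang/ {
        key = $1 " " $2 " " substr($3, 1, 5)   # e.g. "Apr 08 15:59"
        if (!(key in count)) order[++n] = key
        count[key]++
    }
    END { for (i = 1; i <= n; i++) print count[order[i]], order[i] }'
}

# Usage on the host:
#   journalctl -k --since "2026-04-08 15:30" | hangs_per_minute
```

Fed the five lines quoted above, it reports four hangs in 15:59 and one in 16:00, which matches the two-second interval.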

zehetal · Apr 8, 2026

Seconds before the Intel NIC goes down, the following messages appear in the system log:

Apr 08 15:33:45 prox3 corosync[1353]: [KNET ] link: host: 1 link: 0 is down
Apr 08 15:33:45 prox3 corosync[1353]: [KNET ] link: host: 3 link: 0 is down
Apr 08 15:33:45 prox3 corosync[1353]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Apr 08 15:33:45 prox3 corosync[1353]: [KNET ] host: host: 1 has no active links
Apr 08 15:33:45 prox3 corosync[1353]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Apr 08 15:33:45 prox3 corosync[1353]: [KNET ] host: host: 3 has no active links
Apr 08 15:33:46 prox3 corosync[1353]: [TOTEM ] Token has not been received in 2737 ms
Apr 08 15:33:46 prox3 kernel: e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
