[SOLVED] Fixed: Intermittent network dropout with Intel I218-LM NIC -- e1000e hardware offloading bug

Borris1974 · Mar 24, 2026

Hardware and Software

Proxmox VE 9.1.6
Fujitsu T935
Intel Corporation Ethernet Connection (3) I218-LM (rev 03)
Kernel driver: e1000e
Interface name: nic0 (yours may differ -- typically eno1 or enp3s0)
Guest VM: Any (the fault is at the host NIC level)
Router: Asus XT9 with DHCP reservation

Symptoms

Network connectivity to and from all VMs drops intermittently -- no pattern, could be hours or days between occurrences
The Proxmox host itself also loses connectivity during the dropout
The physical ethernet link light stays green -- the interface shows as UP in the OS
Running ip neigh show dev vmbr0 shows ARP entries going STALE or DELAY for the router
No errors visible in the Proxmox web UI
Running ping to the router or any external host fails silently
SSH sessions drop, Home Assistant becomes unreachable, all VM traffic stops
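
To record the ARP symptom while a dropout is in progress, a small helper can pull the router's neighbour state out of the ip neigh output. This is a minimal sketch: the function name neigh_state and the router address 192.168.1.1 are assumptions for illustration, not details from the setup above.

```shell
# neigh_state: print the neighbour (ARP) state for one IP address.
# Reads `ip neigh show` output on stdin; the state is the last field of each line.
# The function name is hypothetical.
neigh_state() {
    awk -v ip="$1" '$1 == ip { print $NF }'
}

# Usage (192.168.1.1 is an assumed router address -- substitute your own):
#   ip neigh show dev vmbr0 | neigh_state 192.168.1.1
```

Logging this value with a timestamp during a dropout gives raw data to compare against the kernel log later.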

Temporary fix (what most people discover first)
Physically unplug and replug the ethernet cable. Connectivity restores within seconds.
This works because unplugging the cable forces a hardware reset of the NIC, clearing the hung state. It is not a fix -- the dropout will return.

Root cause
The Intel I218-LM NIC uses the e1000e Linux kernel driver. There is a well-documented bug where hardware offloading features cause the NIC to enter a silent hang state. The kernel logs this as:

e1000e 0000:00:19.0 nic0: Detected Hardware Unit Hang

Check for this after a dropout with:
dmesg | grep -i "hang\|e1000e" | tail -20

The NIC continues to report itself as UP and the link light stays on, which makes this extremely difficult to diagnose. The ARP table going stale is a symptom of the underlying NIC hang, not the root cause.

This affects multiple Intel NIC models using the e1000e driver, including I217-LM, I218-LM, I219-LM and I219-V. It is not specific to any particular VM or workload.
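
If you want a number rather than a page of log lines, the dmesg check above can be wrapped in a small counter. This is only a sketch; the function name count_hangs is made up, and the previous-boot example assumes the systemd journal is persistent on your host.

```shell
# count_hangs: count "Detected Hardware Unit Hang" lines in kernel log text on stdin.
# The function name is hypothetical.
count_hangs() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps that from
    # aborting scripts that run under "set -e".
    grep -c 'Detected Hardware Unit Hang' || true
}

# Usage:
#   dmesg | count_hangs                 # current boot
#   journalctl -k -b -1 | count_hangs   # previous boot (assumes a persistent journal)
```

A count above zero after a dropout suggests the hang described here; a count of zero suggests looking elsewhere first.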

Permanent fix

Step 1 -- Identify your physical NIC name
lspci | grep -i ethernet
ip link show

Note the interface that is the bridge port for vmbr0.
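
If you are unsure which interface is the bridge port, it can be read straight from the config file. The helper below is a rough sketch that assumes the usual ifupdown layout (an iface line followed by indented options); the function name bridge_ports is hypothetical.

```shell
# bridge_ports: print the bridge-ports entries of one bridge stanza.
#   $1 = bridge name (e.g. vmbr0)
#   $2 = interfaces file (defaults to /etc/network/interfaces)
# The function name is hypothetical; assumes the usual "iface ..." stanza layout.
bridge_ports() {
    awk -v br="$1" '
        $1 == "iface"                     { in_stanza = ($2 == br) }
        in_stanza && $1 == "bridge-ports" { for (i = 2; i <= NF; i++) print $i }
    ' "${2:-/etc/network/interfaces}"
}

# Usage:
#   bridge_ports vmbr0
#   ethtool -i "$(bridge_ports vmbr0 | head -n 1)"   # the "driver:" line should say e1000e
```

On a running host, ls /sys/class/net/vmbr0/brif shows the same information live, although it will also list the tap and veth devices of running guests.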

Step 2 -- Check offloading is currently enabled (confirms the issue applies to you)
ethtool -k nic0 | grep -E 'tcp-seg|generic-seg|generic-receive|rx-vlan|tx-vlan|scatter'

If any entries show on, proceed.

Step 3 -- Disable offloading immediately (temporary, to test)
Replace nic0 with your interface name:

ethtool -K nic0 gso off tso off rxvlan off txvlan off gro off tx off rx off sg off

Step 4 -- Make it permanent
Edit /etc/network/interfaces:
nano /etc/network/interfaces

Add a post-up line to your physical NIC stanza. The post-up method is required -- the offload-* directives do not reliably apply on boot:

Code:

auto lo
iface lo inet loopback

iface nic0 inet manual
    post-up ethtool -K nic0 gso off tso off rxvlan off txvlan off gro off tx off rx off sg off

auto vmbr0
iface vmbr0 inet dhcp
    bridge-ports nic0
    bridge-stp off
    bridge-fd 0

Reload networking:
ifreload -a

Verify offloading is off after reload:
ethtool -k nic0 | grep -E 'tcp-seg|generic-seg|generic-receive|rx-vlan|tx-vlan|scatter'

All entries should show off.
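
To make this check scriptable (for example after each reboot), the same filter can be turned into a pass/fail test. This is a sketch, not a tested monitoring tool: the function name offloads_on is an assumption, and it only looks at the same feature names as the grep above.

```shell
# offloads_on: list the checked offload features still reported as "on".
# Reads `ethtool -k` output on stdin. Exit status is 1 if any feature is on,
# 0 if all of them are off. The function name is hypothetical.
offloads_on() {
    awk '
        /tcp-seg|generic-seg|generic-receive|rx-vlan|tx-vlan|scatter/ && $2 == "on" {
            sub(/:$/, "", $1)   # drop the trailing colon from the feature name
            print $1
            found = 1
        }
        END { exit (found ? 1 : 0) }
    '
}

# Usage (replace nic0 with your interface name):
#   ethtool -k nic0 | offloads_on && echo "all checked offloads are off"
```

If the function prints any feature names after a reboot, the post-up line did not run or did not apply to that interface.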

Step 5 -- Fix invalid ARP responses (secondary fix)
Proxmox can also send ARP replies on the wrong interface, confusing the router. Prevent this permanently:

echo -e "net.ipv4.conf.all.arp_ignore=2\nnet.ipv4.conf.all.arp_announce=2" | tee /etc/sysctl.d/99-proxmox-arp.conf
sysctl -p /etc/sysctl.d/99-proxmox-arp.conf
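
To confirm that both values are live after applying them (or after a reboot), the sysctl output can be checked with a small helper. This is a minimal sketch; the function name arp_sysctls_ok is an assumption, and it expects the default "key = value" output format of sysctl.

```shell
# arp_sysctls_ok: succeed only if both ARP settings from this step read back as 2.
# Reads `sysctl` output ("key = value" lines) on stdin. The function name is hypothetical.
arp_sysctls_ok() {
    awk '
        $1 == "net.ipv4.conf.all.arp_ignore"   && $3 == "2" { ignore = 1 }
        $1 == "net.ipv4.conf.all.arp_announce" && $3 == "2" { announce = 1 }
        END { exit ((ignore && announce) ? 0 : 1) }
    '
}

# Usage:
#   sysctl net.ipv4.conf.all.arp_ignore net.ipv4.conf.all.arp_announce \
#       | arp_sysctls_ok && echo "ARP settings applied"
```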

Result
No further network dropouts. The fix survives reboots. No performance impact was observed on a home lab running Home Assistant and other lightweight VMs.

Important AI tools warning: conflicting and wrong diagnoses -- use multiple tools and verify everything

This section is worth reading before you spend hours chasing the wrong fix.

When the symptoms of this fault were put into Microsoft Copilot, it was adamant that the router was the cause. It pointed to the DHCP reservation, the ASUS firmware, and ARP staleness on the router side as the fault. Even after being told that the problem had been fixed by the steps above, Copilot continued to insist the router was at fault and suggested router-side fixes.

This is a known risk with AI assistants -- they can latch onto a plausible-sounding diagnosis early and then defend it even when contradicting evidence is provided. In networking faults especially, where symptoms like ARP staleness and dropouts can have many different root causes, this kind of confirmation bias in an AI response can send you in completely the wrong direction and waste a significant amount of time.

The fault was ultimately diagnosed correctly by using Claude (Anthropic) to analyse the raw ARP table output, the interface names, the NIC hardware details, and the specific symptom of cable-replug restoring connectivity. That combination of clues pointed specifically to the e1000e hardware offloading hang rather than any router or ARP configuration issue.

The lesson here is practical:

Do not rely on a single AI tool for complex technical diagnosis
Provide raw command output rather than describing symptoms in plain language -- AI tools reason much more accurately from actual data
If an AI diagnosis does not match what you are observing, try a different tool with the same data
Cross-reference any AI suggestion against the relevant community forums (in this case the Proxmox forum, where this exact bug is documented across multiple threads)
AI tools are genuinely useful for this kind of diagnosis but they are not infallible -- treat their output as a starting point for investigation, not a final answer

In this case the correct fix was found, confirmed, and is now running stably. But the wrong diagnosis from one AI tool could easily have led to unnecessary router replacements, firmware changes, or hours of network reconfiguration that would have had no effect whatsoever.

jsabater · Apr 30, 2026

Thanks for your post, @Borris1974. It was very helpful.

For future reference, for our Proxmox 7.4 cluster using dedicated servers and vSwitches on Hetzner, with multiple vmbrX bridges, we used the following /etc/sysctl.d/99-proxmox-arp.conf:

Code:

# Only reply to ARP requests on the interface that owns the target IP
net.ipv4.conf.all.arp_ignore=1
# Always use the source IP that belongs to the outgoing interface when sending ARP requests
net.ipv4.conf.all.arp_announce=2

Our /etc/network/interfaces configuration used the same post-up ethtool approach on the physical NIC:

Code:

source /etc/network/interfaces.d/*

auto lo
iface lo inet loopback

auto eno1
iface eno1 inet static
    hwaddress aa:bb:cc:dd:ee:ff
    address aaa.bb.cc.dd/26
    gateway aaa.bb.cc.ee
    pointopoint aaa.bb.cc.ee
    post-up ethtool -K eno1 tso off gso off gro off rx off tx off rxvlan off txvlan off sg off
# Proxmox host

iface eno1.4001 inet manual
    mtu 1400

auto vmbr4001
iface vmbr4001 inet manual
    bridge-ports eno1.4001
    bridge-stp off
    bridge-fd 0
    mtu 1400
    bridge-disable-mac-learning 1
# Proxmox guests public network

iface eno1.4002 inet manual
    mtu 1400

auto vmbr4002
iface vmbr4002 inet static
    bridge-ports eno1.4002
    bridge-stp off
    bridge-fd 0
    mtu 1400
# Proxmox guests private network

auto eno1.4003
iface eno1.4003 inet static
    address 192.168.1.19/24
    vlan-raw-device eno1
    mtu 1400
# Proxmox hosts private network
