One of my Proxmox hosts loses network connectivity for 5-6 seconds at seemingly random points throughout the day. I usually work on its VMs via RDP, and the sessions either freeze or disconnect.
While investigating, I noticed that pings both to and from the host also stop, so it looks like a host issue rather than a network one.
The timestamps seem arbitrary, though the drops possibly only happen while I'm using the host (although I think I've seen it happen overnight too):
Code:
Tue Dec 22 19:04:51 2020: lost
Tue Dec 22 19:05:01 2020: established
Tue Dec 22 19:19:41 2020: lost
Tue Dec 22 19:19:46 2020: established
Tue Dec 22 20:45:23 2020: lost
Tue Dec 22 20:45:28 2020: established
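In case it helps anyone compare against their own logs, here is a minimal sketch of how the outage lengths could be worked out from the lines above. The function name `outage_durations` is hypothetical (my own), and it assumes the log always has the `<timestamp>: lost` / `<timestamp>: established` shape shown here and that the system uses English day and month names.

```python
from datetime import datetime

# The connectivity log exactly as posted above.
LOG = """\
Tue Dec 22 19:04:51 2020: lost
Tue Dec 22 19:05:01 2020: established
Tue Dec 22 19:19:41 2020: lost
Tue Dec 22 19:19:46 2020: established
Tue Dec 22 20:45:23 2020: lost
Tue Dec 22 20:45:28 2020: established
"""


def outage_durations(log_text):
    """Pair each 'lost' line with the next 'established' line.

    Returns the length of each outage in seconds.
    """
    durations = []
    lost_at = None
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last ": " so the colons inside the time are kept.
        stamp, _, state = line.rpartition(": ")
        when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
        if state == "lost":
            lost_at = when
        elif state == "established" and lost_at is not None:
            durations.append((when - lost_at).total_seconds())
            lost_at = None
    return durations


print(outage_durations(LOG))
```

The resolution is only as good as whatever interval the monitoring loop polls at, so these numbers are rough.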
Looking at dmesg, I see the following at around the same time:
Code:
[778322.252079] e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
TDH <e7>
TDT <a4>
next_to_use <a4>
next_to_clean <e6>
buffer_info[next_to_clean]:
time_stamp <10b97e70a>
next_to_watch <e7>
jiffies <10b97efc8>
next_to_watch.status <0>
MAC Status <80083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[778323.499731] e1000e 0000:00:19.0 eth0: Reset adapter unexpectedly
[778323.499817] vmbr0: port 1(eth0) entered disabled state
[778327.309860] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[778327.309906] vmbr0: port 1(eth0) entered blocking state
[778327.309908] vmbr0: port 1(eth0) entered forwarding state
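The kernel timestamps suggest the adapter reset itself accounts for most of an outage. Below is a rough sketch that measures the gap between the hang being detected and the link coming back up. The helper name `hang_to_link_up` is hypothetical, and it only looks for the two message strings that appear in the excerpt above, so treat it as an illustration rather than a general dmesg parser.

```python
import re

# Only the timestamped lines from the dmesg excerpt above.
DMESG = """\
[778322.252079] e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
[778323.499731] e1000e 0000:00:19.0 eth0: Reset adapter unexpectedly
[778323.499817] vmbr0: port 1(eth0) entered disabled state
[778327.309860] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[778327.309906] vmbr0: port 1(eth0) entered blocking state
[778327.309908] vmbr0: port 1(eth0) entered forwarding state
"""

# Matches "[  seconds.micros] message".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def hang_to_link_up(dmesg_text):
    """Return seconds from the first 'Hardware Unit Hang' to the next 'Link is Up'.

    Returns None if either message is missing.
    """
    hang_at = None
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip the untimestamped register dump lines
        seconds, message = float(match.group(1)), match.group(2)
        if hang_at is None and "Detected Hardware Unit Hang" in message:
            hang_at = seconds
        elif hang_at is not None and "NIC Link is Up" in message:
            return seconds - hang_at
    return None


gap = hang_to_link_up(DMESG)
print(f"{gap:.2f} s between hang detection and link up")
```

For this excerpt the gap comes out at roughly five seconds, which lines up with the shorter outages in the connectivity log.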
My other Proxmox host (same hardware) doesn't have this issue.
Any thoughts are welcome.