Single port on NIC randomly disconnecting

Sakreton

New Member
Sep 21, 2024
Hey,

I am running a 3-node PVE + Ceph cluster on 3x Minisforum MS-A2.
The nodes are running identical hardware.

On one of the 3 nodes, at seemingly random times, one 2.5Gbit port on an RTL8125 2.5GbE Controller (rev 05) NIC disconnects.
The affected port is used for guest traffic, which is probably the only reason I even noticed it in the first place lol.
None of the other NICs (2x Ceph) are affected, not even the other port on this NIC (used for Corosync to a different switch).

This has also never occurred on any other node.

I have tried a different switch port and an entirely different switch, so it's the host NIC, not the switch side.
I have also, obviously, rebooted this node.

I am running PVE 9.2.2. The problem has persisted for some time already, because troubleshooting is a bit difficult when I can't force the problem to appear.

Any ideas before I approach Minisforum? Any help is greatly appreciated! Thanks in advance, and for reading this! :>

Proxmox Log:
May 27 20:07:44 prox-ganymede kernel: r8169 0000:03:00.0 nic0: Link is Down
May 27 20:07:44 prox-ganymede kernel: vmbr0: port 1(nic0) entered disabled state
May 27 20:07:52 prox-ganymede kernel: r8169 0000:03:00.0 nic0: Link is Up - 2.5Gbps/Full - flow control off
May 27 20:07:52 prox-ganymede kernel: vmbr0: port 1(nic0) entered blocking state
May 27 20:07:52 prox-ganymede kernel: vmbr0: port 1(nic0) entered forwarding state
May 27 20:07:54 prox-ganymede kernel: r8169 0000:03:00.0 nic0: Link is Down
May 27 20:07:54 prox-ganymede kernel: vmbr0: port 1(nic0) entered disabled state
May 27 20:07:55 prox-ganymede pvestatd[1566]: PBS_VM: error fetching datastores - 500 Can't connect to PBS:8007 (Connection timed out)
May 27 20:07:55 prox-ganymede pvestatd[1566]: status update time (7.093 seconds)
May 27 20:07:56 prox-ganymede kernel: r8169 0000:03:00.0 nic0: Link is Up - 1Gbps/Full - flow control off
May 27 20:07:56 prox-ganymede kernel: vmbr0: port 1(nic0) entered blocking state
May 27 20:07:56 prox-ganymede kernel: vmbr0: port 1(nic0) entered forwarding state
May 27 20:08:00 prox-ganymede kernel: r8169 0000:03:00.0 nic0: Link is Down
May 27 20:08:00 prox-ganymede kernel: vmbr0: port 1(nic0) entered disabled state
May 27 20:08:04 prox-ganymede kernel: r8169 0000:03:00.0 nic0: Link is Up - 2.5Gbps/Full - flow control off
May 27 20:08:04 prox-ganymede kernel: vmbr0: port 1(nic0) entered blocking state
May 27 20:08:04 prox-ganymede kernel: vmbr0: port 1(nic0) entered forwarding state

Switch Log:
2026-05-27 20:07:41 interface,info ether3(TO_GANYMEDE) link down
2026-05-27 20:07:49 interface,info ether3(TO_GANYMEDE) link up (speed 2.5G, full duplex)
2026-05-27 20:07:51 interface,info ether3(TO_GANYMEDE) link down
2026-05-27 20:07:52 interface,info ether3(TO_GANYMEDE) link up (speed 2.5G, full duplex)
2026-05-27 20:07:55 interface,info ether3(TO_GANYMEDE) link down
2026-05-27 20:08:02 interface,info ether3(TO_GANYMEDE) link up (speed 2.5G, full duplex)
 
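The r8169 network driver is notorious on Linux and Proxmox for causing exactly this type of link flapping under heavy load or bridging.

Install a different driver. Note that the RTL8125 is handled by Realtek's out-of-tree r8125 driver (r8168-dkms only covers the older RTL8168/8111 chips). As far as I know, Debian packages it as r8125-dkms in the non-free component, so that needs to be enabled:

apt update
apt install -y pve-headers pve-kernel-helper pve-edk2-firmware dkms r8125-dkms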
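
Before and after swapping the driver, it may be worth confirming which kernel module is actually bound to the port. A minimal sketch, assuming `ethtool` is installed and the interface is still named `nic0` as in the log above; `bound_driver` is a hypothetical helper name, not part of any tool:

```shell
#!/bin/sh
# Print the driver name from `ethtool -i` output read on stdin.
# bound_driver is a hypothetical helper for illustration only.
bound_driver() {
    awk -F': ' '$1 == "driver" { print $2 }'
}

# Usage on the affected node (interface name and PCI address taken
# from the kernel log in the first post):
#   ethtool -i nic0 | bound_driver
#   lspci -k -s 03:00.0    # also lists "Kernel driver in use"
```

If this still reports r8169 after installing the DKMS package and rebooting, the new module is probably not the one bound to the device.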
 
If it's just one out of three, you might have a bad patch cable. Did you also try another patch cable?
Also, Minisforum boxes are known for thermal issues. Maybe this particular system gets hotter than the other two?
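
To compare before and after a cable swap, it could help to count the flaps and the negotiated speeds from the kernel log. In the excerpt above, one renegotiation came up at 1Gbps instead of 2.5Gbps; a downshift like that can hint at a marginal cable or connector, though it is not conclusive. A rough sketch, assuming the r8169 message format shown in the Proxmox log; `link_summary` is a hypothetical helper name:

```shell
#!/bin/sh
# Summarize link events for one interface from kernel log lines on stdin.
# link_summary is a hypothetical helper; it only matches the
# "<iface>: Link is Down" / "<iface>: Link is Up - <speed>" lines
# as they appear in the log excerpt above.
link_summary() {
    awk -v ifc="$1" '
        index($0, ifc ": Link is Down") { down++ }
        index($0, ifc ": Link is Up - ") {
            # The speed is the first token after "Link is Up - "
            rest = substr($0, index($0, "Link is Up - ") + 13)
            split(rest, parts, " ")
            up[parts[1]]++
        }
        END {
            printf "down=%d\n", down
            for (s in up) printf "up[%s]=%d\n", s, up[s]
        }'
}

# Usage on the affected node:
#   journalctl -k --since "2 days ago" | link_summary nic0
```

Running it once with the old cable and once with the new one over a similar time window gives a simple comparison of how often the link dropped.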
 
Thanks for the possible solutions!
I am trying them now, and I will update this post as soon as I know whether something worked^^