r8168/9 disconnect after PVE8 upgrade

dannybridi

New Member
Sep 10, 2022
Hi everyone,

I've seen many with similar issues, but no resolution that I could apply (at least I wasn't able to).

I have a test PC that was running fine with PVE7. After the upgrade to PVE8, the network connection stops a few hours after each reboot. The screen then fills with messages like:
[xxxxxx.xxxxxx] r8169 0000:01:00.0 enp1s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10)

All I could do is reboot to reset the connection.
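
For anyone wanting to confirm they are hitting the same message, here is a minimal sketch that counts matching lines in the kernel log. The helper name `count_r8169_timeouts` is made up for illustration; it only assumes the message looks like the line quoted above.

```shell
#!/bin/sh
# Count kernel log lines that carry the r8169 "rtl_ephyar_cond" message.
# Reads log text on stdin, so it works with dmesg or journalctl output.
# (count_r8169_timeouts is a hypothetical helper name, not a Proxmox tool.)
count_r8169_timeouts() {
    # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
    # keeps the function from reporting that as a failure.
    grep -c 'r8169 .*rtl_ephyar_cond' || true
}

# Typical use on the affected host (as root):
#   dmesg | count_r8169_timeouts
#   journalctl -k -b -1 | count_r8169_timeouts   # previous boot, needs a persistent journal
```

A non-zero count after a dropout suggests the same symptom as in this thread; it does not by itself say what the cause is.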

lspci shows that I have a Realtek RTL8111/8168/8411:

root@pve3:~# lspci -nnvmm | egrep -A 6 -B 1 -i 'network|ethernet'
Slot: 01:00.0
Class: Ethernet controller [0200]
Vendor: Realtek Semiconductor Co., Ltd. [10ec]
Device: RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [8168]
SVendor: Dell [1028]
SDevice: RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [080c]
Rev: 15
ProgIf: 00
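
The output above identifies the chip but not which kernel module is driving it. As a rough sketch, `lspci -k` prints a "Kernel driver in use:" line per device, and the hypothetical helper `bound_driver` below just extracts it. The slot `01:00.0` is taken from the output above; adjust it for other machines.

```shell
#!/bin/sh
# Print the kernel driver bound to a device, given `lspci -k` output on stdin.
# (bound_driver is an illustrative helper name.)
bound_driver() {
    sed -n 's/^[[:space:]]*Kernel driver in use:[[:space:]]*//p'
}

# Typical use on the host:
#   lspci -k -s 01:00.0 | bound_driver
```

On a stock PVE8 install this would be expected to print r8169; after switching to the DKMS driver it should print r8168.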



I first re-installed the server (from a PVE8 ISO) thinking that the upgrade went wrong, but that did not help.

I followed some threads where I installed the r8168-dkms package, but all that did was completely disconnect the server from the network. So, I re-installed again (I'm not a Linux expert :) ).

Any help in resolving this issue would be appreciated.

Thanks
 
Hello
I also have exactly the same problem.
Disabling "Wake on LAN" in the BIOS helped for a while: the server ran much longer, but it still crashed every few days.
An IT specialist recommended this method (version B):

https://community.hetzner.com/tutorials/installing-the-r8168-driver

...however, something went wrong and Proxmox disabled the network card completely. Despite trying, I was unable to reconnect :/
Currently, I managed to install HA without virtualization and I'm testing... but I'd also like to know how to easily solve the problem with "r8169"
Regards
 
Thanks @Tikos.
Strangely enough, the server has not shown any issues since the last re-install, 2.5 days ago! The difference, which I should have mentioned before, is that after the last install I did not join this node to a cluster. So, it’s a standalone node. I will wait another day or so before joining it to a cluster. I wonder if that makes any difference.
 
Update: as mentioned above, I left the server non-clustered for a week without any issues. I joined it again yesterday to a cluster (a total of 3 nodes). Sure enough, the server lost network connectivity some time after that. This seems to indicate that the issue is also related to clustering.
 
I haven't had this problem since I installed Home Assistant without Proxmox. However, this still does not solve the problem with Proxmox.
 
Having the same r8169 issue. Non-clustered node. Remoting into MS Server2019 guest with remote desktop appears to trigger it. So far connecting through QEMU hasn't.

Update: It just failed in QEMU as well
 
Same problem here with r8169 and PVE 8.0.

Randomly losing connection, and installing the r8168-dkms package doesn't solve the issue; as dannybridi said, it breaks the connection entirely :)
 
I'm seeing the issue in non-clustered nodes. Clustering was so unstable in our 20 node environment that we took the opportunity to break it while doing the 7 to 8 upgrade.
 
We only saw this issue on our 6 Dell OptiPlex 3060s. At the moment, 2 of these are on the r8169 driver and have yet to drop the NIC. The remaining 4 are on the r8168 driver. Unfortunately, I discovered today that 2 of those 4 were offline; the other 2 have been solid for almost 2 weeks. The 2 that were offline were also showing an ACPI error. These are the steps I used to roll back to the r8168 driver. Apologies to the original poster for not crediting them; I can't find their website:



add to /etc/apt/sources.list:

deb http://ftp.debian.org/debian bookworm main contrib non-free non-free-firmware
deb http://ftp.debian.org/debian bookworm-updates main contrib non-free non-free-firmware

apt update
apt dist-upgrade
apt install pve-headers
apt install r8168-dkms
echo blacklist r8169 >> /etc/modprobe.d/blacklist-r8169.conf
ethtool -i enp1s0
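
One caveat with the steps above: `echo ... >>` appends a new line every time it is run. A minimal sketch of an idempotent variant follows. The helper name `ensure_blacklist` is made up, and the `update-initramfs -u` step in the comments is an assumption on my part (commonly suggested on Debian so the blacklist also applies at early boot), not something from the original steps.

```shell
#!/bin/sh
# Append "blacklist <module>" to a modprobe.d file only if that exact line
# is not already present. (ensure_blacklist is a hypothetical helper.)
ensure_blacklist() {
    module=$1
    conf=$2
    # -x: match the whole line, -q: quiet, -s: stay silent if the file is missing
    if ! grep -qsx "blacklist $module" "$conf"; then
        echo "blacklist $module" >> "$conf"
    fi
}

# Typical use on the host (as root):
#   ensure_blacklist r8169 /etc/modprobe.d/blacklist-r8169.conf
#   update-initramfs -u    # assumption: rebuild the initramfs so the blacklist applies early
#   reboot
#   ethtool -i enp1s0      # the "driver:" line should now read r8168
```

Have console access ready before rebooting, since several people in this thread lost the network entirely after installing r8168-dkms.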
 
I don't want to claim victory yet but I have solved my problem (at least for now) by disabling the power options in my BIOS:

photo_2023-10-12_01-11-47.jpg
 
Hello. I'm having the same problem as you all: after some minutes to hours of operation I can no longer connect to the PVE instance or its VMs. So I decided to downgrade from PVE 8 to PVE 7 with the original r8169 driver, without attempting any command-line repairs, and now everything is working. It is a pity that version 8 is buggy on my Dell MFF 3000 Series (i5-9500T). Anyway, I will stay tuned to this forum conversation for when someone really finds the
 
Thank you for the rundown.

For folks like me, that installed fresh and without a subscription, you have to add/edit sources.list:

# Proxmox VE pve-no-subscription repository provided by proxmox.com,
# NOT recommended for production use
deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription

Otherwise the
Code:
apt install pve-headers
will not work!
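
A quick way to check whether the no-subscription repo is already active before running the install, as a rough sketch. The helper name `has_pve_no_subscription` is made up, and it only looks at one-line `deb ...` entries, so it assumes the classic `.list` format shown above.

```shell
#!/bin/sh
# Succeed if any of the given APT source files has an active (uncommented)
# pve-no-subscription entry. (has_pve_no_subscription is a hypothetical helper.)
has_pve_no_subscription() {
    grep -Eqs '^[[:space:]]*deb[[:space:]].*[[:space:]]pve-no-subscription([[:space:]]|$)' "$@"
}

# Typical use on the host:
#   if has_pve_no_subscription /etc/apt/sources.list /etc/apt/sources.list.d/*.list; then
#       echo "pve-no-subscription repo is configured"
#   else
#       echo "pve-no-subscription repo is missing"
#   fi
```

Remember to run `apt update` after editing the sources, otherwise apt still won't see the package.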
 
Here's what ours looks like. We had to add the non-free-firmware repos for the pve-headers to install.

/etc/apt/sources.list
deb http://ftp.us.debian.org/debian bookworm main contrib
deb http://ftp.debian.org/debian bookworm main contrib

deb http://ftp.us.debian.org/debian bookworm-updates main contrib
deb http://ftp.debian.org/debian bookworm-updates main contrib

# security updates
deb http://security.debian.org bookworm-security main contrib
deb http://security.debian.org/debian-security bookworm-security main contrib

# not for production
deb http://download.proxmox.com/debian bookworm pve-no-subscription
deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription

#non-free-firmware
deb http://ftp.debian.org/debian bookworm main contrib non-free non-free-firmware
deb http://ftp.debian.org/debian bookworm-updates main contrib non-free non-free-firmware



/etc/apt/sources.list.d/pve-enterprise.list
#deb https://enterprise.proxmox.com/debian/pve bookworm pve-enterprise


/etc/apt/sources.list.d/ceph.list
#deb https://enterprise.proxmox.com/debian/ceph-quincy bookworm enterprise