PVE8 "NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out" fix (to some extent)

tomtom13

Well-Known Member
Dec 28, 2016
Hi,
If, after an upgrade from 7 to 8 or a fresh install of 8, your interface seems to die randomly (after minutes to hours), and the only indication you see in your syslog is:
Code:
NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
then you have possibly got a Realtek network interface that is affected by this bug.
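To see how often the timeout has already happened, the kernel log can be counted for this message. This is only a minimal sketch: the function name `count_watchdog_timeouts` is made up for illustration, and it assumes the log text is piped in on stdin.

```shell
# Hypothetical helper: count NETDEV WATCHDOG transmit-queue timeouts reported
# for the r8169 driver in kernel log text read from stdin.
count_watchdog_timeouts() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are no matches, so "|| true" keeps the function's status clean.
    grep -c 'NETDEV WATCHDOG: .*(r8169): transmit queue [0-9]* timed out' || true
}
```

For example, `journalctl -k | count_watchdog_timeouts` would count the occurrences in the kernel messages of the journal.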

Backdrop:
At least for me, it seems that something changed in 6.2.16-12-pve compared to previous versions, and the r8169 driver has become flaky. Before the upgrade I never even bothered to check which driver my interfaces were using, because everything was rock solid. However, I spent two days trying to get to the bottom of this and found an article here that explains how to get your interfaces working, at least for now.

Since I don't know what might happen to the "medium" article, I will write the steps down here for people searching through this forum.

The fix, at least for the time being, seems to be to use the r8168 driver, which can be built from the "non-free" repositories. To do so:

1. check what controller you've got:
Code:
# lspci -nnk | grep -A2 Ethernet
01:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 15)
        Subsystem: Dell RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [1028:080c]
        Kernel driver in use: r8169
        Kernel modules: r8169
If the card is not physically an RTL8169 but it is using the r8169 driver, this seems to be the culprit.
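The check in step 1 can also be scripted. The following is a sketch only: `list_r8169_nics` is a hypothetical name, and it assumes the `lspci -nnk` layout shown above (device header at the start of the line, details indented below it).

```shell
# Hypothetical helper: read `lspci -nnk` output on stdin and print the PCI
# address and vendor:device ID of every Ethernet controller that is bound to
# the r8169 driver.
list_r8169_nics() {
    awk '
        /^[0-9a-f]/ {
            # A new device header starts here; forget the previous device.
            slot = ""; id = ""
            if ($0 ~ /Ethernet controller/) {
                slot = $1
                # The class code looks like [0200]; only the vendor:device
                # pair contains a colon inside the brackets.
                if (match($0, /\[[0-9a-f]+:[0-9a-f]+\]/))
                    id = substr($0, RSTART + 1, RLENGTH - 2)
            }
        }
        /Kernel driver in use: r8169/ && slot != "" { print slot, id }
    '
}
```

Run it as `lspci -nnk | list_r8169_nics`. A line such as `01:00.0 10ec:8168` would mean an 8168-class chip is being driven by r8169, which is the situation described above.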

2. Add the non-free repos to your Debian apt repositories:
Code:
# cat /etc/apt/sources.list
deb http://ftp.debian.org/debian bookworm main contrib non-free non-free-firmware
deb http://ftp.debian.org/debian bookworm-updates main contrib non-free non-free-firmware
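Before installing, it may help to confirm that the non-free component is really active. A minimal sketch, assuming the classic one-line `deb ...` format shown above (not the newer deb822 `.sources` files); `has_non_free` is a hypothetical name.

```shell
# Hypothetical helper: succeed if the given APT sources file contains at
# least one uncommented "deb" line that enables the non-free component.
# "non-free-firmware" on its own does not count.
has_non_free() {
    grep -Eq '^[[:space:]]*deb[[:space:]].*[[:space:]]non-free([[:space:]]|$)' "$1"
}
```

For example: `has_non_free /etc/apt/sources.list || echo "non-free is not enabled"`.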

3. Update the list of available packages, install the kernel headers, which are needed to build the r8168 driver (it builds during installation), and then install the r8168 driver.
Code:
apt update
apt install pve-headers
apt install r8168-dkms
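Since the module is built against kernel headers, it can be worth checking that headers exist for the kernel that is actually running. This is a sketch under the assumption that the headers package provides the usual `/lib/modules/<release>/build` entry; `headers_present` is a hypothetical name.

```shell
# Hypothetical helper: succeed if kernel headers appear to be installed for
# the given kernel release. Assumes the usual layout in which the headers
# package provides /lib/modules/<release>/build.
headers_present() {
    release=$1
    modules_root=${2:-/lib/modules}   # second argument exists only for testing
    [ -e "$modules_root/$release/build" ]
}
```

For example: `headers_present "$(uname -r)" || echo "no headers found for the running kernel"`.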

4. reboot

5. Check whether the machine is using the new driver:
Code:
# ethtool -i enp1s0
driver: r8168
version: 8.051.02-NAPI
firmware-version:
expansion-rom-version:
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no
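If you want to script this check (for example across several nodes), the driver name can be pulled out of the `ethtool -i` output. A minimal sketch; `driver_from_ethtool` is a hypothetical name and it reads the output from stdin.

```shell
# Hypothetical helper: print the value of the "driver:" field from
# `ethtool -i <interface>` output read on stdin.
driver_from_ethtool() {
    # Split each line on the colon plus any following spaces and print the
    # value of the first "driver" field found.
    awk -F': *' '$1 == "driver" { print $2; exit }'
}
```

For example: `[ "$(ethtool -i enp1s0 | driver_from_ethtool)" = "r8168" ] || echo "still not using r8168"`.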

6. If the driver in use is NOT r8168, you may need to blacklist r8169, but be sure that you have physical access to the machine, as you may lose networking altogether!!! I didn't have to blacklist it; it just worked for me.
Code:
echo blacklist r8169 >> /etc/modprobe.d/blacklist-r8169.conf
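Because `>>` appends every time it is run, repeating the command leaves duplicate lines in the file. A small sketch of an idempotent variant; `blacklist_module` is a hypothetical name, and the file path is passed in explicitly.

```shell
# Hypothetical helper: append "blacklist <module>" to the given modprobe.d
# file only if that exact line is not already present.
blacklist_module() {
    module=$1
    conf=$2
    line="blacklist $module"
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if ! grep -qxF "$line" "$conf" 2>/dev/null; then
        printf '%s\n' "$line" >> "$conf"
    fi
}
```

For example: `blacklist_module r8169 /etc/modprobe.d/blacklist-r8169.conf`.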

Troubleshooting:
The article mentions that for somebody the installation failed at the dkms step, and they had to do the following:
Code:
dkms build r8168/8.051.02
dkms install r8168/8.051.02
modprobe r8168
systemctl restart networking
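The version in the commands above (8.051.02) is the one from my install and may differ on yours. As a sketch, assuming the DKMS source tree lives in the usual `/usr/src/<module>-<version>` directory, the installed version(s) can be listed like this; `r8168_dkms_versions` is a hypothetical name.

```shell
# Hypothetical helper: print the version part of every r8168-<version>
# source directory found under the given root (default: /usr/src).
r8168_dkms_versions() {
    src_root=${1:-/usr/src}
    for dir in "$src_root"/r8168-*/; do
        [ -d "$dir" ] || continue          # the glob did not match anything
        name=${dir%/}                      # drop the trailing slash
        printf '%s\n' "${name##*/r8168-}"  # keep only the version string
    done
}
```

The printed value is what goes after `r8168/` in the `dkms build` and `dkms install` commands.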


To people in charge of 7to8 guide:
Can you please add a warning that people with Realtek network cards using the r8169 driver might experience connectivity problems after the upgrade?
 
Just wanted to thank you for this guide. I would never have found a solution to this error.
No probs. I myself always skip "medium" articles in search results, as they are full of buzzwords trying to rank higher in the results and rarely have a solution to real OS problems ... then I saw somebody somewhere pointing to it, and I realised I'm an idiot ... again :D
 
Thank you! I had the problem that my Proxmox just randomly went inaccessible due to this network error. Your guide worked for me*. Hope now I won't lose network connection again.

* after I unsubscribed the enterprise repo
 
FYI, I'm not encouraging anyone to unsubscribe from the enterprise repo (just in case admins misconstrue it). All the machines with an enterprise subscription that I've had contact with have Intel cards, and only small test (fun) clusters have Realtek cards, hence I can't at this point confirm whether the free/enterprise repos are relevant to the problem.


Edit:
Also, I can confirm that since the week before I made this post everything has been running OK with no network issues, AND when updating the kernel, the driver automatically gets rebuilt as part of the update.

Edit 2:
@apt-flo - can you please provide a bit more detail on the free / enterprise repo issue that you faced? Just an idea: somebody might find it helpful, or it may lead to fixing some stuff in the enterprise repo if that was the problem.
 