System hanging after upgrade...NIC driver?

I've tried installing r8168-dkms as above but it doesn't work properly: I get a zillion messages like this when I bring the device up:
Code:
[Mon Jun 26 12:05:49 2023] r8168 0000:01:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[Mon Jun 26 12:05:49 2023] r8168 0000:01:00.0:   device [10ec:8168] error status/mask=00100000/00400000
[Mon Jun 26 12:05:49 2023] r8168 0000:01:00.0:    [20] UnsupReq               (First)
[Mon Jun 26 12:05:49 2023] r8168 0000:01:00.0: AER:   TLP Header: 40000001 00000003 8050403c 00000000
[Mon Jun 26 12:05:49 2023] r8168 0000:01:00.0: AER: can't recover (no error_detected callback)
[Mon Jun 26 12:05:49 2023] pcieport 0000:00:1c.0: AER: device recovery failed
[Mon Jun 26 12:05:49 2023] pcieport 0000:00:1c.0: AER: Multiple Uncorrected (Non-Fatal) error received: 0000:01:00.0
Curiously though I can assign an address and ssh into it.

I found this https://bbs.archlinux.org/viewtopic.php?id=285421 which coincidentally was using the same mini-pc as mine.

As suggested I added r8168.aspm=0 r8168.eee_enable=0 pcie_aspm=off loglevel=3 to the kernel args in grub and it's all gone very quiet and seems to work. I did this by rebooting, but I would think setting appropriate module options would achieve the same without having to restart (if only I knew how).
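
On the question of doing this without a reboot: the two r8168.* entries are module parameters, so they can also be set in a modprobe.d file and should take effect when the module is reloaded. pcie_aspm=off and loglevel=3 are kernel parameters rather than r8168 module options, so this file does not cover them. A minimal sketch, assuming the installed r8168 module accepts aspm and eee_enable as used above; the file name r8168-options.conf and the helper function are made up for illustration:

```shell
# Hypothetical helper: write the r8168 options into a modprobe.d directory.
# $1 is the target directory (normally /etc/modprobe.d).
write_r8168_options() {
    conf_dir="$1"
    # Same values as r8168.aspm=0 r8168.eee_enable=0 on the kernel command line.
    printf 'options r8168 aspm=0 eee_enable=0\n' > "$conf_dir/r8168-options.conf"
}

# Usage, as root and preferably from the local console, since reloading
# the module takes the NIC down for a moment:
#   write_r8168_options /etc/modprobe.d
#   modprobe -r r8168 && modprobe r8168
```

If the module is also loaded from the initramfs, the initramfs probably needs regenerating before the options apply at boot.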
 
Fairly new user to Proxmox here, having run 7.4 for a while and deciding to try the update to 8. Initially the update seemed to work fine and my VM ran ok, but the following day the system had hung and upon connecting a monitor to the host, I could see several lines relating to my Realtek NIC - similar to the other pictures posted earlier in this thread. My system is a Dell 3060 with Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller.
No issues at all running PVE 7.4. Has anyone raised this as a bug report yet? I had a search but couldn't see anything when searching for r8169, but I haven't used that site before and I probably don't have the knowledge to report it with all the relevant information.
 
Same issue here after upgrading to Proxmox VE 8. Hangs once a day, must be rebooted to get it working again.
Activated r8168 driver for now.

Realtek NIC:
Code:
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
        Subsystem: Hewlett-Packard Company RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
 
Hi,
If anyone has some spare time for some help I’d greatly appreciate some.
I added the non-free repo and was then able to install r8168 seemingly fine; however, modprobe r8168 tells me that there is no such module. I have purged and tried reinstalling a couple of times but no luck. I have noticed that the logs in the apt install do say “Module build for kernel 6.2.16-3-pve skipped since the kernel headers for this kernel do not seem to be installed”, but I don’t seem to be able to install those either.
Thanks

Edit:
After fixing a typo in the non-register repo I was able to get r8168 installed; unfortunately enp1s0 is no longer found and the network is down.

Edit 2:
After a good turning it off and on again it is now working fine, thanks for all the comments here that helped me along the way :)
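
The "Module build ... skipped" message quoted above means DKMS had no headers to build against, so there was no r8168 module for modprobe to find. A rough sketch of how one might check for that, assuming the headers package is named pve-headers-<kernel version> (this naming is an assumption, so check what apt offers on your system); both function names are invented for this example:

```shell
# Hypothetical helpers for checking DKMS prerequisites.

# Print the assumed headers package name for a kernel version string.
headers_pkg() {
    printf 'pve-headers-%s\n' "$1"
}

# Succeed if a build tree for kernel version $1 exists under the
# modules root $2 (normally /lib/modules).
have_headers() {
    [ -e "$2/$1/build" ]
}

# Usage, as root:
#   kver=$(uname -r)
#   have_headers "$kver" /lib/modules || apt install "$(headers_pkg "$kver")"
#   dkms autoinstall      # rebuild registered DKMS modules for the running kernel
#   modprobe r8168
```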
 
The following repositories must be active to install the r8168-dkms driver package:

Code:
deb http://ftp.ca.debian.org/debian bookworm main contrib non-free non-free-firmware
deb http://ftp.ca.debian.org/debian bookworm-updates main contrib non-free non-free-firmware
deb http://security.debian.org bookworm-security main contrib non-free non-free-firmware
deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription

The following configuration (which is applied by installing package r8168-dkms) takes care of loading the r8168 driver instead of the r8169 driver.
That's why it requires a reboot.

Code:
root@proxmox04:/etc/modprobe.d# cat r8168-dkms.conf
# settings for r8168-dkms

# map the specific PCI IDs instead of blacklisting the whole r8169 module
alias   pci:v00001186d00004300sv00001186sd00004B10bc*sc*i*      r8168
alias   pci:v000010ECd00008168sv*sd*bc*sc*i*                    r8168

# if the aliases above do not work, uncomment the following line
# to blacklist the whole r8169 module
#blacklist r8169

It has been running perfectly fine for a few days now, with no hiccups, crashes or other errors anymore.
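
After the reboot it is worth confirming that the alias really took effect and the NIC is bound to r8168 rather than r8169. A small sketch that pulls the driver name out of lspci -k output; the function name is made up, and 10ec:8168 is the vendor:device ID from the alias line above:

```shell
# Hypothetical helper: read `lspci -k` output on stdin and print the
# value of each "Kernel driver in use:" line.
driver_in_use() {
    awk -F': ' '/Kernel driver in use/ { print $2 }'
}

# Usage:
#   lspci -k -d 10ec:8168 | driver_in_use
```

If this still prints r8169, the commented-out blacklist line in r8168-dkms.conf is the fallback the package itself suggests.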
 
I have a similar issue, but the 8169 firmware fails completely. I now have no network connection at all. Does anyone have suggestions on how to install the r8168-dkms package without any network connection?

I tried installing the package from a usb stick, but that didn't work, maybe because of dependencies? The system is still using r8169
 
Since r8169 works fine under kernel 5.15.108-1, you can start PVE with kernel 5.15 and then install r8168-dkms online.
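
If an older kernel is still installed, it can be picked at the boot menu, or pinned so it stays the default until the driver is sorted out. A sketch under the assumption that proxmox-boot-tool on your install supports the kernel pin subcommand (it should on recent PVE releases, but check on your system); the helper that lists kernels from /boot is made up for this example:

```shell
# Hypothetical helper: list kernel versions that have an image in the
# given boot directory (normally /boot), one per line.
installed_kernels() {
    for img in "$1"/vmlinuz-*; do
        [ -e "$img" ] || continue      # no match: the glob stays literal
        printf '%s\n' "${img##*/vmlinuz-}"
    done
}

# Usage, as root:
#   installed_kernels /boot
#   proxmox-boot-tool kernel pin 5.15.108-1-pve   # only if that version is listed
#   reboot
```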
 
Same issue here. I just upgraded one of my nodes with a Realtek NIC to 8.0.3 and was bitten by this bug:
Code:
Jul 24 20:37:56 thinmox kernel: ------------[ cut here ]------------
Jul 24 20:37:56 thinmox kernel: NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
Jul 24 20:37:56 thinmox kernel: WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:525 dev_watchdog+0x23a/0x250
Jul 24 20:37:56 thinmox kernel: Modules linked in: cmac nls_utf8 cifs cifs_arc4 rdma_cm iw_cm ib_cm ib_core cifs_md4 rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache netfs ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter bpfilter sctp ip6_udp_tunnel udp_tunnel scsi_transport_iscsi nf_tables 8021q garp mrp bonding tls softdog sunrpc nfnetlink_log nfnetlink binfmt_misc snd_hda_codec_hdmi snd_sof_pci_intel_apl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils soundwire_bus snd_soc_avs intel_rapl_msr snd_hda_codec_realtek intel_rapl_common snd_soc_hda_codec intel_pmc_bxt snd_hda_codec_generic intel_telemetry_pltdrv intel_punit_ipc snd_soc_skl intel_telemetry_core snd_soc_hdac_hda x86_pkg_temp_thermal intel_powerclamp snd_hda_ext_core snd_soc_sst_ipc snd_soc_sst_dsp snd_soc_acpi_intel_match snd_soc_acpi kvm_intel snd_soc_core mei_pxp mei_hdcp
Jul 24 20:37:56 thinmox kernel:  snd_compress ac97_bus snd_pcm_dmaengine i915 kvm snd_hda_intel dell_wmi irqbypass ledtrig_audio drm_buddy crct10dif_pclmul polyval_generic snd_intel_dspcfg ghash_clmulni_intel snd_intel_sdw_acpi sha512_ssse3 dell_smbios aesni_intel crypto_simd dcdbas ttm snd_hda_codec cryptd rapl sparse_keymap dell_wmi_descriptor intel_cstate wmi_bmof drm_display_helper snd_hda_core cdc_acm ee1004 pcspkr snd_hwdep snd_pcm cec snd_timer rc_core ucsi_acpi snd mei_me typec_ucsi drm_kms_helper typec soundcore mei i2c_algo_bit syscopyarea sysfillrect sysimgblt mac_hid vhost_net vhost vhost_iotlb tap coretemp drm efi_pstore dmi_sysfs ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs blake2b_generic xor raid6_pq libcrc32c simplefb mmc_block sdhci_pci r8169 cqhci xhci_pci ahci xhci_pci_renesas crc32_pclmul sdhci intel_lpss_pci libahci i2c_i801 intel_lpss i2c_smbus idma64 xhci_hcd realtek video wmi pinctrl_geminilake
Jul 24 20:37:56 thinmox kernel: CPU: 1 PID: 0 Comm: swapper/1 Tainted: P           O       6.2.16-4-pve #1
Jul 24 20:37:56 thinmox kernel: Hardware name: Dell Inc. Wyse 5070 Thin Client/0TKM9Y, BIOS 1.7.1 07/30/2020
Jul 24 20:37:56 thinmox kernel: RIP: 0010:dev_watchdog+0x23a/0x250
Jul 24 20:37:56 thinmox kernel: Code: 00 e9 2b ff ff ff 48 89 df c6 05 4a 6c 7d 01 01 e8 6b 08 f8 ff 44 89 f1 48 89 de 48 c7 c7 98 64 80 a2 48 89 c2 e8 86 a6 30 ff <0f> 0b e9 1c ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00
Jul 24 20:37:56 thinmox kernel: RSP: 0018:ffffac1d0011ce38 EFLAGS: 00010246
Jul 24 20:37:56 thinmox kernel: RAX: 0000000000000000 RBX: ffff9db8d1bf8000 RCX: 0000000000000000
Jul 24 20:37:56 thinmox kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Jul 24 20:37:56 thinmox kernel: RBP: ffffac1d0011ce68 R08: 0000000000000000 R09: 0000000000000000
Jul 24 20:37:56 thinmox kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff9db8d1bf84c8
Jul 24 20:37:56 thinmox kernel: R13: ffff9db8d1bf841c R14: 0000000000000000 R15: 0000000000000000
Jul 24 20:37:56 thinmox kernel: FS:  0000000000000000(0000) GS:ffff9dbc2fc80000(0000) knlGS:0000000000000000
Jul 24 20:37:56 thinmox kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 24 20:37:56 thinmox kernel: CR2: 00007f3c67ae8000 CR3: 0000000255610000 CR4: 0000000000352ee0
Jul 24 20:37:56 thinmox kernel: Call Trace:
Jul 24 20:37:56 thinmox kernel:  <IRQ>
Jul 24 20:37:56 thinmox kernel:  ? __pfx_dev_watchdog+0x10/0x10
Jul 24 20:37:56 thinmox kernel:  call_timer_fn+0x29/0x160
Jul 24 20:37:56 thinmox kernel:  ? __pfx_dev_watchdog+0x10/0x10
Jul 24 20:37:56 thinmox kernel:  __run_timers+0x259/0x310
Jul 24 20:37:56 thinmox kernel:  run_timer_softirq+0x1d/0x40
Jul 24 20:37:56 thinmox kernel:  __do_softirq+0xd6/0x346
Jul 24 20:37:56 thinmox kernel:  ? hrtimer_interrupt+0x11f/0x250
Jul 24 20:37:56 thinmox kernel:  __irq_exit_rcu+0xa2/0xd0
Jul 24 20:37:56 thinmox kernel:  irq_exit_rcu+0xe/0x20
Jul 24 20:37:56 thinmox kernel:  sysvec_apic_timer_interrupt+0x92/0xd0
Jul 24 20:37:56 thinmox kernel:  </IRQ>
Jul 24 20:37:56 thinmox kernel:  <TASK>
Jul 24 20:37:56 thinmox kernel:  asm_sysvec_apic_timer_interrupt+0x1b/0x20
Jul 24 20:37:56 thinmox kernel: RIP: 0010:cpuidle_enter_state+0xde/0x6f0
Jul 24 20:37:56 thinmox kernel: Code: 27 57 5e e8 d4 79 4a ff 8b 53 04 49 89 c7 0f 1f 44 00 00 31 ff e8 02 82 49 ff 80 7d d0 00 0f 85 eb 00 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 12 02 00 00 4d 63 ee 49 83 fd 09 0f 87 c7 04 00 00
Jul 24 20:37:56 thinmox kernel: RSP: 0018:ffffac1d000cfe38 EFLAGS: 00000246
Jul 24 20:37:56 thinmox kernel: RAX: 0000000000000000 RBX: ffff9dbc2fcbd900 RCX: 0000000000000000
Jul 24 20:37:56 thinmox kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
Jul 24 20:37:56 thinmox kernel: RBP: ffffac1d000cfe88 R08: 0000000000000000 R09: 0000000000000000
Jul 24 20:37:56 thinmox kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa32c33a0
Jul 24 20:37:56 thinmox kernel: R13: 0000000000000004 R14: 0000000000000004 R15: 000007aa5d5e4b8f
Jul 24 20:37:56 thinmox kernel:  ? cpuidle_enter_state+0xce/0x6f0
Jul 24 20:37:56 thinmox kernel:  cpuidle_enter+0x2e/0x50
Jul 24 20:37:56 thinmox kernel:  do_idle+0x216/0x2a0
Jul 24 20:37:56 thinmox kernel:  cpu_startup_entry+0x1d/0x20
Jul 24 20:37:56 thinmox kernel:  start_secondary+0x122/0x160
Jul 24 20:37:56 thinmox kernel:  secondary_startup_64_no_verify+0xe5/0xeb
Jul 24 20:37:56 thinmox kernel:  </TASK>
Jul 24 20:37:56 thinmox kernel: ---[ end trace 0000000000000000 ]---
 
After downgrading the PVE 8 kernel to 5.15.108-1, the network adapter (r8169) works fine.

Code:
root@PVE:~# uname -a
Linux PVE 5.15.108-1-pve #1 SMP PVE 5.15.108-1 (2023-06-17T09:41Z) x86_64 GNU/Linux

root@PVE:~# uptime
04:37:15 up 18:56,  1 user,  load average: 0.22, 0.27, 0.29

root@PVE:~# lsmod | grep 816
r8169                 102400  0

How did you get 5.15 installed on Proxmox 8.0?
 
After upgrading to 8, accessing my pi-hole web interface would cause my entire system to crash, requiring a hard reset. The pi-hole would still work for DNS. When looking at the syslog I see:
Code:
Jun 22 16:15:59 prox kernel: NETDEV WATCHDOG: enp6s0 (r8169): transmit queue 0 timed out
Jun 22 16:15:59 prox kernel: WARNING: CPU: 26 PID: 0 at net/sched/sch_generic.c:525 dev_watchdog+0x23a/0x250....'a bunch of information'
....Jun 22 16:16:21 prox kernel: r8169 0000:06:00.0 enp6s0: rtl_chipcmd_cond == 1 (loop: 100, delay: 100).

The pi-hole was using a VLAN that was plugged into the 2.5G port on my motherboard. After changing to a different NIC the problem stopped. Is this a bug with the driver for my 2.5G NIC (Dragon RTL8125AG)?

Edited to add NIC name. I could also supply more of my syslog that I omitted if that would be helpful.
Hi, I have the same network card and Proxmox 8.
Did you solve this problem? Thank you.
 
I'm sad to report that the automated upgrade to opt-in kernel 6.2.16-4-bpo11-pve on Proxmox 7.4 (Debian 11.7) caused this bug to bite ... so it doesn't just affect Proxmox 8.

Also, installing r8168-dkms alone doesn't help, because on Proxmox 7.4 the build fails for the above kernel.

Symptoms are essentially identical to what's posted in this thread: the network dies after the box has been up for a few minutes, with the same spew in the logs and on the console as others have pasted. I booted with the prior kernel (6.2.11-2) and so far it is fine. The box was stable with an uptime of ~70 days on the prior kernel, and it never had problems like this before.
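
To check whether a box is hitting the same failure, the kernel log can be searched for the watchdog line shown earlier in the thread. A minimal sketch; the function name is invented, and the pattern is taken from the log excerpts above:

```shell
# Hypothetical helper: read kernel log text on stdin and print how many
# r8169 transmit-queue timeouts it contains.
count_tx_timeouts() {
    # grep -c prints 0 but exits non-zero when nothing matches; "|| true"
    # keeps the helper from failing in that case.
    grep -c 'NETDEV WATCHDOG: .*(r8169): transmit queue [0-9]* timed out' || true
}

# Usage:
#   journalctl -k -b | count_tx_timeouts        # current boot
#   journalctl -k -b -1 | count_tx_timeouts     # previous boot, if the journal is persistent
```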
 