After upgrading, kernel 6.5.11-4-pve does not boot.

rec0veryyy

New Member
Oct 27, 2022
I updated, and now the new kernel 6.5.11-4-pve does not boot. If I manually select the previous 6.2 kernel from the GRUB menu, it does boot. I have seen that this has happened to other people as well. I have a T8 Plus mini PC with an Intel N100 and dual Realtek 8169 Ethernet ports. A month ago I followed this guide
Code:
https://gist.github.com/SQLJames/fe6fcd5e819d864986ce2eff6ad350da
to blacklist r8169 and use r8168 instead, because the Realtek NIC lost its connection after 2 or 3 days of operation. Since following that guide, the server has been running for a month without any dropouts. What I want to know is whether what I did in that guide could be the reason kernel 6.5 does not start. For what it's worth, my /etc/default/grub contains:

Bash:
GRUB_CMDLINE_LINUX="r8168.aspm=0 r8168.eee_enable=0 pcie_aspm=off loglevel=3"

I hope someone who knows about this can please help me, thank you all very much.

PS: I also ran this command so that at least kernel 6.2 boots by default:

Code:
proxmox-boot-tool kernel pin 6.2.16-19-pve

--> Package versions

Boot Mode in GUI: EFI

Code:
proxmox-ve: not correctly installed (running kernel: 6.2.16-19-pve)
pve-manager: 8.1.3 (running version: 8.1.3/b46aac3b42da5d15)
proxmox-kernel-helper: 8.0.9
pve-kernel-6.2: 8.0.5
proxmox-kernel-6.2.16-19-pve: 6.2.16-19
proxmox-kernel-6.2: 6.2.16-19
proxmox-kernel-6.2.16-18-pve: 6.2.16-18
proxmox-kernel-6.2.16-15-pve: 6.2.16-15
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph-fuse: 17.2.6-pve1+3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx7
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.7
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.1.0
libpve-guest-common-perl: 5.0.6
libpve-http-server-perl: 5.0.5
libpve-network-perl: 0.9.4
libpve-rs-perl: 0.8.7
libpve-storage-perl: 8.0.5
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.0.4-1
proxmox-backup-file-restore: 3.0.4-1
proxmox-kernel-helper: 8.0.9
proxmox-mail-forward: 0.2.2
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.1.3
pve-cluster: 8.0.5
pve-container: 5.0.8
pve-docs: 8.1.3
pve-edk2-firmware: 4.2023.08-1
pve-firewall: 5.0.3
pve-firmware: 3.9-1
pve-ha-manager: 4.0.3
pve-i18n: 3.1.2
pve-qemu-kvm: 8.1.2-4
pve-xtermjs: 5.3.0-2
qemu-server: 8.0.10
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.0-pve3
 
I got the exact same problem here today. I upgraded one of the two nodes (just to be sure only one at a time :-)).
After booting, the display didn't show anything and all I could do was power off the machine.
I restarted the machine with the old kernel and also pinned it to be sure.

I also use the dkms-package for the Realtek NIC in my machine.

During the upgrade I saw a message about compiling this DKMS driver; it said to look in /var/lib/dkms/r8168/8.051.02/build/make.log

Here I saw these messages:

Code:
DKMS make.log for r8168-8.051.02 for kernel 6.5.11-4-pve (x86_64)
Sat Nov 25 11:32:57 CET 2023
make: Entering directory '/usr/src/linux-headers-6.5.11-4-pve'
  CC [M]  /var/lib/dkms/r8168/8.051.02/build/r8168_n.o
  CC [M]  /var/lib/dkms/r8168/8.051.02/build/r8168_asf.o
  CC [M]  /var/lib/dkms/r8168/8.051.02/build/rtl_eeprom.o
  CC [M]  /var/lib/dkms/r8168/8.051.02/build/rtltool.o
/var/lib/dkms/r8168/8.051.02/build/r8168_n.c: In function ‘r8168_csum_workaround’:
/var/lib/dkms/r8168/8.051.02/build/r8168_n.c:29208:24: error: implicit declaration of function ‘skb_gso_segment’; did you mean ‘skb_gso_reset’? [-Werror=implicit-function-declaration]
29208 |                 segs = skb_gso_segment(skb, features);
      |                        ^~~~~~~~~~~~~~~
      |                        skb_gso_reset
/var/lib/dkms/r8168/8.051.02/build/r8168_n.c:29208:22: warning: assignment to ‘struct sk_buff *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
29208 |                 segs = skb_gso_segment(skb, features);
      |                      ^
cc1: some warnings being treated as errors
make[2]: *** [scripts/Makefile.build:251: /var/lib/dkms/r8168/8.051.02/build/r8168_n.o] Error 1
make[1]: *** [/usr/src/linux-headers-6.5.11-4-pve/Makefile:2039: /var/lib/dkms/r8168/8.051.02/build] Error 2
make: *** [Makefile:234: __sub-make] Error 2
make: Leaving directory '/usr/src/linux-headers-6.5.11-4-pve'

As you can see, compiling the driver for the new kernel failed. I think this is also the issue in your case.
It seems the DKMS package doesn't have a version that supports the new kernel yet.
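
To pull just the failing lines out of a long DKMS build log like the one above, a small helper along these lines may be enough. This is only a sketch: the function name `dkms_build_errors` is made up for illustration, and the log path is the one reported during this particular upgrade.

```shell
# Print only the compiler/make error lines from a DKMS make.log.
# dkms_build_errors is a hypothetical helper name, not part of dkms itself.
# Usage: dkms_build_errors /var/lib/dkms/r8168/8.051.02/build/make.log
dkms_build_errors() {
    log="$1"
    [ -r "$log" ] || { echo "cannot read $log" >&2; return 2; }
    # Match gcc "error:" lines and make's "*** ... Error N" lines
    grep -E 'error:|\*\*\* .*Error [0-9]+' "$log"
}
```

If this prints anything for the new kernel, the DKMS module was most likely not built for it.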
 
I think I resolved the problem after reading some messages on this forum.

The DKMS package doesn't support this new Linux kernel. I removed the r8168-dkms package and now use the kernel-delivered r8169 driver again (I read that the problems with this driver are gone in the new kernel).

So this is what I did:

I installed a software package which was mentioned missing during the upgrade:
Bash:
apt install grub-efi-amd64

Then I unpinned the kernel (I had pinned it to the last working kernel release):
Bash:
proxmox-boot-tool kernel unpin

And after that (that was a little scary) I removed the dkms-package:
Bash:
apt remove r8168-dkms

After this I rebooted the machine and it came back running Proxmox 8.1 and Linux kernel 6.5.

This now prevents me from returning to the old kernel. If this keeps running stable (without the continuous reboots I had when running Proxmox 8.0 and Linux kernel 6.2), I think the problem is fixed.

At least the node is now rebooted and runs the new software.
Fingers crossed for now!

I hope this helps you and others to a solution for their problems.
 
It worked! Thank you very much. But now I have another problem: if I boot kernel 6.5 I have no network access. Pinging google.es gets no reply, and I cannot reach the web interface because there is no connection. If I boot 6.2 from GRUB instead, I have internet and can access the interface.

My mini PC has 2 Ethernet ports. With kernel 6.5, the port I currently use (port 1) does not light up, but the second one does. With kernel 6.2, both ports light up. What could I do? Can you think of anything?
 
I think the driver for this card isn't loaded or doesn't function right.
Can you see if the card is functioning or active within the Proxmox configuration?

If you can only see it in the configuration when you boot with 6.2 and not in 6.5, then it's definitely something with the driver. Or maybe the "auto start" option is not set?
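
One way to check from a shell whether a driver module is loaded at all is to look it up in /proc/modules. The helper below is a rough sketch; `module_loaded` is a made-up name, and it assumes the driver is built as a loadable module (a driver compiled into the kernel would not be listed there).

```shell
# Succeed if the named module appears in a /proc/modules-style listing.
# module_loaded is a hypothetical helper name used for illustration.
# Usage: module_loaded r8169 [FILE]   (FILE defaults to /proc/modules)
module_loaded() {
    # The first column of each line is the module name
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}
```

For example, `module_loaded r8169 && echo loaded || echo "not loaded"` run once under each kernel would show whether the module is present in one and missing in the other.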
 
Same here. I have an HP ProDesk 600 with a Realtek NIC and tried the above... everything only works with an older kernel.
A clean install also doesn't fix the problem; the driver is no longer supported by default...
 
My system has network devices which use the r8169 module. Maybe your Realtek NICs need another driver?

Code:
lspci -k

01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
    Subsystem: Hewlett-Packard Company RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
    Kernel driver in use: r8169
    Kernel modules: r8169
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 04)
    Subsystem: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller
    Kernel driver in use: r8169
    Kernel modules: r8169
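
To reduce output like this to just the Ethernet controllers and the driver bound to each, the listing can be piped through a small filter. This is only a sketch, with `nic_drivers` being a made-up name; it relies on the `lspci -k` layout shown above (device line at column 0, indented "Kernel driver in use:" line below it).

```shell
# Read `lspci -k` output on stdin and print "<PCI address> <driver>" for
# each Ethernet controller that has a driver bound.
# nic_drivers is a hypothetical helper name used for illustration.
nic_drivers() {
    awk '
        # Device header line: remember the address only for Ethernet controllers
        /^[0-9a-f]/ { addr = ($0 ~ /Ethernet controller/) ? $1 : ""; next }
        # Indented detail line belonging to the remembered device
        addr != "" && /Kernel driver in use:/ { print addr, $NF }
    '
}
```

Usage would be `lspci -k | nic_drivers`. A controller with no driver bound prints nothing, which is itself a hint that the module is missing or blacklisted.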
 
I have the same problem.

Here is my thread in German:
https://forum.proxmox.com/threads/nach-klick-auf-_-upgrade-bootet-proxmox-nicht-mehr.137084/

I did the following:

Code:
apt install grub-efi-amd64
# Kernel 6.2.16-19-pve unpin
proxmox-boot-tool kernel unpin
apt remove r8168-dkms

Now the network driver is missing in kernel 6.5.11-4-pve.
How can I reactivate the new r8169 driver?
 
I had the r8169 module blacklisted before in /etc/modprobe.d/r8168-dkms.conf

Now I commented out everything concerning this driver and it looks like this:

Code:
# settings for r8168-dkms

# map the specific PCI IDs instead of blacklisting the whole r8169 module
#alias    pci:v00001186d00004300sv00001186sd00004B10bc*sc*i*    r8168
#alias    pci:v000010ECd00008168sv*sd*bc*sc*i*            r8168

# if the aliases above do not work, uncomment the following line
# to blacklist the whole r8169 module
#blacklist r8169

So look for anything concerning r8169 and r8168 and deactivate all of it. This worked for me: my Realtek NICs now use the new r8169 driver and it still works (after more than 12 hours!)
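
To find any remaining active (uncommented) lines that mention these drivers, something like the following could be used. It is a sketch only: `active_realtek_rules` is a made-up name, and it only looks at `*.conf` files in the given directory.

```shell
# List active (uncommented) lines mentioning r8168 or r8169 in a modprobe.d directory.
# active_realtek_rules is a hypothetical helper name used for illustration.
# Usage: active_realtek_rules [DIR]   (DIR defaults to /etc/modprobe.d)
active_realtek_rules() {
    dir="${1:-/etc/modprobe.d}"
    # -H prefixes each hit with its file name; the pattern skips lines
    # whose first non-blank character is '#'
    grep -HE '^[[:space:]]*[^#[:space:]].*r816[89]' "$dir"/*.conf 2>/dev/null
}
```

Empty output means no blacklist or alias line for either driver is still active in that directory.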
 
Thank you as well!
I also had to comment out the /etc/modprobe.d/r8168-dkms.conf file, but I had another file called /etc/modprobe.d/blacklist-r8169.conf that contained one line which I also commented out. It now looks like:
Code:
#blacklist r8169
Until I did this I had no network.
 
Yes, with that line active you specifically blacklist the "default" r8169 driver, which is the one you want to use.
So a good find and good to hear that it also works for you now!
 
