Upgrading from 7 to 8 Kernel issue

azakaria

Feb 19, 2024
Hi, I have a cluster of 4 nodes running Debian 11 (kernel 5.15.143-1) and PVE 7.4-17.
I decided to upgrade to Debian 12 and PVE 8.1, following the official guide: https://pve.proxmox.com/wiki/Upgrade_from_7_to_8

I followed the steps, but the system would not boot with kernel 6.5.


[Attachment: Image (4) (Medium).jpg]

So I booted with an older kernel (5.15.143-1-pve) and was able to log in. When I ran dpkg --configure -a again, I got the following:
Code:
dpkg --configure -a
Setting up proxmox-kernel-6.5.13-5-pve-signed (6.5.13-5) ...
Examining /etc/kernel/postinst.d.
run-parts: executing /etc/kernel/postinst.d/dkms 6.5.13-5-pve /boot/vmlinuz-6.5.13-5-pve
dkms: running auto installation service for kernel 6.5.13-5-pve.
Sign command: /lib/modules/6.5.13-5-pve/build/scripts/sign-file
Signing key: /var/lib/dkms/mok.key
Public certificate (MOK): /var/lib/dkms/mok.pub

Building module:
Cleaning build area...(bad exit status: 2)
make -j32 KERNELRELEASE=6.5.13-5-pve all KPVER=6.5.13-5-pve....(bad exit status: 2)
Error! Bad return status for module build on kernel: 6.5.13-5-pve (x86_64)
Consult /var/lib/dkms/kernel-mft-dkms/4.12.0/build/make.log for more information.
Error! One or more modules failed to install during autoinstall.
Refer to previous errors for more information.
dkms: autoinstall for kernel: 6.5.13-5-pve failed!
run-parts: /etc/kernel/postinst.d/dkms exited with return code 11
Failed to process /etc/kernel/postinst.d at /var/lib/dpkg/info/proxmox-kernel-6.5.13-5-pve-signed.postinst line 20.
dpkg: error processing package proxmox-kernel-6.5.13-5-pve-signed (--configure):
 installed proxmox-kernel-6.5.13-5-pve-signed package post-installation script subprocess returned error exit status 2
dpkg: dependency problems prevent configuration of proxmox-kernel-6.5:
 proxmox-kernel-6.5 depends on proxmox-kernel-6.5.13-5-pve-signed | proxmox-kernel-6.5.13-5-pve; however:
  Package proxmox-kernel-6.5.13-5-pve-signed is not configured yet.
  Package proxmox-kernel-6.5.13-5-pve is not installed.
  Package proxmox-kernel-6.5.13-5-pve-signed which provides proxmox-kernel-6.5.13-5-pve is not configured yet.

dpkg: error processing package proxmox-kernel-6.5 (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of proxmox-default-kernel:
 proxmox-default-kernel depends on proxmox-kernel-6.5; however:
  Package proxmox-kernel-6.5 is not configured yet.

dpkg: error processing package proxmox-default-kernel (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of proxmox-ve:
 proxmox-ve depends on proxmox-default-kernel; however:
  Package proxmox-default-kernel is not configured yet.

dpkg: error processing package proxmox-ve (--configure):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 proxmox-kernel-6.5.13-5-pve-signed
 proxmox-kernel-6.5
 proxmox-default-kernel
 proxmox-ve

This is the content of /var/lib/dkms/kernel-mft-dkms/4.12.0/build/make.log:

Code:
DKMS make.log for kernel-mft-dkms-4.12.0 for kernel 6.5.13-5-pve (x86_64)
Mon Apr 22 05:32:29 PM EDT 2024
/bin/sh: 1: Syntax error: Unterminated quoted string
/bin/sh: 1: [: -lt: unexpected operator
make -C /lib/modules/6.5.13-5-pve/build M=/var/lib/dkms/kernel-mft-dkms/4.12.0/build CONFIG_CTF= CONFIG_CC_STACKPROTECTOR_STRONG=  modules
make[1]: warning: jobserver unavailable: using -j1.  Add '+' to parent make rule.
make[1]: Entering directory '/usr/src/linux-headers-6.5.13-5-pve'
/bin/sh: 1: Syntax error: Unterminated quoted string
/bin/sh: 1: [: -lt: unexpected operator
  CC [M]  /var/lib/dkms/kernel-mft-dkms/4.12.0/build/mst_pci.o
  CC [M]  /var/lib/dkms/kernel-mft-dkms/4.12.0/build/mst_pciconf.o
/var/lib/dkms/kernel-mft-dkms/4.12.0/build/mst_pciconf.c: In function ‘close_dma’:
/var/lib/dkms/kernel-mft-dkms/4.12.0/build/mst_pciconf.c:587:13: error: implicit declaration of function ‘pci_unmap_single’; did you mean ‘dma_unmap_single’? [-Werror=implicit-function-declaration]
  587 |             pci_unmap_single(dev->pci_dev, dev->dma_props[i].dma_map, DMA_MBOX_SIZE, DMA_BIDIRECTIONAL);
      |             ^~~~~~~~~~~~~~~~
      |             dma_unmap_single
/var/lib/dkms/kernel-mft-dkms/4.12.0/build/mst_pciconf.c: In function ‘ioctl.isra’:
/var/lib/dkms/kernel-mft-dkms/4.12.0/build/mst_pciconf.c:1034:1: warning: the frame size of 1136 bytes is larger than 1024 bytes [-Wframe-larger-than=]
 1034 | }
      | ^
cc1: some warnings being treated as errors
make[3]: *** [scripts/Makefile.build:251: /var/lib/dkms/kernel-mft-dkms/4.12.0/build/mst_pciconf.o] Error 1
make[2]: *** [/usr/src/linux-headers-6.5.13-5-pve/Makefile:2039: /var/lib/dkms/kernel-mft-dkms/4.12.0/build] Error 2
make[1]: *** [Makefile:234: __sub-make] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-6.5.13-5-pve'
make: *** [Makefile:68: all] Error 2
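
The log above is long, but the line that matters is the compiler error: the kernel-mft-dkms source calls pci_unmap_single, which the 6.5.13-5-pve headers apparently no longer declare. A minimal sketch for pulling just the compiler errors out of a DKMS make.log is shown here; the function name dkms_build_errors is made up for illustration and is not part of dkms.

```shell
#!/bin/sh
# Hypothetical helper (not part of dkms): print the compiler error lines,
# prefixed with their line numbers, from a DKMS make.log passed as $1.
# It matches gcc-style ": error: " and ": fatal error: " messages only,
# so make's own "Error 2" summary lines are not included.
dkms_build_errors() {
    grep -nE ': (fatal )?error: ' "$1"
}

# Example use on the node:
#   dkms_build_errors /var/lib/dkms/kernel-mft-dkms/4.12.0/build/make.log
```

This only narrows the log down to the failing source lines; it does not say whether a newer version of the module would build.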

pve7to8 output:

Code:
pve7to8 --full
= CHECKING VERSION INFORMATION FOR PVE PACKAGES =

Checking for package updates..
PASS: all packages up-to-date

Checking proxmox-ve package version..
PASS: already upgraded to Proxmox VE 8

Checking running kernel version..
WARN: unexpected running and installed kernel '5.15.143-1-pve'.

= CHECKING CLUSTER HEALTH/SETTINGS =

PASS: systemd unit 'pve-cluster.service' is in state 'active'
PASS: systemd unit 'corosync.service' is in state 'active'
PASS: Cluster Filesystem is quorate.

Analzying quorum settings and state..
INFO: configured votes - nodes: 4
INFO: configured votes - qdevice: 0
INFO: current expected votes: 4
INFO: current total votes: 4

Checking nodelist entries..
PASS: nodelist settings OK

Checking totem settings..
PASS: totem settings OK

INFO: run 'pvecm status' to get detailed cluster status..

= CHECKING HYPER-CONVERGED CEPH STATUS =

INFO: hyper-converged ceph setup detected!
INFO: getting Ceph status/health information..
PASS: Ceph health reported as 'HEALTH_OK'.
INFO: checking local Ceph version..
PASS: found expected Ceph 17 Quincy release.
INFO: getting Ceph daemon versions..
PASS: single running version detected for daemon type monitor.
INFO: different builds of same version detected for an monitor. Are you in the middle of the upgrade?
PASS: single running version detected for daemon type manager.
INFO: different builds of same version detected for an manager. Are you in the middle of the upgrade?
SKIP: unable to determine versions of running Ceph MDS instances.
PASS: single running version detected for daemon type OSD.
INFO: different builds of same version detected for an OSD. Are you in the middle of the upgrade?
WARN: 'noout' flag not set - recommended to prevent rebalancing during upgrades.
INFO: checking Ceph config..

= CHECKING CONFIGURED STORAGES =

PASS: storage 'backup2' enabled and active.
PASS: storage 'local' enabled and active.
PASS: storage 'local-lvm' enabled and active.
PASS: storage 'uwcs-ceph-lxc' enabled and active.
PASS: storage 'uwcs-ceph-vm' enabled and active.
INFO: Checking storage content type configuration..
PASS: no storage content problems found
PASS: no storage re-uses a directory for multiple content types.

= MISCELLANEOUS CHECKS =

INFO: Checking common daemon services..
PASS: systemd unit 'pveproxy.service' is in state 'active'
PASS: systemd unit 'pvedaemon.service' is in state 'active'
PASS: systemd unit 'pvescheduler.service' is in state 'active'
PASS: systemd unit 'pvestatd.service' is in state 'active'
INFO: Checking for supported & active NTP service..
PASS: Detected active time synchronisation unit 'chrony.service'
INFO: Checking for running guests..
PASS: no running guest detected.
INFO: Checking if the local node's hostname 'vmserver1' is resolvable..
INFO: Checking if resolved IP is configured on local node..
PASS: Resolved node IP '137.207.82.55' configured and active on single interface.
INFO: Check node certificate's RSA key size
PASS: Certificate 'pve-root-ca.pem' passed Debian Busters (and newer) security level for TLS connections (4096 >= 2048)
PASS: Certificate 'pve-ssl.pem' passed Debian Busters (and newer) security level for TLS connections (2048 >= 2048)
INFO: Checking backup retention settings..
PASS: no backup retention problems found.
INFO: checking CIFS credential location..
PASS: no CIFS credentials at outdated location found.
INFO: Checking permission system changes..
INFO: Checking custom role IDs for clashes with new 'PVE' namespace..
PASS: no custom roles defined, so no clash with 'PVE' role ID namespace enforced in Proxmox VE 8
INFO: Checking if LXCFS is running with FUSE3 library, if already upgraded..
PASS: systems seems to be upgraded and LXCFS is running with FUSE 3 library
INFO: Checking node and guest description/note length..
PASS: All node config descriptions fit in the new limit of 64 KiB
PASS: All guest config descriptions fit in the new limit of 8 KiB
INFO: Checking container configs for deprecated lxc.cgroup entries
PASS: No legacy 'lxc.cgroup' keys found.
INFO: Checking if the suite for the Debian security repository is correct..
PASS: found no suite mismatch
INFO: Checking for existence of NVIDIA vGPU Manager..
PASS: No NVIDIA vGPU Service found.
INFO: Checking bootloader configuration...
SKIP: System booted in legacy-mode - no need for additional packages
INFO: Check for dkms modules...
/sbin/dkms: line 2497: echo: write error: Broken pipe
WARN: dkms modules found, this might cause issues during upgrade.
/dev/rbd0
/dev/rbd1
/dev/rbd2
WARN: Found at least one CT (76156) which does not support running in a unified cgroup v2 layout
    Consider upgrading the Containers distro or set systemd.unified_cgroup_hierarchy=0 in the Proxmox VE hosts' kernel cmdline! Skipping further CT compat checks.

= SUMMARY =

TOTAL:    43
PASSED:   37
SKIPPED:  2
WARNINGS: 4
FAILURES: 0

ATTENTION: Please check the output for detailed information!
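
Since the full checker output is long, a small filter can show only the lines that need attention. This is a sketch under the assumption that problem lines carry a "WARN: " or "FAIL: " label, as in the output above; pve7to8_problems is a hypothetical name, and the pattern is deliberately not anchored to the start of the line in case the tool emits color codes before the label.

```shell
#!/bin/sh
# Hypothetical helper: read pve7to8 output on stdin and keep only the
# lines labelled WARN or FAIL. Indented continuation lines that follow a
# warning are not captured, and the summary counters ("WARNINGS: 4",
# "FAILURES: 0") do not match the pattern.
pve7to8_problems() {
    grep -E '(WARN|FAIL): '
}

# Example use on the node:
#   pve7to8 --full | pve7to8_problems
```

In my case this would have put the "dkms modules found" warning right in front of me before the reboot.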

Code:
# proxmox-boot-tool kernel list
Manually selected kernels:
None.

Automatically selected kernels:
5.15.143-1-pve
5.15.149-1-pve
6.5.13-5-pve

Code:
# apt list pve-kernel-* --installed
Listing... Done
pve-kernel-5.15.143-1-pve/now 5.15.143-1 amd64 [installed,local]
pve-kernel-5.15.149-1-pve/now 5.15.149-1 amd64 [installed,local]
pve-kernel-5.15/now 7.4-12 all [installed,local]

I have a Mellanox card:

Code:
04:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
        Subsystem: Mellanox Technologies Mellanox Technologies ConnectX-4 Lx Stand-up single-port 40GbE MCX4131A-BCAT
        Flags: bus master, fast devsel, latency 0, IRQ 41, NUMA node 0
        Memory at fa000000 (64-bit, prefetchable) [size=32M]
        Expansion ROM at feb00000 [disabled] [size=1M]
        Capabilities: [60] Express Endpoint, MSI 00
        Capabilities: [48] Vital Product Data
        Capabilities: [9c] MSI-X: Enable+ Count=64 Masked-
        Capabilities: [c0] Vendor Specific Information: Len=18 <?>
        Capabilities: [40] Power Management version 3
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [1c0] Secondary PCI Express
        Kernel driver in use: mlx5_core
        Kernel modules: mlx5_core

After reading several articles about this, I ended up with one command to try, apt purge kernel-mft-dkms, but I'm worried it might break something else.

Code:
# dkms status
kernel-mft-dkms/4.12.0, 5.15.143-1-pve, x86_64: installed
kernel-mft-dkms/4.12.0, 5.15.149-1-pve, x86_64: installed
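
Note that the status above lists the module as installed for both 5.15 kernels but has no entry for 6.5.13-5-pve. A rough sketch for spotting that automatically follows; it assumes the "module/version, kernel, arch: state" line format shown above, and dkms_missing_for is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `dkms status` output on stdin and print every
# module/version that has no "installed" entry for the kernel given as $1.
# Assumes lines of the form "module/version, kernel, arch: state".
dkms_missing_for() {
    awk -v k="$1" -F'[,:] *' '
        NF == 0 { next }                      # skip empty lines
        { mods[$1] = 1 }                      # remember every module seen
        $2 == k && $4 ~ /^installed/ { ok[$1] = 1 }
        END { for (m in mods) if (!(m in ok)) print m }
    '
}

# Example use on the node:
#   dkms status | dkms_missing_for 6.5.13-5-pve
```

Any module it prints is one that DKMS would have to build for that kernel, which is exactly the step that failed here.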

I appreciate any help. Thank you!
 
I decided to run the command apt purge kernel-mft-dkms, and the package configuration then completed. All is good now.
I tested the Mellanox connections and they are working fine.
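
For anyone in the same situation, here is a rough sketch of the sequence as a script. It is only an outline of what worked on my node: the run wrapper, the APPLY variable, and the function name are made up for illustration, and the commands after the purge are just checks I would repeat (for me the purge alone already completed the configuration). By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the recovery sequence. Nothing is executed unless APPLY=1 is
# set; otherwise each command is only printed. Run as root when applying.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

recover_after_dkms_failure() {
    run apt purge kernel-mft-dkms        # remove the DKMS module that fails to build
    run dpkg --configure -a              # finish configuring any pending packages
    run dkms status                      # check which DKMS modules remain
    run proxmox-boot-tool kernel list    # check that the 6.5 kernel is listed
}

# Dry run (prints the commands):   recover_after_dkms_failure
# Real run (as root):              APPLY=1 recover_after_dkms_failure
```

Purging kernel-mft-dkms removes only that DKMS module; on my node the card kept working with the in-kernel mlx5_core driver shown in the lspci output.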

[Attachment: 1713828073599.png]

I hope this will help someone else :)