Opt-in Linux 6.17 Kernel for Proxmox VE 9 available on test & no-subscription

The latest kernel update is very unstable on my machine(s) (see attached journalctl.log). The older kernels 6.14.11-4 and 6.14.8-2 don't have this issue.

Please open a new thread for this and post more details about your HW, e.g. the server vendor and model.

This might be related to your CPU not implementing VT-d correctly, with the newer kernel exposing some feature that was previously unused, or whose faulty behavior simply went undetected before. If you do not rely on PCI passthrough, you can try disabling the IOMMU, e.g. by adding intel_iommu=off as a kernel parameter.
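A minimal sketch of how that parameter could be added on a GRUB-booted host, assuming the usual `GRUB_CMDLINE_LINUX_DEFAULT="..."` line in /etc/default/grub. The helper name `add_grub_param` is hypothetical and only there for illustration; try it on a copy of the file first.

```shell
#!/bin/sh
# Hypothetical helper: append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT
# in the given grub defaults file, unless it is already present.
add_grub_param() {
    file="$1"
    param="$2"
    # Nothing to do if the parameter is already on the command line
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
        return 0
    fi
    # Append the parameter inside the existing double quotes (GNU sed)
    sed -i "s|^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"|GRUB_CMDLINE_LINUX_DEFAULT=\"\1 $param\"|" "$file"
}

# Usage on the host (as root):
#   add_grub_param /etc/default/grub intel_iommu=off
#   update-grub
#   reboot
```

On hosts that boot via systemd-boot, the parameter instead goes at the end of the single line in /etc/kernel/cmdline, followed by `proxmox-boot-tool refresh`. After the reboot, `cat /proc/cmdline` shows whether the parameter is active.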
 
Today I upgraded 2 nodes of my 3-node cluster to 9.1 with the 6.17 kernel, and both ended up in a boot loop when booting 6.17.
  • It boots into GRUB, and after selecting the 6.17 kernel the machine just reboots.
  • For some reason `systemd-boot-efi` seems to be installed, but it is also installed on my last remaining 6.14 node.
  • Both servers are identical Dell R630s (see fastfetch below).
  • They boot in BIOS mode.
  • Systemd-boot is NOT installed.
  • Using ZFS AND Ceph.
  • All 3 nodes were previously upgraded from Proxmox 8, if that is relevant.
  • I would rule out any BIOS/EFI issues, since it boots into GRUB.
  • Once I select the 6.17 kernel in GRUB using the iDRAC remote viewer, I get a black screen with an "_" at character position (0,0), and then it reboots. I struggle to give any more clues.

I therefore booted the 6.14 kernel again:
root@triton:~# fastfetch
root@triton
-----------
OS: Proxmox VE 9.1.1 x86_64
Host: PowerEdge R630
Kernel: Linux 6.14.11-4-pve
Uptime: 22 mins
Packages: 890 (dpkg)
Shell: bash 5.2.37
Display (VGA-1): 1024x768 @ 60 Hz
Terminal: /dev/pts/0
CPU: Intel(R) Xeon(R) E5-2643 v3 (12) @ 3.70 GHz
GPU: Matrox Electronics Systems Ltd. G200eR2
Memory: 3.86 GiB / 31.25 GiB (12%)
Swap: 0 B / 8.00 GiB (0%)
Disk (/): 96.65 GiB / 641.23 GiB (15%) - zfs
Disk (/rpool): 128.00 KiB / 544.58 GiB (0%) - zfs
Local IP (vmbr0): 192.168.180.100/24
Locale: en_US.UTF-8
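To keep a node on the known-good 6.14 kernel across reboots until 6.17 is sorted out, the kernel can be pinned. The following is only a sketch: the helper `newest_in_series` is a hypothetical name, and it assumes the boot setup is one that `proxmox-boot-tool kernel pin` can manage.

```shell
#!/bin/sh
# Hypothetical helper: read kernel version strings on stdin (one per line)
# and print the newest one belonging to the given series, e.g. "6.14".
newest_in_series() {
    grep "^$1\." | sort -V | tail -n 1
}

# Usage on a Proxmox VE host (as root). /lib/modules holds one directory
# per installed kernel version:
#   good="$(ls /lib/modules | newest_in_series 6.14)"
#   proxmox-boot-tool kernel list          # show what is available
#   proxmox-boot-tool kernel pin "$good"   # always boot this kernel
#   proxmox-boot-tool kernel unpin         # later: back to the default
```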
 
I'm also running Podman in LXC and ran into this issue, so I can't use 6.17. It's good that there's a fix, but you haven't reported this anywhere but here yet, @jaminmc?
Yes!!! I am not crazy, and I am not the only one this has happened to! I have an LXC container that I use to compile the kernel with my patch in it. Here is how to do it: create a Debian 13 container, then paste these steps into it.
Bash:
# 1. Update base system
apt update && apt upgrade -y

# 2. Add Proxmox repo and key
wget -q https://enterprise.proxmox.com/debian/proxmox-release-trixie.gpg \
    -O /etc/apt/trusted.gpg.d/proxmox-release-trixie.gpg

cat > /etc/apt/sources.list.d/pve-src.sources <<EOF
Types: deb
URIs: http://download.proxmox.com/debian/pve
Suites: trixie
Components: pve-no-subscription
Signed-By: /etc/apt/trusted.gpg.d/proxmox-release-trixie.gpg
EOF

# 3. Append Debian deb-src if missing
if ! grep -q "deb-src" /etc/apt/sources.list.d/debian.sources 2>/dev/null; then
    cat >> /etc/apt/sources.list.d/debian.sources <<EOF

Types: deb-src
URIs: http://deb.debian.org/debian
Suites: trixie trixie-updates
Components: main contrib non-free non-free-firmware
Signed-By: /usr/share/keyrings/debian-archive-keyring.gpg

Types: deb-src
URIs: http://security.debian.org/debian-security
Suites: trixie-security
Components: main contrib non-free non-free-firmware
Signed-By: /usr/share/keyrings/debian-archive-keyring.gpg
EOF
fi

# 4. Update
apt update

# 5. Install build tools
apt install -y build-essential git git-email debhelper devscripts fakeroot \
    libncurses-dev bison flex libssl-dev libelf-dev bc cpio kmod pahole dwarves \
    rsync python3 python3-pip pve-doc-generator python-is-python3 dh-python \
    sphinx-common quilt libtraceevent-dev libunwind-dev libzstd-dev pkg-config equivs

# 6. Clone and prepare repo
git clone https://git.proxmox.com/git/pve-kernel.git
cd pve-kernel
git checkout master  # Latest kernel + patches

# 7. Prep and deps
make distclean
make build-dir-fresh

# Run in a subshell so the working directory is restored, and only if the cd succeeds
( cd proxmox-kernel-*/ && mk-build-deps -i -r -t "apt-get -o Debug::pkgProblemResolver=yes --no-install-recommends -y" debian/control )

# 8. Add patch
# Overwrite rather than append, so re-running this step does not duplicate the hunk
cat > patches/kernel/0014-apparmor-fix-NULL-pointer-dereference-in-aa_file.patch <<'EOF'
diff --git a/security/apparmor/file.c b/security/apparmor/file.c
--- a/security/apparmor/file.c
+++ b/security/apparmor/file.c
@@ -777,6 +777,9 @@ static bool __unix_needs_revalidation(struct file *file, struct aa_label *label
         return false;
     if (request & NET_PEER_MASK)
         return false;
+    /* sock and sock->sk can be NULL for sockets being set up or torn down */
+    if (!sock || !sock->sk)
+        return false;
     if (sock->sk->sk_family == PF_UNIX) {
         struct aa_sk_ctx *ctx = aa_sock(sock->sk);
EOF
make build-dir-fresh

# 9. Build
make

echo "=== BUILD COMPLETE ==="
ls -lh *.deb 


# For later updates: reset the tree and pull the latest changes,
# then repeat steps 8 and 9.
git reset --hard HEAD
git clean -df
git pull
git submodule update --init --recursive
Once everything is built, and since the container lives on ZFS, I just run this on the Proxmox host to replace the kernel with the one I compiled:

apt --reinstall install /rpool/data/subvol-114-disk-0/root/pve-kernel/proxmox-{kernel,headers}-6.17.2-1-pve_6.17.2-1*.deb

Replace 114 with your container ID. That reinstalls the kernel with the patched build. Alternatively, scp the resulting .deb files to your Proxmox servers and install them from there.
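For the scp route, something along these lines could work. It is a sketch only: the build host name, remote directory and staging directory are hypothetical example values, and by default the script just prints the commands instead of running them.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# install_patched_kernel BUILD_HOST REMOTE_DIR VERSION
# Copies the kernel and headers packages from the build container and
# reinstalls them locally. All three arguments are example values.
install_patched_kernel() {
    build_host="$1"
    remote_dir="$2"
    ver="$3"
    dest="/var/tmp/pve-kernel-debs"   # hypothetical staging directory
    run mkdir -p "$dest"
    # The remote globs are quoted so they are expanded on the build host
    run scp "$build_host:$remote_dir/proxmox-kernel-${ver}_*.deb" \
            "$build_host:$remote_dir/proxmox-headers-${ver}_*.deb" "$dest/"
    run apt --reinstall install "$dest"/proxmox-kernel-"$ver"_*.deb \
                                "$dest"/proxmox-headers-"$ver"_*.deb
}

# Usage on each Proxmox node (as root), once the printed commands look right:
#   DRY_RUN=0 install_patched_kernel root@buildct /root/pve-kernel 6.17.2-1-pve
```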

Check https://git.proxmox.com/?p=pve-kernel.git;a=summary for kernel updates
 