Opt-in Linux 7.0 Kernel for Proxmox VE 9 available

Blackclaws · May 27, 2026

bossanova808 said:
I'm a relative novice with this stuff, so please excuse this noise if I am wrong, but that ^ smells a lot like: https://forum.proxmox.com/threads/a...-with-kernel-7-and-multiple-nic-types.183574/

Could be related, but Tailscale is specifically UDP and not TCP, so it would be a similar offloading problem on a different datapath. It could just be that the whole offloading feature set is somewhat broken with some NIC combinations.

tomten · May 27, 2026

Blackclaws said:
Leaving a link to this here as well so others looking for problems can see it: https://bugzilla.proxmox.com/show_bug.cgi?id=7627

TCP checksum offloading for virtio is broken for at least some NIC types on the latest kernel + QEMU, for newer Linux guests and Windows guests.

I've seen similar issues, but with a Realtek NIC. I just figured it was Realtek being Realtek and turned off checksum offloading.

Blackclaws · May 27, 2026

tomten said:
I've seen similar issues, but with a Realtek NIC. I just figured it was Realtek being Realtek and turned off checksum offloading.

See the bug: I can reproduce this on Broadcom and Intel NICs as well, so I think it's a wider problem. The problem appears when newer Linux kernel features in 6.16+ interact with QEMU versions 11.0+, which introduce new offloading scenarios. I don't know why these break, though, or whether the problem is in QEMU or in the host kernel.
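As a stopgap while the bug is open, checksum offload can be switched off with ethtool, as tomten describes above. A minimal sketch, assuming `ethtool` is installed; the function name, the `DRY_RUN` switch and the interface name `eth0` are placeholders for illustration, not something taken from the bug report:

```shell
# Hypothetical helper: turn off TX/RX checksum offload on one interface.
# With DRY_RUN=1 the ethtool command is only printed, not executed.
disable_csum_offload() {
    iface="$1"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "ethtool -K $iface tx off rx off"
    else
        ethtool -K "$iface" tx off rx off
    fi
}

# Example (prints the command without touching the NIC):
#   DRY_RUN=1 disable_csum_offload eth0
```

Settings applied with `ethtool -K` do not survive a reboot, and whether the right place is the host NIC, the bridge or the guest's virtio interface depends on the setup, so treat this as a test aid rather than a fix.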

chrcoluk · May 27, 2026

Does choosing a lower machine version prevent these issues? 11.0 was only released in late April; it seems a bit too new.

Pcom · May 27, 2026

fiona said:
Hi,

with QEMU 10.2, there was a switch to using io_uring for the IO thread event loops and the IO pressure/wait accounting is set via the io_uring subsystem now. It's a different kernel subsystem from before, so it's not unexpected if it's different.

So, if I understood correctly, this is just a different way the graph is calculated after the update, and it does not necessarily mean that performance is affected, right?

p-v-a · May 28, 2026

See here: https://bugzilla.kernel.org/show_bug.cgi?id=220693. In my case I had to go back to 6.14 for the FPDMA errors to go away.

fiona · May 28, 2026

Pcom said:
So, if I understood correctly, this is just a different way the graph is calculated after the update, and it does not necessarily mean that performance is affected, right?

Yes. To be precise: a different way the IO wait metric is calculated.
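To look at the raw numbers rather than the graph, the kernel's pressure stall information can be read directly. A minimal sketch, assuming the graph in question is fed from `/proc/pressure/io` (that link is an assumption here, and the function name is made up for illustration):

```shell
# Print the 10-second average of the "some" IO pressure line, i.e. the
# share of time in which at least one task was stalled waiting on IO.
# The file argument defaults to the real PSI file.
io_pressure_avg10() {
    file="${1:-/proc/pressure/io}"
    awk '$1 == "some" { sub(/^avg10=/, "", $2); print $2 }' "$file"
}

# Example:
#   io_pressure_avg10
```

Comparing this value before and after the QEMU update, alongside actual guest throughput, is one way to check that only the accounting changed.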

kromberg · May 29, 2026

I am trying to install the NVidia host grid drivers on 7.0.2-7-pve and I am getting this error:

fatal error: os-interface.h: No such file or directory

I have these installed:

proxmox-headers-6.17.13-12-pve
proxmox-headers-7.0.2-7-pve

What am I missing?

daanw · May 29, 2026

kromberg said:
I am trying to install the NVidia host grid drivers on 7.0.2-7-pve and I am getting this error:

fatal error: os-interface.h: No such file or directory

I have these installed:

proxmox-headers-6.17.13-12-pve
proxmox-headers-7.0.2-7-pve

What am I missing?

Which driver version? This could indicate incompatibility between driver and kernel version.

kromberg · May 29, 2026

550.144.02 looks like the latest version that the P100 and V100 support.

daanw · May 29, 2026

kromberg said:
550.144.02 looks like the latest version that the P100 and V100 support.

Linux 6.15 and newer no longer support the EXTRA_CFLAGS variable in out-of-tree module Kbuild files, which 550.144.02 relies on.

You can try to manually replace EXTRA_CFLAGS with ccflags-y in the driver's Kbuild files before installation, but no guarantee:
https://gist.github.com/mrdemonbkgit/9ba6962002829121486a0c6f4e8d4ac3
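The replacement itself can be scripted. A rough sketch, assuming the driver has already been extracted to a directory and that its Kbuild files are named `Kbuild` or `*.Kbuild`; the function name and the directory layout are assumptions, and this is not the content of the linked gist:

```shell
# Replace EXTRA_CFLAGS with ccflags-y in every Kbuild file below a directory.
# Requires GNU grep, xargs and sed.
patch_kbuild_flags() {
    dir="$1"
    grep -rlZ --include='Kbuild' --include='*.Kbuild' 'EXTRA_CFLAGS' "$dir" \
        | xargs -0 -r sed -i 's/EXTRA_CFLAGS/ccflags-y/g'
}

# Example (the path is a placeholder for the extracted driver directory):
#   patch_kbuild_flags ./extracted-driver/kernel
```

As noted above, there is no guarantee the driver builds afterwards; other kernel API changes between 6.15 and 7.0 may still break it.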

kromberg · May 30, 2026

Thanks for that. It helped some, but it's still not working. Looks like I will need to reinstall and configure Proxmox back to version 8.

BarryS83 · Jun 1, 2026

BarryS83 said:
Is this the correct way to set it to force TSC?

nano /etc/default/grub

and then change the line to: GRUB_CMDLINE_LINUX_DEFAULT="quiet clocksource=tsc tsc=reliable"

Is there any risk to set this? Do I risk the host not booting at all?

Forcing my PVE host to use TSC instead of HPET did resolve the issue with the idle power usage. Host has been running now for 58 hours without any issues.

What could be the reason the newer kernels no longer accept the TSC on certain AMD CPUs? Are there certain UEFI settings we need to set in order for the kernel to properly accept TSC by itself? Previous kernels did not have this issue and I did not need to force TSC; no firmware settings have been changed in the meantime.

My CPU is an AMD Ryzen 5 5560U.

Steps I followed to force the kernel to use TSC:

1) To check whether your host is using TSC or HPET, run: cat /sys/devices/system/clocksource/clocksource0/current_clocksource
2) Run: nano /etc/default/grub
3) Change the line GRUB_CMDLINE_LINUX_DEFAULT="quiet" to: GRUB_CMDLINE_LINUX_DEFAULT="quiet clocksource=tsc tsc=reliable"
4) Press Ctrl+O to save, then Ctrl+X to close the GRUB file.
5) Run: update-grub
6) Run: update-initramfs -u -k all
7) Reboot the host.
8) To confirm the host is now using TSC, run: cat /sys/devices/system/clocksource/clocksource0/current_clocksource
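The edit to the GRUB file in the steps above can also be done without an editor. A minimal sketch, assuming a GRUB-booted host with a single GRUB_CMDLINE_LINUX_DEFAULT line; the function name is made up for illustration, and the original file is kept as a `.bak` copy:

```shell
# Append "clocksource=tsc tsc=reliable" to GRUB_CMDLINE_LINUX_DEFAULT in the
# given file (default /etc/default/grub). Does nothing if a clocksource=
# parameter is already present on that line.
force_tsc_in_grub() {
    file="${1:-/etc/default/grub}"
    if grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=.*clocksource=' "$file"; then
        return 0
    fi
    sed -i.bak \
        's/^\(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*\)"/\1 clocksource=tsc tsc=reliable"/' \
        "$file"
}

# Example, followed by the same commands as in the manual steps:
#   force_tsc_in_grub && update-grub
```

On hosts that boot via systemd-boot rather than GRUB, the Proxmox documentation places the kernel command line in /etc/kernel/cmdline instead, applied with `proxmox-boot-tool refresh`, so this sketch would not apply there.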

edcoppen · Jun 1, 2026

Kernel 7.0.x regression: ACS violation on PCIe port 80:1b.4 causes VM freeze — Arrow Lake + Thunderbolt 5 + Intel Arc B50 passthrough

Hi all, posting to report a clear regression on kernel 7.0.x affecting my Windows 11 VM with PCIe passthrough. Works perfectly on 6.17.13-13-pve, broken on all 7.0.x kernels tested.

Hardware

CPU: Intel Arrow Lake-S
Chipset: Intel 800 Series PCH
GPU (passthrough): Intel Arc Pro B50 (Battlemage G21, 04:00.0)
Thunderbolt: Intel JHL9580 Thunderbolt 5 Barlow Ridge (84:00.0 / 97:00.0)
USB controller (passthrough): ASMedia ASM3242 USB 3.2 (06:00.0)
NVMe (passthrough): Samsung S4LV008 Pascal (81:00.0)

Proxmox version: 9.2.3

Kernel boot parameters: intel_iommu=on iommu=pt split_lock_detect=off

Symptom
The Windows 11 VM freezes shortly after the Arc GPU driver initialises on boot. Display output is lost, CPU pegs at 100%, and the VM becomes completely unresponsive. The host immediately enters a continuous AER error loop on PCIe root port 0000:80:1b.4 which persists even after force-stopping the VM and requires a full host reboot to clear. Keyboard and mouse (connected via the passed-through ASMedia USB controller) also stop responding at the point of freeze.

Root cause
The AER loop is triggered by an ACS violation on PCIe root port 80:1b.4 (device 8086:7f44), which is the upstream port for the Thunderbolt 5 subsystem. The violation appears to be triggered by DMA activity when the Arc GPU driver initialises, crossing an ACS boundary between sibling root ports 80:1b.0 and 80:1b.4.

dmesg (kernel 7.0.6-2-pve)

Code:

[  436.209061] pcieport 0000:80:1b.4: AER: Correctable error message received from 0000:80:1b.4
[  436.209134] pcieport 0000:80:1b.4:   device [8086:7f44] error status/mask=00300000/00000000
[  436.209138] pcieport 0000:80:1b.4:    [20] UnsupReq
[  436.209140] pcieport 0000:80:1b.4:    [21] ACSViol (First)
[  437.238805] thunderbolt 0000:84:00.0: AER: can't recover (no error_detected callback)
[  437.238815] xhci_hcd 0000:97:00.0: AER: can't recover (no error_detected callback)
[  437.238832] pcieport 0000:80:1b.4: AER: device recovery failed
... (repeats continuously until host reboot)

Kernels tested

6.17.13-13-pve — VM boots and runs normally
7.0.2-6-pve — freeze, ACS violation loop
7.0.2-7-pve — freeze, ACS violation loop
7.0.6-2-pve — freeze, ACS violation loop

Workaround
Pinned to 6.17.13-13-pve which resolves the issue completely.

Happy to provide any additional diagnostic output if helpful.
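For anyone wanting to compare how devices are grouped on the working and the broken kernel, the IOMMU groups can be listed from sysfs. A minimal sketch; the function name is hypothetical, and the output is only a starting point next to `lspci` and `dmesg`:

```shell
# Print one "group <n>: <pci address>" line per device, based on the
# directory layout under /sys/kernel/iommu_groups (or a given root).
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue          # no groups: the glob did not match
        group="${dev%/devices/*}"          # strip "/devices/<address>"
        group="${group##*/}"               # keep only the group number
        echo "group $group: ${dev##*/}"
    done
}

# Example:
#   list_iommu_groups | grep -e '80:1b' -e '04:00'
```

Saving the output on 6.17.13-13-pve and on a 7.0.x kernel and diffing the two would show whether the root ports 80:1b.0 and 80:1b.4 ended up grouped differently.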

fabian · Jun 2, 2026

could you open a new thread and post the data you posted here + "lspci -v"?

romek92 · Jun 12, 2026

hardwareadictos said:
I can confirm similar behavior on a 4-node cluster running Proxmox VE 9.2.2.

Cluster hardware:

2x Intel Xeon E3-1220L v2

1x AMD EPYC 7551P

1x AMD EPYC 3251

Only the EPYC 3251 node is affected.

Symptoms:

Progressive performance degradation after ~2 days uptime on kernel 7.0.2-6-pve

CPU usage gradually rises until the host reaches nearly 100% system CPU usage

High load average with almost no IO wait

All KVM guests are affected equally

Host becomes nearly unusable

Important observations:

Current clocksource is already tsc

read_hpet usage is minimal (~1%)

RAM, swap and IO usage remain normal

The issue appears related to virtualization syscalls / context switching / scheduler activity

powertop shows very high tick_nohz_handler, sched(softirq) and APIC timer activity

dbs_work_handler activity is also unusually high

Additional notes:

The EPYC 7551P node running the same Proxmox/kernel version does NOT show the issue

Changing CPU governor from ondemand to performance did not solve the problem

Issue seems specific to the EPYC 3251 embedded platform

Downgrading back to 6.8.12-15-pve restores normal behavior.

Do you have any new updates or solutions other than reverting to the previous kernel version?

tjh · Jun 14, 2026

EDIT: Please ignore my previous message if you saw it. It was my fault; I was running this

Code:

echo scan-time > /sys/kernel/mm/ksm/advisor_mode

which was the cause
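To check whether a host has the same setting active, the current KSM advisor mode can be read back. A small sketch, assuming the file lists all modes with the active one in square brackets (for example `none [scan-time]`); the function name is made up:

```shell
# Print the active KSM advisor mode, i.e. the entry shown in brackets.
ksm_advisor_mode() {
    file="${1:-/sys/kernel/mm/ksm/advisor_mode}"
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$file"
}

# Example:
#   ksm_advisor_mode
```

Writing `none` to the same file switches the advisor off again.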

KevinK · Jun 23, 2026

Hit a hard regression on 7.0.12-1-pve on a Ryzen 9 5950X box - after a routine reboot, no VM or container would start. `qm start` returned without error but guests stayed `stopped`, and load sat at ~8 with nothing actually running.

Root cause is a WARNING storm in the execmem cache during module loading. The first hit is on the NFS/sunrpc modules at boot, then it repeats on the firewall modules (nf_tables/ip_set/iptable_filter), leaving each modprobe stuck in uninterruptible D-state. That cascades into:

- pve-firewall: can't lock file '/run/lock/pvefw.lck' - got timeout
- ha-manager status: lrm <node> (old timestamp - dead?), all HA services in `freeze`
- net result: nothing will start

The WARNING fired 85,535 times on that single boot. The kernel was already tainted D (DIE) / W (WARN).

Code:

CPA: called for zero pte. vaddr = ffffffffc1e4c000 cpa->vaddr = ffffffffc1e4c000
WARNING: arch/x86/mm/pat/set_memory.c:1821 at __cpa_process_fault+0x6a4/0x6f0, CPU#16: modprobe/1249
CPU: 16 PID: 1249 Comm: modprobe Tainted: P      D    O        7.0.12-1-pve #1
Hardware name: ASUS ROG CROSSHAIR VIII FORMULA, BIOS 5002 01/13/2025
RIP: 0010:__cpa_process_fault+0x6ae/0x6f0
Call Trace:
 __change_page_attr_set_clr+0xaca/0x1000
 change_page_attr_set_clr+0x106/0x1b0
 set_memory_nx+0x4e/0x70
 execmem_alloc_rw+0x31/0x70
 load_module+0x7a1/0x2150
 init_module_from_file+0xfd/0x160
 idempotent_init_module+0x110/0x300
 __x64_sys_finit_module+0x73/0xf0
 do_syscall_64+0x10b/0x14e0

Environment
- proxmox-ve: 9.2.0, pve-manager: 9.2.3, kernel 7.0.12-1-pve
- pve-firewall: 6.0.4, pve-ha-manager: 5.2.4, qemu-server: 9.1.17, zfs 2.4.2-pve1
- AMD Ryzen 9 5950X, ASUS ROG Crosshair VIII Formula, BIOS 5002
- Repo: pve-no-subscription

Workaround
Pinned a 6.17 kernel and rebooted. A clean `systemctl reboot` hangs on the D-state tasks, so I used SysRq (s, u, b) - all guests were already down, so it was safe:

Code:

proxmox-boot-tool kernel pin 6.17.13-13-pve

6.17.13-13-pve boots clean and every guest starts normally.

The trace lines up with the known upstream execmem cache rework in x86/module (WARNINGs in arch/x86/mm/pat/set_memory.c). Is 7.0.12-1-pve expected to carry an execmem fix, or should it be held back for now? Happy to file on Bugzilla and attach the full `journalctl -b` if useful.
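For others checking whether they are hitting the same thing, stuck tasks in D-state can be listed from `ps` output. A minimal sketch; the function name is hypothetical, and it expects the header line that `ps -eo pid,stat,comm` prints:

```shell
# Read `ps -eo pid,stat,comm` output on stdin and print the PID and command
# of every task in uninterruptible sleep (state starting with "D").
d_state_tasks() {
    awk 'NR > 1 && $2 ~ /^D/ { print $1, $3 }'
}

# Example:
#   ps -eo pid,stat,comm | d_state_tasks
# Counting the warning in the current boot's kernel log:
#   journalctl -k -b | grep -c 'CPA: called for zero pte'
```

A long list of `modprobe` entries here, together with a non-zero warning count, would match the symptoms described above.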

fabian · Jun 23, 2026

KevinK said:
Hit a hard regression on 7.0.12-1-pve on a Ryzen 9 5950X box - after a routine reboot, no VM or container would start. `qm start` returned without error but guests stayed `stopped`, and load sat at ~8 with nothing actually running.

Root cause is a WARNING storm in the execmem cache during module loading. The first hit is on the NFS/sunrpc modules at boot, then it repeats on the firewall modules (nf_tables/ip_set/iptable_filter), leaving each modprobe stuck in uninterruptible D-state. That cascades into:

- pve-firewall: can't lock file '/run/lock/pvefw.lck' - got timeout
- ha-manager status: lrm <node> (old timestamp - dead?), all HA services in `freeze`
- net result: nothing will start

The WARNING fired 85,535 times on that single boot. The kernel was already tainted D (DIE) / W (WARN).

Code:

CPA: called for zero pte. vaddr = ffffffffc1e4c000 cpa->vaddr = ffffffffc1e4c000
WARNING: arch/x86/mm/pat/set_memory.c:1821 at __cpa_process_fault+0x6a4/0x6f0, CPU#16: modprobe/1249
CPU: 16 PID: 1249 Comm: modprobe Tainted: P      D    O        7.0.12-1-pve #1
Hardware name: ASUS ROG CROSSHAIR VIII FORMULA, BIOS 5002 01/13/2025
RIP: 0010:__cpa_process_fault+0x6ae/0x6f0
Call Trace:
 __change_page_attr_set_clr+0xaca/0x1000
 change_page_attr_set_clr+0x106/0x1b0
 set_memory_nx+0x4e/0x70
 execmem_alloc_rw+0x31/0x70
 load_module+0x7a1/0x2150
 init_module_from_file+0xfd/0x160
 idempotent_init_module+0x110/0x300
 __x64_sys_finit_module+0x73/0xf0
 do_syscall_64+0x10b/0x14e0

Environment
- proxmox-ve: 9.2.0, pve-manager: 9.2.3, kernel 7.0.12-1-pve
- pve-firewall: 6.0.4, pve-ha-manager: 5.2.4, qemu-server: 9.1.17, zfs 2.4.2-pve1
- AMD Ryzen 9 5950X, ASUS ROG Crosshair VIII Formula, BIOS 5002
- Repo: pve-no-subscription

Workaround
Pinned a 6.17 kernel and rebooted. A clean `systemctl reboot` hangs on the D-state tasks, so I used SysRq (s, u, b) - all guests were already down, so it was safe:

Code:

proxmox-boot-tool kernel pin 6.17.13-13-pve

6.17.13-13-pve boots clean and every guest starts normally.

The trace lines up with the known upstream execmem cache rework in x86/module (WARNINGs in arch/x86/mm/pat/set_memory.c). Is 7.0.12-1-pve expected to carry an execmem fix, or should it be held back for now? Happy to file on Bugzilla and attach the full `journalctl -b` if useful.

please post a full boot journal in a new thread! does this only affect 7.0.12, or also earlier 7.0 kernels?

uzumo · Jul 3, 2026

As I mentioned in the thread I created, I think this is a kernel bug, so I'm posting it here as well.

I think this is a bug in kernel 7.0.14. Will it be fixed?

[SOLVED] Thread 'kernel 7.0.14-2, PCIe pass-through for amdgpu does not work properly.'

Jul 3, 2026

After applying kernel 7.0.14-2, the following error occurs when I start a virtual machine with an RX9070 XT connected via PCIe passthrough.

When this happens, the virtual machine task itself does not complete and hangs.

I have confirmed that the system operates normally when I pin the kernel to version 7.0.12-1.

I would like to resolve this issue. Could you please provide any advice on how to investigate this or suggest a solution?

When a virtual machine with GPU passthrough is running, I display the virtual machine's screen on monitor, and when it is stopped, I display the console...
