Opt-in Linux 7.0 Kernel for Proxmox VE 9 available

OK, I narrowed my problems down to the 16.4 NVIDIA enterprise host driver and/or the kernel option "iommu=pt".
After 6.17 showed the same failures, I retraced my last actions on the server: installing the NVIDIA driver and removing the iommu option from the kernel command line.
Since I uninstalled the driver and added the option back to GRUB's kernel command line, the whole system has been rock stable under 7.0.2-4, without any suspicious syslog messages.

Next I'll investigate whether it was the driver, the option, or both.
Sorry for cluttering the thread...

//Edit:
It was definitely the kernel option.
I installed the driver again with "iommu=pt" set, and everything is fine.
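
For anyone wanting to rule out the same cause, here is a minimal sketch that checks whether an option actually reached the running kernel. `has_cmdline_opt` is a made-up helper name, not a Proxmox tool; it only assumes that /proc/cmdline holds the live command line.

```shell
# Hypothetical helper: succeeds if the given option appears as a whole
# token on the kernel command line. The file defaults to /proc/cmdline.
has_cmdline_opt() {
    opt=$1
    file=${2:-/proc/cmdline}
    # One token per line, then an exact fixed-string whole-line match.
    tr -s '[:space:]' '\n' < "$file" | grep -qxF -- "$opt"
}

if has_cmdline_opt iommu=pt; then
    echo "iommu=pt is active"
else
    echo "iommu=pt is NOT on the running command line"
fi
```

The exact-token match is deliberate: a plain substring grep for "iommu" would also hit other iommu-related options.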
 
In response to my own post about the higher power consumption: I fixed it by installing the latest revision and rebooting a couple of times. No idea what caused it, but one core stayed stuck at max frequency. Now it's back to normal. (7.0.2-4-pve)
 
@Weltherrscher I found an upstream patch that should resolve the issue and pinged the maintainers about it. It's a bit strange that nobody else is running into this issue, but it might depend on the specific workload pattern/guest kernel (the patch was made after the issue was found via fuzzing).
 
Thank you for your reply! =)

Anyway, I keep hitting DMAR errors for the X540 VFs even with iommu=pt on kernel 7.0.2-4:
Code:
May 19 22:44:52 px1 QEMU[3776]: kvm: vfio_container_dma_map(0x595d88370dd0, 0xe10bb000, 0x1000, 0x7834682db000) = -28 (No space left on device)
May 19 22:44:52 px1 kernel: DMAR: DRHD: handling fault status reg 2
May 19 22:44:52 px1 kernel: DMAR: [DMA Read NO_PASID] Request device [04:10.0] fault addr 0xe2162000 [fault reason 0x06] PTE Read access is not set
May 19 22:44:52 px1 kernel: DMAR: DRHD: handling fault status reg 102
May 19 22:44:52 px1 kernel: DMAR: [DMA Read NO_PASID] Request device [04:10.0] fault addr 0xe20bb000 [fault reason 0x06] PTE Read access is not set
May 19 22:44:52 px1 QEMU[3776]: kvm: vfio_container_dma_map(0x595d88370dd0, 0xe10ba000, 0x1000, 0x7834682da000) = -28 (No space left on device)
Just now my iSCSI network broke again, and I had to restart the NAS VM.
I'm now trying to run the VM without a vIOMMU inside (vIOMMU=none; it was previously set to Intel (AMD compatible))...
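
To see which devices the DMAR faults hit and how often, something like the following could help. It is only a sketch: `count_dmar_faults` is a hypothetical helper name, and it assumes the log lines have the same "Request device [bus:dev.fn]" shape as the excerpt above.

```shell
# Hypothetical helper: reads kernel log lines on stdin and prints
# "<count> <device>" for every device named in a DMAR fault line.
count_dmar_faults() {
    # Keep only the device address from lines such as
    # "DMAR: [DMA Read NO_PASID] Request device [04:10.0] fault addr ..."
    sed -n '/DMAR:/s/.*Request device \[\([^]]*\)].*/\1/p' |
        sort | uniq -c | awk '{print $1, $2}'
}
```

On a systemd host it could be fed with `journalctl -k -b --no-pager | count_dmar_faults`, which limits the input to kernel messages from the current boot.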
 
I've been cycling through our pile of Dell R720, R730, R740, and R750 based PVE clusters, and so far no issues to report on the new kernels.
 
I am still stuck with the higher power consumption on kernel 7.0.2-6 and Proxmox 9.2.2.

What do you mean by installing the latest revision? Revision of what? I am still trying to find a solution.
 
Kernel: 7.0.2-6-pve
HW CPU: AMD GX-424CC


I am seeing very high CPU usage after the kernel update.
Most of the overhead shows up as system (sy) time.

perf output:
Code:
+ 41.10% 0.05% [kernel] [k] entry_SYSCALL_64_after_hwframe
+ 41.05% 0.10% [kernel] [k] do_syscall_64
+ 40.59% 0.12% [kernel] [k] x64_sys_call
+ 27.84% 0.58% [kernel] [k] kvm_arch_vcpu_ioctl_run
+ 16.87% 1.50% [kernel] [k] __schedule
+ 15.73% 0.14% [kernel] [k] schedule
+ 12.36% 0.01% [kernel] [k] kvm_vcpu_halt
+ 12.29% 0.05% [kernel] [k] kvm_vcpu_block
+ 8.76% 0.09% [kernel] [k] do_idle
+ 7.49% 0.45% [kernel] [k] try_to_wake_up
+ 7.24% 0.08% [kernel] [k] __kvm_vcpu_kick
+ 7.21% 0.02% [kernel] [k] wake_up_process
+ 7.14% 0.11% [kernel] [k] rcuwait_wake_up
+ 5.68% 0.00% [kernel] [k] call_cpuidle
+ 5.65% 0.22% [kernel] [k] ttwu_do_activate
+ 5.60% 0.09% [kernel] [k] cpuidle_enter_state
+ 5.53% 0.00% libc.so.6 [.] ioctl
+ 5.51% 0.00% [unknown] [k] 0x70632d34365f3638
+ 5.51% 0.00% [unknown] [k] 0x00000000000001b8
+ 5.51% 0.00% [unknown] [k] 0x0000633e38f33140
+ 5.51% 0.00% libglib-2.0.so.0.8400.4 [.] g_free
+ 5.39% 0.00% libc.so.6 [.] 0x000075cf23379b7b
+ 5.36% 0.01% [kernel] [k] asm_exc_page_fault
+ 5.27% 0.00% [kernel] [k] __x64_sys_ioctl
+ 5.27% 0.00% [kernel] [k] kvm_vcpu_ioctl
+ 5.18% 0.03% [kernel] [k] exc_page_fault
+ 4.96% 0.11% [kernel] [k] do_user_addr_fault
+ 4.81% 0.02% [kernel] [k] acpi_idle_enter
+ 4.80% 0.01% [kernel] [k] cpuidle_enter
+ 4.72% 0.65% [kernel] [k] pick_next_task_fair
+ 4.65% 0.07% [kernel] [k] handle_mm_fault
+ 4.64% 0.21% [kernel] [k] enqueue_task
+ 4.56% 0.19% [kernel] [k] __handle_mm_fault
+ 4.55% 0.15% [kernel] [k] dequeue_task
+ 4.50% 0.00% perf [.] 0x000056d1ebedc3f4
+ 4.47% 0.00% perf [.] 0x000056d1ec0538b3
+ 4.36% 0.21% [kernel] [k] dequeue_task_fair
+ 4.14% 0.00% perf [.] 0x000056d1ebedcbe8
+ 4.11% 1.20% [kernel] [k] dequeue_entities
+ 4.09% 1.12% [kernel] [k] pv_native_safe_halt

mitigations=off also does not resolve the issue.

Code:
root@p1 ~ # cat /etc/kernel/cmdline
root=ZFS=rpool/ROOT/pve-1 boot=zfs mitigations=off

root@p1 ~ # cat /proc/cmdline
initrd=\EFI\proxmox\7.0.2-6-pve\initrd.img-7.0.2-6-pve root=ZFS=rpool/ROOT/pve-1 boot=zfs mitigations=off

The issue does not occur on kernel 6.17.
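
To put a number on the sy share when comparing the two kernels, a rough sketch based on /proc/stat is shown below. `sys_time_share` is a hypothetical helper name, and the value is an average since boot rather than a live reading like the one in top, so it is best compared after similar uptime and load.

```shell
# Hypothetical helper: prints the percentage of CPU time spent in system
# (kernel) mode since boot, taken from the aggregate "cpu" line of a
# /proc/stat-style file. The file defaults to /proc/stat.
sys_time_share() {
    LC_ALL=C awk '$1 == "cpu" {
        total = 0
        # Fields 2-9: user nice system idle iowait irq softirq steal.
        # Guest time (fields 10-11) is already counted in user/nice.
        for (i = 2; i <= 9; i++) total += $i
        if (total > 0) printf "%.1f\n", $4 * 100 / total
    }' "${1:-/proc/stat}"
}

# Only run against the live system if /proc/stat is readable.
[ -r /proc/stat ] && sys_time_share
```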
 
This looks like the same issue as Post #97

If cat /sys/devices/system/clocksource/clocksource0/current_clocksource reports hpet rather than tsc, then that is your problem, and getting the system back onto tsc is the fix.
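
A small sketch of that check is below. `check_clocksource` is a hypothetical helper name; the sysfs directory is the one quoted above, and the sibling file available_clocksource lists what the kernel could use instead.

```shell
# Hypothetical helper: prints the active and available clocksources and
# returns 1 if the active one is hpet, 2 if the sysfs file is unreadable.
check_clocksource() {
    dir=${1:-/sys/devices/system/clocksource/clocksource0}
    current=$(cat "$dir/current_clocksource") || return 2
    echo "current: $current"
    echo "available: $(cat "$dir/available_clocksource")"
    if [ "$current" = "hpet" ]; then
        echo "warning: running on hpet instead of tsc"
        return 1
    fi
    return 0
}

check_clocksource || echo "clocksource needs a closer look"
```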
 
No, read_hpet accounts for only about 1% of the usage.
The heavy load here comes from syscalls related to virtualization and context switching.