[SOLVED] Unstable since proxmox-kernel-6.17 update

sharanah

Nov 24, 2025
This is a new thread (as suggested), based on: https://forum.proxmox.com/threads/o...le-on-test-no-subscription.173920/post-819343

I have a Proxmox cluster of 3 machines, including Ceph (19.2.3), without any subscriptions. The issue only occurs on one specific machine (pashka).

All machines are Dell PowerEdge R340 (1U servers). Only the CPU on pashka differs from the other machines. All machines are running 2-6 VMs, without any fancy features (e.g. passthrough).

Code:
pashka:  Model name: Intel(R) Xeon(R) E-2126G CPU @ 3.30GHz
medovik: Model name: Intel(R) Xeon(R) E-2124 CPU @ 3.30GHz
oladyi:  Model name: Intel(R) Xeon(R) E-2124 CPU @ 3.30GHz

pashka:  Linux pashka 6.17.2-1-pve #1 SMP PREEMPT_DYNAMIC PMX 6.17.2-1 (2025-10-21T11:55Z) x86_64 GNU/Linux
medovik: Linux medovik 6.17.2-1-pve #1 SMP PREEMPT_DYNAMIC PMX 6.17.2-1 (2025-10-21T11:55Z) x86_64 GNU/Linux
oladyi:  Linux oladyi 6.17.2-1-pve #1 SMP PREEMPT_DYNAMIC PMX 6.17.2-1 (2025-10-21T11:55Z) x86_64 GNU/Linux


Please check the BIOS settings, especially those related to SR-IOV and I/OAT DMA. Also try disabling intel_iommu via the kernel command line, as @t.lamprecht suggested:

See the known-issues for PVE 9.1 for a bit more information:
https://pve.proxmox.com/wiki/Roadmap#9.1-known-issues

If this does not help, please post the journal with intel_iommu disabled. Maybe we'll get kernel traces with a better pointer to the issue.

I hope this helps!
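For anyone following along, here is a minimal sketch of how the intel_iommu=off parameter could be added on a GRUB-booted node. The helper name add_kernel_param is hypothetical (it is not part of Proxmox), and the sketch assumes the usual GRUB_CMDLINE_LINUX_DEFAULT="..." line in /etc/default/grub. Try it on a copy of the file first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a kernel parameter to the
# GRUB_CMDLINE_LINUX_DEFAULT line of the given file, unless it is already there.
add_kernel_param() {
    local param="$1" file="$2"

    # Skip the edit if the parameter already appears on that line.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qwF -- "$param"; then
        return 0
    fi

    # Insert the parameter just before the closing double quote (GNU sed).
    sed -i -E "s|^(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*)\"|\1 ${param}\"|" "$file"
}

# Assumed usage on a GRUB-booted node (as root):
#   cp /etc/default/grub /etc/default/grub.bak
#   add_kernel_param intel_iommu=off /etc/default/grub
#   update-grub
#   reboot
#   cat /proc/cmdline        # should now contain intel_iommu=off
#
# On nodes booted via systemd-boot, the command line lives in
# /etc/kernel/cmdline instead and is applied with:
#   proxmox-boot-tool refresh
```

After the reboot, the journal of the current boot can be collected with journalctl -b and attached here.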
 
Enabling SR-IOV and I/OAT DMA didn't resolve the issue. I ended up pinning the previous kernel version:

Bash:
proxmox-boot-tool kernel pin 6.14.11-4-pve
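A rough sketch of the full pin workflow, in case it helps others. The is_running_kernel function is a hypothetical helper for confirming after the reboot that the pinned version is the one actually running. The proxmox-boot-tool calls are left as comments so nothing changes on the system until they are run deliberately.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the running kernel matches the expected
# version. The second argument defaults to the output of `uname -r`.
is_running_kernel() {
    local expected="$1"
    local running="${2:-$(uname -r)}"
    [ "$running" = "$expected" ]
}

# Assumed sequence on the affected node (as root):
#   proxmox-boot-tool kernel list               # show installed and pinned kernels
#   proxmox-boot-tool kernel pin 6.14.11-4-pve  # boot this version by default
#   reboot
#   is_running_kernel 6.14.11-4-pve && echo "pinned kernel is active"
#
# Later, to go back to booting the newest installed kernel:
#   proxmox-boot-tool kernel unpin
```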
 
I ran into the same issue just now. I didn't enable SR-IOV, as that is different from IOMMU. They're related but still different: SR-IOV can hand out a portion of a device, such as part of a NIC port or an entire NIC port, whereas IOMMU is more focused on the slot and how its devices are grouped.

I was on 6.17.2-1. The Proxmox community just upgraded to 6.17.4-1, and I'm going to give that a shot later on. I went back to 6.14 for now. I'm also on AMD, so I hope that helps with debugging, since this doesn't seem to be Intel-specific. I don't need or want SR-IOV. I needed IOMMU for PCI passthrough; I don't anymore, but I like to keep my options open, since I have a SAS controller there that worked better with OpenMediaVault.

Details: AMD 5600G, single CPU, IOMMU enabled, SR-IOV disabled. Pretty much all PCI slots stopped working besides the main one.

x16 slot - Broadcom SAS 3008 controller - worked

x1 slot - Intel X520 - 2x10G - LACP bonded - didn't work
x1 slot - ASMedia ASM1064 - eSATA controller - didn't work
x1 slot - NVIDIA - didn't verify

The onboard Realtek 1G NIC (management network) didn't work. Surprisingly, Wi-Fi did work, but I don't configure that because it's Wi-Fi.

Running modprobe -r r8169 followed by modprobe r8169 resulted in this message: "kernel: r8169 0000:2a:00.0: error -ENODEV: no MMIO resource found".
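One way to see which address regions the kernel gave that device is the resource file in sysfs, where each line holds a start address, end address, and flags in hex. The sketch below is only a rough heuristic: count_nonzero_regions is a hypothetical helper, and it assumes that a region with a zero start address was not assigned.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the lines in a sysfs PCI "resource" file
# whose start address is non-zero. Each line is "start end flags" in hex.
count_nonzero_regions() {
    local resource_file="$1"
    awk '$1 != "0x0000000000000000" { n++ } END { print n + 0 }' "$resource_file"
}

# Assumed usage for the NIC from the error message above:
#   count_nonzero_regions /sys/bus/pci/devices/0000:2a:00.0/resource
#
# Related commands that show the same device from other angles:
#   lspci -vv -s 2a:00.0          # lists the device's regions, if any
#   dmesg | grep '0000:2a:00.0'   # kernel messages about the device
```

Comparing the count between 6.14 and 6.17 on the same hardware could show whether the newer kernel failed to assign the regions the driver expects.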

Let me know if I can get you any other details but also know I haven't given 6.17.4-1 a shot yet.