Hardware:
- CPU : R7 9700x
- Motherboard : ASUS TUF B650 PLUS (bios version: 3602)
- RAM : Micron DDR5 48GB * 2 (96GB)
- HBA : LSI 9300-16i
- Disks :
1. WD BLACK SN850x 1TB NVME * 2
2. Seagate Exos X24 (ST12000NM002H) 12TB * 8
3. WD Ultrastar DC HC310 (HUS726T4TALA6L4) 4TB * 1 (SATA HDD)
4. WD Ultrastar DC HC310 (HUS726T4TAL5204) 4TB * 1 (SAS HDD)

(Note: /dev/sda through /dev/sdj are connected to the HBA.)
Frequent kernel panics started occurring on a Proxmox server after upgrading to the 6.17 kernel, and they continue even after downgrading to 6.14.
Based on LLM suggestions and online searches, the following steps have been tried:
- Canceled a ZFS scrub that was running at the time of the crashes.
- Checked disk health (SMART reports no errors).
- Downgraded the kernel from 6.17 to 6.14.11-5-pve and pinned it.
- Also tested 6.14.8-2-pve, but the kernel panics still occurred.
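For reference, the steps listed above correspond roughly to the commands in this sketch. The pool name `tank` is a placeholder (the real pool name is not given in this post), and the `run` wrapper is a hypothetical helper that only prints each command unless DRY_RUN=0 is set, so the script is safe to read through before running anything as root.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 (as root) to actually run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = pool name (placeholder; substitute the real pool)
steps_tried() {
    pool="$1"

    # Stop a scrub that is currently in progress on the pool.
    run zpool scrub -s "$pool"

    # Overall SMART health for the ten HBA-attached disks (sda..sdj).
    for d in a b c d e f g h i j; do
        run smartctl -H "/dev/sd$d"
    done

    # Pin the 6.14 kernel so it stays the default boot entry.
    run proxmox-boot-tool kernel pin 6.14.11-5-pve
}

steps_tried tank
```

A passing `smartctl -H` result only reflects the drive's own health assessment, so it does not rule out cabling, HBA, RAM, or CPU problems.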
None of the steps above resolved the issue. Further searching and another LLM consultation turned up a suggestion to disable CPU C-states in the BIOS, which has now been done. The following additional steps are currently being considered:
- Updating the BIOS (3602 → 3827)
- Disabling ZFS prefetch
- Reducing ZFS scrub concurrency (zfs_top_maxinflight)
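Before changing any ZFS tunables, it may help to check which ones the installed module actually exposes. The sketch below reads them from /sys/module/zfs/parameters. As far as I can tell, zfs_top_maxinflight may no longer exist in recent OpenZFS releases, so the script reports it as "absent" instead of assuming it is there; zfs_scan_vdev_limit and zfs_vdev_scrub_max_active are listed only as my guess at the closest scrub-related parameters. The helper names `zfs_param` and `modprobe_line` are made up for this example.

```shell
#!/bin/sh
# Directory where the loaded ZFS module exposes its tunables.
PARAM_DIR="${PARAM_DIR:-/sys/module/zfs/parameters}"

# Print the current value of a ZFS module parameter,
# or "absent" if this ZFS build does not expose it.
zfs_param() {
    if [ -r "$PARAM_DIR/$1" ]; then
        cat "$PARAM_DIR/$1"
    else
        echo "absent"
    fi
}

# Print the line that would make a setting persistent
# when placed in /etc/modprobe.d/zfs.conf.
modprobe_line() {
    echo "options zfs $1=$2"
}

# Report the tunables under consideration (read-only, changes nothing).
for p in zfs_prefetch_disable zfs_top_maxinflight \
         zfs_scan_vdev_limit zfs_vdev_scrub_max_active; do
    echo "$p: $(zfs_param "$p")"
done

# Runtime-only change (as root, lost on reboot):
#   echo 1 > /sys/module/zfs/parameters/zfs_prefetch_disable
# Persistent form, to be written to /etc/modprobe.d/zfs.conf:
modprobe_line zfs_prefetch_disable 1
```

On a system that loads ZFS from the initramfs, a change to /etc/modprobe.d/zfs.conf presumably also needs `update-initramfs -u` before it takes effect at boot.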

The system is still unstable and I'm not sure what else I can try. What should I do?
Bash:
# pveversion -v
proxmox-ve: 9.1.0 (running kernel: 6.14.8-2-pve)
pve-manager: 9.1.6 (running version: 9.1.6/71482d1833ded40a)
proxmox-kernel-helper: 9.0.4
proxmox-kernel-6.17: 6.17.13-1
proxmox-kernel-6.17.13-1-pve: 6.17.13-1
proxmox-kernel-6.17.9-1-pve-signed: 6.17.9-1
proxmox-kernel-6.17.2-1-pve-signed: 6.17.2-1
proxmox-kernel-6.14.11-5-pve-signed: 6.14.11-5
proxmox-kernel-6.14: 6.14.11-5
proxmox-kernel-6.14.8-2-pve: 6.14.8-2
amd64-microcode: 3.20251202.1~bpo13+1
ceph-fuse: 19.2.3-pve4
corosync: 3.1.10-pve1
criu: 4.1.1-1
frr-pythontools: 10.4.1-1+pve1
ifupdown2: 3.3.0-1+pmx12
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libproxmox-acme-perl: 1.7.0
libproxmox-backup-qemu0: 2.0.2
libproxmox-rs-perl: 0.4.1
libpve-access-control: 9.0.5
libpve-apiclient-perl: 3.4.2
libpve-cluster-api-perl: 9.0.7
libpve-cluster-perl: 9.0.7
libpve-common-perl: 9.1.7
libpve-guest-common-perl: 6.0.2
libpve-http-server-perl: 6.0.5
libpve-network-perl: 1.2.5
libpve-rs-perl: 0.11.4
libpve-storage-perl: 9.1.0
libspice-server1: 0.15.2-1+b1
lvm2: 2.03.31-2+pmx1
lxc-pve: 6.0.5-4
lxcfs: 6.0.4-pve1
novnc-pve: 1.6.0-3
proxmox-backup-client: 4.1.4-1
proxmox-backup-file-restore: 4.1.4-1
proxmox-backup-restore-image: 1.0.0
proxmox-firewall: 1.2.1
proxmox-kernel-helper: 9.0.4
proxmox-mail-forward: 1.0.2
proxmox-mini-journalreader: 1.6
proxmox-offline-mirror-helper: 0.7.3
proxmox-widget-toolkit: 5.1.8
pve-cluster: 9.0.7
pve-container: 6.1.2
pve-docs: 9.1.2
pve-edk2-firmware: 4.2025.05-2
pve-esxi-import-tools: 1.0.1
pve-firewall: 6.0.4
pve-firmware: 3.18-1
pve-ha-manager: 5.1.1
pve-i18n: 3.6.6
pve-qemu-kvm: 10.1.2-7
pve-xtermjs: 5.5.0-3
qemu-server: 9.1.4
smartmontools: 7.4-pve1
spiceterm: 3.4.1
swtpm: 0.8.0+pve3
vncterm: 1.9.1
zfsutils-linux: 2.4.0-pve1
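Since several kernels are installed, a quick way to confirm which one is actually running after a pin is to compare the "running kernel" field of `pveversion -v` with `uname -r`. This is a minimal sketch; `running_kernel` is a hypothetical helper that parses the first line of the output shown above.

```shell
#!/bin/sh
# Extract the running kernel from `pveversion -v` output read on stdin.
# Expects a line like: proxmox-ve: 9.1.0 (running kernel: 6.14.8-2-pve)
running_kernel() {
    sed -n 's/^proxmox-ve:.*(running kernel: \(.*\))$/\1/p'
}

# Only query the host if this is actually a Proxmox system.
if command -v pveversion >/dev/null 2>&1; then
    echo "pveversion reports: $(pveversion -v | running_kernel)"
    echo "uname reports:      $(uname -r)"
    # Shows installed kernels and which one, if any, is pinned.
    proxmox-boot-tool kernel list
fi
```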