Proxmox crashing with high IO load on host and in a VM

dannytrigo

Dec 9, 2025
Hello,

I was running Proxmox 8.2 and had occasional lock-ups/crashes (roughly every 2 weeks). In an attempt to fix this, I upgraded to Proxmox 9.
However, it now hangs or crashes every few hours.
I managed to capture these logs around the last lock-up, but I don't always get any.

Does anyone have any idea whether this could be a storage issue (high load with low CPU usage, so tasks are blocked on IO)? Or is it more likely to be a RAM or CPU issue?
Could a reinstall help at all?
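
One way to check the "blocked on IO" theory is to compare the load average with the kernel's pressure stall information (PSI). The following is a minimal sketch, assuming the running kernel exposes /proc/pressure/io; the function names are made up for illustration.

```python
from pathlib import Path


def parse_pressure(text: str) -> dict:
    """Parse the contents of a /proc/pressure/* file into nested dicts.

    Each line looks like:
        some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    """
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        result[kind] = {k: float(v) for k, v in (f.split("=") for f in fields)}
    return result


def load_average_1min() -> float:
    """Return the 1-minute load average from /proc/loadavg."""
    return float(Path("/proc/loadavg").read_text().split()[0])


if __name__ == "__main__":
    psi_path = Path("/proc/pressure/io")
    if psi_path.exists():
        psi = parse_pressure(psi_path.read_text())
        # "full" is the share of time in which all non-idle tasks were stalled on IO.
        print("1-min load average:", load_average_1min())
        print("IO pressure, some avg10:", psi["some"]["avg10"])
        print("IO pressure, full avg10:", psi["full"]["avg10"])
    else:
        print("PSI is not available on this kernel")
```

A high load average together with high IO pressure values would suggest tasks really are waiting on storage. A high load with IO pressure near zero points elsewhere.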

Code:
Dec 09 00:19:00 proxmox-02 kernel: Oops: general protection fault, probably for non-canonical address 0x1e194588680008: 0000 [#1] SMP NOPTI
Dec 09 00:19:00 proxmox-02 kernel: CPU: 11 UID: 0 PID: 196 Comm: ksmd Tainted: P S         O        6.17.4-1-pve #1 PREEMPT(voluntary)
Dec 09 00:19:00 proxmox-02 kernel: Tainted: [P]=PROPRIETARY_MODULE, [S]=CPU_OUT_OF_SPEC, [O]=OOT_MODULE
Dec 09 00:19:00 proxmox-02 kernel: Hardware name: ASUS System Product Name/Z790 GAMING WIFI7, BIOS 1663 08/08/2024
Dec 09 00:19:00 proxmox-02 kernel: RIP: 0010:ksm_get_folio+0x40/0x1e0
Dec 09 00:19:00 proxmox-02 kernel: Code: fc 53 49 83 cc 03 48 83 ec 08 89 75 d4 eb 09 49 8b 47 30 49 39 c6 74 79 4d 8b 77 30 4c 89 f0 48 c1 e0 06 48 03 05 38 ab 59 01 <48> 8b 50 08 48 89 c3 f6 c2 01 0f 85 0c 01 00 00 66 90 48 8b 43 18
Dec 09 00:19:00 proxmox-02 kernel: RSP: 0018:ffffd1640084fd58 EFLAGS: 00010207
Dec 09 00:19:00 proxmox-02 kernel: RAX: 001e194588680000 RBX: ffff8dd42a2c1180 RCX: 0000000000000001
Dec 09 00:19:00 proxmox-02 kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8dd42a4ef168
Dec 09 00:19:00 proxmox-02 kernel: RBP: ffffd1640084fd88 R08: 0000000000000000 R09: 0000000000000000
Dec 09 00:19:00 proxmox-02 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff8dd42a4ef16b
Dec 09 00:19:00 proxmox-02 kernel: R13: ffff8dd42a4ef168 R14: 00007895a421a000 R15: ffff8dd42a4ef168
Dec 09 00:19:00 proxmox-02 kernel: FS:  0000000000000000(0000) GS:ffff8dd544d06000(0000) knlGS:0000000000000000
Dec 09 00:19:00 proxmox-02 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 09 00:19:00 proxmox-02 kernel: CR2: 00007fb4723b407d CR3: 00000003b5e3a006 CR4: 0000000000f72ef0
Dec 09 00:19:00 proxmox-02 kernel: PKRU: 55555554
Dec 09 00:19:00 proxmox-02 kernel: Call Trace:
Dec 09 00:19:00 proxmox-02 kernel:  <TASK>
Dec 09 00:19:00 proxmox-02 kernel:  remove_rmap_item_from_tree+0x74/0x150
Dec 09 00:19:00 proxmox-02 kernel:  ksm_scan_thread+0x653/0x2600
Dec 09 00:19:00 proxmox-02 kernel:  ? __pfx_ksm_scan_thread+0x10/0x10
Dec 09 00:19:00 proxmox-02 kernel:  kthread+0x108/0x220
Dec 09 00:19:00 proxmox-02 kernel:  ? __pfx_kthread+0x10/0x10
Dec 09 00:19:00 proxmox-02 kernel:  ret_from_fork+0x205/0x240
Dec 09 00:19:00 proxmox-02 kernel:  ? __pfx_kthread+0x10/0x10
Dec 09 00:19:00 proxmox-02 kernel:  ret_from_fork_asm+0x1a/0x30
Dec 09 00:19:00 proxmox-02 kernel:  </TASK>
Dec 09 00:19:00 proxmox-02 kernel: Modules linked in: veth ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter nf_tables bonding tls softdog sunrpc nfnetlink_log binfmt_misc snd_hda_codec_intelhdmi snd_hda_codec_alc662 snd_hda_codec_realtek_lib snd_hda_codec_generic snd_hda_intel snd_sof_pci_intel_tgl snd_sof_pci_>
Dec 09 00:19:00 proxmox-02 kernel:  sch_fq_codel coretemp btrtl snd_hda_core mt792x_lib btintel snd_intel_dspcfg mt76_connac_lib btbcm snd_intel_sdw_acpi kvm_intel snd_hwdep btmtk mt76 kvm snd_soc_core mac80211 bluetooth snd_compress ac97_bus drm_buddy snd_pcm_dmaengine ttm polyval_clmulni snd_pcm mei_hdcp mei_pxp ghash_clmulni_intel intel_pmc_core drm_displa>
Dec 09 00:19:00 proxmox-02 kernel:  spi_intel_pci intel_lpss_pci i2c_smbus realtek scsi_transport_sas spi_intel nvme_keyring intel_lpss vmd nvme_auth idma64 video pinctrl_alderlake wmi
Dec 09 00:19:00 proxmox-02 kernel: ---[ end trace 0000000000000000 ]---
Dec 09 00:19:00 proxmox-02 kernel: RIP: 0010:ksm_get_folio+0x40/0x1e0
Dec 09 00:19:00 proxmox-02 kernel: Code: fc 53 49 83 cc 03 48 83 ec 08 89 75 d4 eb 09 49 8b 47 30 49 39 c6 74 79 4d 8b 77 30 4c 89 f0 48 c1 e0 06 48 03 05 38 ab 59 01 <48> 8b 50 08 48 89 c3 f6 c2 01 0f 85 0c 01 00 00 66 90 48 8b 43 18
Dec 09 00:19:00 proxmox-02 kernel: RSP: 0018:ffffd1640084fd58 EFLAGS: 00010207
Dec 09 00:19:00 proxmox-02 kernel: RAX: 001e194588680000 RBX: ffff8dd42a2c1180 RCX: 0000000000000001
Dec 09 00:19:00 proxmox-02 kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8dd42a4ef168
Dec 09 00:19:00 proxmox-02 kernel: RBP: ffffd1640084fd88 R08: 0000000000000000 R09: 0000000000000000
Dec 09 00:19:00 proxmox-02 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff8dd42a4ef16b
Dec 09 00:19:00 proxmox-02 kernel: R13: ffff8dd42a4ef168 R14: 00007895a421a000 R15: ffff8dd42a4ef168
Dec 09 00:19:00 proxmox-02 kernel: FS:  0000000000000000(0000) GS:ffff8dd544d06000(0000) knlGS:0000000000000000
Dec 09 00:19:00 proxmox-02 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 09 00:19:00 proxmox-02 kernel: CR2: 00007fb4723b407d CR3: 00000003b5e3a006 CR4: 0000000000f72ef0
Dec 09 00:19:00 proxmox-02 kernel: PKRU: 55555554
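
For reading the oops above: "non-canonical address" means the kernel dereferenced a pointer whose upper bits are not a valid sign extension. The sketch below checks the values from the log, assuming 48-bit virtual addresses (4-level paging). That width is an assumption, not something the log states.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if bits 63..(va_bits - 1) of a 64-bit address are all equal.

    va_bits=48 assumes 4-level paging; this is an assumption for illustration.
    """
    top = (addr & 0xFFFFFFFFFFFFFFFF) >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top in (0, all_ones)


# Values copied from the oops output.
addresses = {
    "faulting address": 0x1E194588680008,
    "RBX": 0xFFFF8DD42A2C1180,
    "R14": 0x00007895A421A000,
}

for name, addr in addresses.items():
    status = "canonical" if is_canonical(addr) else "NON-canonical"
    print(f"{name}: {addr:#018x} is {status}")
```

The check only confirms that ksmd followed a garbage pointer. It cannot say whether the pointer was corrupted by a software bug or by bad memory/CPU hardware.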
 
It seems it may be faulty RAM:


Code:
                         Memtest86+ v7.20

CLK/Temp: 3417MHz        90/100°C
L1 Cache:    48KB        646 GB/s
L2 Cache:     2MB        140 GB/s
L3 Cache:    33MB        61.7 GB/s
Memory :    31.7GB       25.2 GB/s

                     Intel(R) Core(TM) i7-14700K
Pass 31:  ###########################
Test 46:  ###########################
Test #6   [Moving inversions, 64 bit pattern]
Testing:  13GB - 14GB (1GB of 31.7GB)
Pattern:  0x0000000000000004

CPU: 8P+12E-Cores (28T)    SMP: 16T (PAR)
IMC: DDR5-4800 / CAS 40-40-40-76

Time: 0:32:13             Status: Failed !
Pass: 1                   Errors: 6

CPU   Pass   Test   Failing Address          Expected                Found
----  ----   ----   ---------------          --------                -----
  1     1      6    00024252c800 (9.03GB)    ffffffffffffffff        54ffffffffdeffff
  0     1      6    0002400a4240 (9GB)       ffbfbfbfffffffff        86ffbfbf4ffffff
  6     1      6    00024dbc4400 (9.21GB)    fbffffffffffffff        cbfbfffd0fffffff
 27     1      6    00027dba2e80 (9.96GB)    ffffffffffffffff        70ffff59feffff
 10     1      6    00025ce08900 (9.35GB)    ffffffffffbfbf          deffffffff56fbfff
 15     1      6    0002624c5a40 (9.53GB)    fffffffeffffffff        13fffff4ffffffff

<ESC> Exit     F1 Configuration     <Space> Scroll unlock          7.20_unkno.x64
 
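
To get a feel for the Memtest86+ errors above, the expected and found values can be XORed to count how many bits differ per 64-bit word. This is a rough sketch using only the three rows whose values are complete 16-digit hex numbers as pasted; the other rows look truncated and are left out.

```python
def flipped_bits(expected_hex: str, found_hex: str) -> int:
    """Count the bits that differ between two hex-encoded 64-bit words."""
    diff = int(expected_hex, 16) ^ int(found_hex, 16)
    return bin(diff).count("1")


# (failing address, expected, found) copied from the Memtest86+ table.
rows = [
    ("00024252c800", "ffffffffffffffff", "54ffffffffdeffff"),
    ("00024dbc4400", "fbffffffffffffff", "cbfbfffd0fffffff"),
    ("0002624c5a40", "fffffffeffffffff", "13fffff4ffffffff"),
]

for address, expected, found in rows:
    count = flipped_bits(expected, found)
    print(f"{address}: {count} bit(s) differ")
```

Several flipped bits per word, spread over different addresses and CPU cores, is not what a single bad memory cell would usually look like. It does not prove anything by itself, but it fits with the fault being somewhere other than one RAM chip.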
Intel(R) Core(TM) i7-14700K
While it is entirely possible that you do have physical RAM issues, I would not rule out that you are suffering from the well-documented instability issues associated with the i7-14700K CPU. That could also be the cause of the RAM errors shown.

I don't remember clearly the whole saga with those 14th gen chips, but if I recall it was usually linked with the available Intel firmware used with them - often causing irreversible damage to those chips.

Be sure to have the latest Intel microcode installed for that chip.
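
To see which microcode revision is actually loaded, the "microcode" field in /proc/cpuinfo can be read. A minimal sketch, with a hypothetical helper name; which revision counts as current has to be checked against Intel's release notes or the motherboard vendor's BIOS changelog.

```python
from pathlib import Path


def microcode_revisions(cpuinfo_text: str) -> set:
    """Collect the distinct microcode revisions listed in /proc/cpuinfo text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            # The value is printed as hex, e.g. "0x12b".
            revisions.add(int(value.strip(), 16))
    return revisions


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        for revision in sorted(microcode_revisions(cpuinfo.read_text())):
            print(f"loaded microcode revision: {revision:#x}")
    else:
        print("/proc/cpuinfo not found (not a Linux system?)")
```

All cores should normally report the same revision. On Debian-based systems such as Proxmox, microcode updates typically come from the intel-microcode package or from a BIOS update.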

What is the prior history of that chip - did someone use it in an over-clocking situation etc.?

Anyway - try swapping the RAM for some known-good modules & retest.
 
Thanks for the reply. I have never overclocked it. I bought the motherboard, CPU and RAM in June and have always left them at the BIOS defaults. I hadn't really had any issues until the last month or so, when I started getting crashes. So I updated to Proxmox 9, and now it's crashing frequently.

I've just tried testing the RAM with one stick installed at a time, and I get even more errors (hundreds) no matter which 16GB stick is installed.

I have another system that may have compatible RAM - I'm going to try swapping between them. And I'll look into how to update the CPU microcode, thanks.

Seems like I may be looking at a CPU warranty claim? I think I saw the process documented somewhere on these forums.