I just built a new Proxmox node and it randomly dies. I have tried to track down the cause but am getting nowhere.
Hardware:
1. Base: Lenovo P3 Ultra Gen 2
2. CPU: Intel Core Ultra 5 235 (Arrow Lake-S)
3. Memory: SK Hynix 2x 32GB SODIMM ECC
4. Drive: Micron 7450 Pro 960GB M.2 2280
5. Graphics Card: Intel Arc Pro B50
Proxmox Info
Kernel: 6.17.4-2-pve
PVE version: 9.1.4
Here are a couple issues that I have encountered.
The first looks like a CPU soft lockup. After this message Proxmox stopped responding and the system hung. I have not been able to reproduce it.
Code:
Jan 07 19:03:57 compute kernel: watchdog: BUG: soft lockup - CPU#5 stuck for 144s! [swapper/5:0]
Jan 07 19:03:57 compute kernel: Modules linked in: tcp_diag inet_diag ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter nf_tables bonding tls softdog sunrpc nfnetlink_log binfmt_misc mtd_intel_dg pmt_crashlog mei_gsc snd_hda_codec_intelhdmi snd_hda_codec_alc269 snd_hda_scodec_component snd_hda_codec_realtek_lib snd_hda_codec_generic snd_sof_pci_intel_mtl snd_sof_intel_hda_generic soundwire_intel snd_sof_intel_hda_sdw_bpt snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda_mlink snd_sof_intel_hda snd_hda_codec_hdmi soundwire_cadence snd_sof_pci snd_sof_xtensa_dsp intel_uncore_frequency snd_sof intel_uncore_frequency_common snd_sof_utils intel_pmc_core snd_hda_ext_core x86_pkg_temp_thermal intel_powerclamp snd_soc_acpi_intel_match processor_thermal_device_pci snd_soc_acpi_intel_sdca_quirks coretemp xe soundwire_generic_allocation processor_thermal_device sch_fq_codel processor_thermal_wt_hint snd_soc_acpi kvm_intel snd_hda_intel platform_temperature_control gpu_sched
Jan 07 19:03:57 compute kernel: processor_thermal_soc_slider soundwire_bus drm_gpuvm processor_thermal_rfim snd_hda_codec kvm snd_soc_sdca i915 drm_gpusvm_helper intel_rapl_msr processor_thermal_rapl btusb snd_hda_core snd_soc_core irqbypass drm_ttm_helper intel_rapl_common btrtl snd_intel_dspcfg snd_compress lenovo_wmi_other polyval_clmulni drm_buddy drm_exec ttm drm_suballoc_helper ac97_bus btintel processor_thermal_wt_req lenovo_wmi_helpers ghash_clmulni_intel snd_intel_sdw_acpi aesni_intel snd_pcm_dmaengine drm_display_helper pmt_telemetry zfs(PO) btbcm processor_thermal_power_floor lenovo_wmi_capdata01 snd_hwdep rapl cmdlinepart snd_pcm mei_gsc_proxy cec pmt_discovery pmt_class btmtk think_lmi processor_thermal_mbox intel_cstate iwlwifi firmware_attributes_class spi_nor snd_timer rc_core snd mei_me pcspkr intel_pmc_ssram_telemetry int3400_thermal int340x_thermal_zone bluetooth wmi_bmof crc8 mtd mei soundcore intel_vpu i2c_algo_bit platform_profile cfg80211 acpi_tad acpi_thermal_rel acpi_pad intel_vsec joydev input_leds mac_hid
Jan 07 19:03:57 compute kernel: spl(O) msr vhost_net vhost vhost_iotlb tap efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq hid_generic usbkbd usbmouse uas usbhid usb_storage hid dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio nvme i2c_i801 xhci_pci nvme_core i2c_mux spi_intel_pci ahci nvme_keyring xhci_hcd e1000e video i2c_smbus spi_intel libahci nvme_auth wmi
Jan 07 19:03:57 compute kernel: CPU: 5 UID: 0 PID: 0 Comm: swapper/5 Tainted: P O L 6.17.4-2-pve #1 PREEMPT(voluntary)
Jan 07 19:03:57 compute kernel: Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE, [L]=SOFTLOCKUP
Jan 07 19:03:57 compute kernel: Hardware name: LENOVO 30J6000PUS/1070, BIOS S0NKT17A 11/20/2025
Jan 07 19:03:57 compute kernel: RIP: 0010:cpuidle_enter_state+0xc7/0x460
Jan 07 19:03:57 compute kernel: Code: 00 e8 ed 8e fa fe e8 08 f1 ff ff 49 89 c7 0f 1f 44 00 00 31 ff e8 a9 f5 f8 fe 80 7d d7 00 0f 85 d5 01 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 86 01 00 00 49 63 d6 4c 89 f9 48 8d 04 52 48 8d 04
Jan 07 19:03:57 compute kernel: RSP: 0018:ffffd25f80233e40 EFLAGS: 00000246
Jan 07 19:03:57 compute kernel: RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
Jan 07 19:03:57 compute kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Jan 07 19:03:57 compute kernel: RBP: ffffd25f80233e78 R08: 0000000000000000 R09: 0000000000000000
Jan 07 19:03:57 compute kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff8c28b00bf740
Jan 07 19:03:57 compute kernel: R13: ffffffffac28f960 R14: 0000000000000003 R15: 00001c4bba6c8ca5
Jan 07 19:03:57 compute kernel: FS: 0000000000000000(0000) GS:ffff8c2903806000(0000) knlGS:0000000000000000
Jan 07 19:03:57 compute kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 07 19:03:57 compute kernel: CR2: 000076bd5aa91000 CR3: 000000023303a005 CR4: 0000000000f72ef0
Jan 07 19:03:57 compute kernel: PKRU: 55555554
Jan 07 19:03:57 compute kernel: Call Trace:
Jan 07 19:03:57 compute kernel: <TASK>
Jan 07 19:03:57 compute kernel: cpuidle_enter+0x2e/0x50
Jan 07 19:03:57 compute kernel: call_cpuidle+0x22/0x60
Jan 07 19:03:57 compute kernel: ? cpu_startup_entry+0x29/0x30
Jan 07 19:03:57 compute kernel: ? start_secondary+0x118/0x140
Jan 07 19:03:57 compute kernel: ? common_startup_64+0x13e/0x141
Jan 07 19:03:57 compute kernel: </TASK>
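If more lockups show up later, a short script can condense each report to the CPU, the stall duration, the task, and the function at RIP. The following is only a sketch: the function name `summarize_lockups` is made up for illustration, and the regular expressions assume the exact line format seen in the log above.

```python
import re

# Patterns assume the watchdog and RIP line formats shown in the log above.
LOCKUP_RE = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)
RIP_RE = re.compile(r"RIP: [0-9a-f]{4}:(?P<symbol>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_lockups(lines):
    """Return one dict per soft-lockup report, with the RIP symbol if one follows."""
    reports = []
    for line in lines:
        m = LOCKUP_RE.search(line)
        if m:
            reports.append({
                "cpu": int(m["cpu"]),
                "seconds": int(m["secs"]),
                "comm": m["comm"],
                "pid": int(m["pid"]),
                "rip": None,
            })
            continue
        m = RIP_RE.search(line)
        # Attach the first RIP line that follows a lockup report.
        if m and reports and reports[-1]["rip"] is None:
            reports[-1]["rip"] = m["symbol"]
    return reports


sample = [
    "Jan 07 19:03:57 compute kernel: watchdog: BUG: soft lockup - CPU#5 stuck for 144s! [swapper/5:0]",
    "Jan 07 19:03:57 compute kernel: RIP: 0010:cpuidle_enter_state+0xc7/0x460",
]
print(summarize_lockups(sample))
```

For the report above this gives CPU 5, 144 seconds, task `swapper/5`, and `cpuidle_enter_state`, i.e. the CPU was in its idle task when the watchdog fired.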
The second looks like the pveproxy workers segfaulting right after pveproxy is reloaded (here triggered by logrotate). I tried to manually stop, start, restart, and reload pveproxy, but that did not trigger the issue again. The system hangs after this log ends.
Code:
Jan 09 00:00:07 compute systemd[1]: Starting dpkg-db-backup.service - Daily dpkg database backup service...
Jan 09 00:00:07 compute systemd[1]: dpkg-db-backup.service: Deactivated successfully.
Jan 09 00:00:07 compute systemd[1]: Finished dpkg-db-backup.service - Daily dpkg database backup service.
Jan 09 00:16:46 compute pvedaemon[6395]: <root@pam> successful auth for user 'root@pam'
Jan 09 00:16:50 compute pveproxy[42619]: worker exit
Jan 09 00:16:50 compute pveproxy[1321]: worker 42619 finished
Jan 09 00:16:50 compute pveproxy[1321]: starting 1 worker(s)
Jan 09 00:16:50 compute pveproxy[1321]: worker 66480 started
Jan 09 00:17:01 compute CRON[66508]: pam_unix(cron:session): session opened for user root(uid=0) by root(uid=0)
Jan 09 00:17:01 compute CRON[66510]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Jan 09 00:17:01 compute CRON[66508]: pam_unix(cron:session): session closed for user root
Jan 09 00:21:42 compute pveproxy[43240]: worker exit
Jan 09 00:21:42 compute pveproxy[1321]: worker 43240 finished
Jan 09 00:21:42 compute pveproxy[1321]: starting 1 worker(s)
Jan 09 00:21:42 compute pveproxy[1321]: worker 67259 started
Jan 09 00:24:01 compute CRON[67639]: pam_unix(cron:session): session opened for user root(uid=0) by root(uid=0)
Jan 09 00:24:01 compute CRON[67641]: (root) CMD (if [ $(date +%w) -eq 0 ] && [ -x /usr/lib/zfs-linux/scrub ]; then /usr/lib/zfs-linux/scrub; fi)
Jan 09 00:24:01 compute CRON[67639]: pam_unix(cron:session): session closed for user root
Jan 09 00:26:44 compute systemd[1]: Starting logrotate.service - Rotate log files...
Jan 09 00:26:44 compute systemd[1]: Reloading pveproxy.service - PVE API Proxy Server...
Jan 09 00:26:44 compute pveproxy[68093]: send HUP to 1321
Jan 09 00:26:44 compute pveproxy[1321]: received signal HUP
Jan 09 00:26:44 compute pveproxy[1321]: server closing
Jan 09 00:26:44 compute pveproxy[1321]: server shutdown (restart)
Jan 09 00:26:44 compute systemd[1]: Reloaded pveproxy.service - PVE API Proxy Server.
Jan 09 00:26:44 compute systemd[1]: Reloading spiceproxy.service - PVE SPICE Proxy Server...
Jan 09 00:26:44 compute spiceproxy[68096]: send HUP to 1331
Jan 09 00:26:44 compute spiceproxy[1331]: received signal HUP
Jan 09 00:26:44 compute spiceproxy[1331]: server closing
Jan 09 00:26:44 compute spiceproxy[1331]: server shutdown (restart)
Jan 09 00:26:44 compute systemd[1]: Reloaded spiceproxy.service - PVE SPICE Proxy Server.
Jan 09 00:26:44 compute pvefw-logger[3376]: received terminate request (signal)
Jan 09 00:26:44 compute systemd[1]: Stopping pvefw-logger.service - Proxmox VE firewall logger...
Jan 09 00:26:44 compute pvefw-logger[3376]: stopping pvefw logger
Jan 09 00:26:45 compute spiceproxy[1331]: restarting server
Jan 09 00:26:45 compute spiceproxy[1331]: starting 1 worker(s)
Jan 09 00:26:45 compute spiceproxy[1331]: worker 68111 started
Jan 09 00:26:45 compute systemd[1]: pvefw-logger.service: Deactivated successfully.
Jan 09 00:26:45 compute systemd[1]: Stopped pvefw-logger.service - Proxmox VE firewall logger.
Jan 09 00:26:45 compute systemd[1]: pvefw-logger.service: Consumed 3.303s CPU time, 1.8M memory peak.
Jan 09 00:26:45 compute systemd[1]: Starting pvefw-logger.service - Proxmox VE firewall logger...
Jan 09 00:26:45 compute pvefw-logger[68113]: starting pvefw logger
Jan 09 00:26:45 compute systemd[1]: Started pvefw-logger.service - Proxmox VE firewall logger.
Jan 09 00:26:45 compute systemd[1]: logrotate.service: Deactivated successfully.
Jan 09 00:26:45 compute systemd[1]: Finished logrotate.service - Rotate log files.
Jan 09 00:26:45 compute pveproxy[1321]: restarting server
Jan 09 00:26:45 compute pveproxy[1321]: starting 3 worker(s)
Jan 09 00:26:45 compute pveproxy[1321]: worker 68121 started
Jan 09 00:26:45 compute pveproxy[1321]: worker 68122 started
Jan 09 00:26:45 compute pveproxy[1321]: worker 68123 started
Jan 09 00:26:46 compute kernel: show_signal_msg: 115 callbacks suppressed
Jan 09 00:26:46 compute kernel: pveproxy worker[68122]: segfault at 220febc435c8 ip 0000610fda0ffc93 sp 00007ffc442d2590 error 4 in perl[188c93,610fd9fbb000+1ae000] likely on CPU 10 (core 12, socket 0)
Jan 09 00:26:46 compute kernel: Code: f7 fd ff ff 31 db e9 98 fd ff ff 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 53 48 8b 47 08 48 89 fb 0f 1f 40 00 48 89 df <ff> 50 10 48 89 43 08 48 85 c0 75 f1 8b 83 4c 05 00 00 85 c0 75 0f
Jan 09 00:26:46 compute pveproxy[1321]: worker 68122 finished
Jan 09 00:26:46 compute pveproxy[1321]: starting 1 worker(s)
Jan 09 00:26:46 compute pveproxy[67259]: worker exit
Jan 09 00:26:46 compute pveproxy[1321]: worker 68126 started
Jan 09 00:26:46 compute kernel: pveproxy worker[68123]: segfault at 220febc435c8 ip 0000610fda0ffc93 sp 00007ffc442d2590 error 4 in perl[188c93,610fd9fbb000+1ae000] likely on CPU 12 (core 14, socket 0)
Jan 09 00:26:46 compute kernel: Code: f7 fd ff ff 31 db e9 98 fd ff ff 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 53 48 8b 47 08 48 89 fb 0f 1f 40 00 48 89 df <ff> 50 10 48 89 43 08 48 85 c0 75 f1 8b 83 4c 05 00 00 85 c0 75 0f
...
Jan 09 00:26:48 compute pveproxy[68128]: got inotify poll request in wrong process - disabling inotify
Jan 09 00:26:50 compute spiceproxy[6405]: worker exit
Jan 09 00:26:50 compute spiceproxy[1331]: worker 6405 finished
Jan 09 00:26:50 compute kernel: pveproxy worker[68126]: segfault at 220febc435c8 ip 0000610fda0ffc93 sp 00007ffc442d2590 error 4
Jan 09 00:26:50 compute kernel: pveproxy worker[68129]: segfault at 220febc435c8 ip 0000610fda0ffc93 sp 00007ffc442d2590 error 4
[... a lot of segfault in the middle]
Jan 09 01:45:25 compute kernel: pveproxy worker[86603]: segfault at 220febc435c8 ip 0000610fda0ffc93 sp 00007ffc442d2590 error 4 in perl[188c93,610fd9fbb000+1ae000] likely on CPU 13 (core 15, socket 0)
Jan 09 01:45:25 compute kernel: Code: f7 fd ff ff 31 db e9 98 fd ff ff 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 53 48 8b 47 08 48 89 fb 0f 1f 40 00 48 89 df <ff> 50 10 48 89 43 08 48 85 c0 75 f1 8b 83 4c 05 00 00 85 c0 75 0f
Jan 09 01:45:25 compute pveproxy[1321]: worker 86603 finished
Jan 09 01:45:25 compute pveproxy[1321]: starting 1 worker(s)
Jan 09 01:45:25 compute pveproxy[1321]: worker 86611 started
Jan 09 01:45:26 compute kernel: pveproxy worker[86604]: segfault at 220febc435c8 ip 0000610fda0ffc93 sp 00007ffc442d2590 error 4 in perl[188c93,610fd9fbb000+1ae000] likely on CPU 13 (core 15, socket 0)
Jan 09 01:45:26 compute kernel: Code: f7 fd ff ff 31 db e9 98 fd ff ff 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 53 48 8b 47 08 48 89 fb 0f 1f 40 00 48 89 df <ff> 50 10 48 89 43 08 48 85 c0 75 f1 8b 83 4c 05 00 00 85 c0 75 0f
Jan 09 01:45:26 compute pveproxy[1321]: worker 86604 finished
Jan 09 01:45:26 compute pveproxy[1321]: starting 1 worker(s)
Jan 09 01:45:26 compute pveproxy[1321]: worker 86612 started
Jan 09 01:45:26 compute kernel: pveproxy worker[86611]: segfault at 220febc435c8 ip 0000610fda0ffc93 sp 00007ffc442d2590 error 4 in perl[188c93,610fd9fbb000+1ae000] likely on CPU 11 (core 13, socket 0)
Jan 09 01:45:26 compute kernel: pveproxy worker[86610]: segfault at 220febc435c8 ip 0000610fda0ffc93 sp 00007ffc442d2590 error 4 in perl[188c93,610fd9fbb000+1ae000] likely on CPU 12 (core 14, socket 0)
Jan 09 01:45:26 compute kernel: Code: f7 fd ff ff 31 db e9 98 fd ff ff 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 53 48 8b 47 08 48 89 fb 0f 1f 40 00 48 89 df <ff> 50 10 48 89 43 08 48 85 c0 75 f1 8b 83 4c 05 00 00 85 c0 75 0f
Jan 09 01:45:26 compute kernel: Code: f7 fd ff ff 31 db e9 98 fd ff ff 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 53 48 8b 47 08 48 89 fb 0f 1f 40 00 48 89 df <ff> 50 10 48 89 43 08 48 85 c0 75 f1 8b 83 4c 05 00 00 85 c0 75 0f
Jan 09 01:45:26 compute pveproxy[1321]: worker 86610 finished
Jan 09 01:45:26 compute pveproxy[1321]: starting 1 worker(s)
Jan 09 01:45:26 compute pveproxy[1321]: worker 86613 started
Jan 09 01:45:26 compute pveproxy[1321]: worker 86611 finished
Jan 09 01:45:26 compute pveproxy[1321]: starting 1 worker(s)
Jan 09 01:45:26 compute pveproxy[1321]: worker 86614 started
Jan 09 01:45:26 compute kernel: pveproxy worker[86612]: segfault at 220febc435c8 ip 0000610fda0ffc93 sp 00007ffc442d2590 error 4 in perl[188c93,610fd9fbb000+1ae000] likely on CPU 13 (core 15, socket 0)
Jan 09 01:45:26 compute kernel: Code: f7 fd ff ff 31 db e9 98 fd ff ff 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 53 48 8b 47 08 48 89 fb 0f 1f 40 00 48 89 df <ff> 50 10 48 89 43 08 48 85 c0 75 f1 8b 83 4c 05 00 00 85 c0 75 0f
Jan 09 01:45:26 compute pveproxy[1321]: worker 86612 finished
Jan 09 01:45:26 compute pveproxy[1321]: starting 1 worker(s)
Jan 09 01:45:26 compute pveproxy[1321]: worker 86615 started
Jan 09 01:45:27 compute pveproxy[1321]: worker 86613 finished
Jan 09 01:45:27 compute pveproxy[1321]: starting 1 worker(s)
Jan 09 01:45:27 compute pveproxy[1321]: worker 86618 started
Jan 09 01:45:27 compute kernel: show_signal_msg: 1 callbacks suppressed
Jan 09 01:45:27 compute kernel: pveproxy worker[86614]: segfault at 220febc435c8 ip 0000610fda0ffc93 sp 00007ffc442d2590 error 4 in perl[188c93,610fd9fbb000+1ae000] likely on CPU 11 (core 13, socket 0)
Jan 09 01:45:27 compute kernel: Code: f7 fd ff ff 31 db e9 98 fd ff ff 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 53 48 8b 47 08 48 89 fb 0f 1f 40 00 48 89 df <ff> 50 10 48 89 43 08 48 85 c0 75 f1 8b 83 4c 05 00 00 85 c0 75 0f
Jan 09 01:45:27 compute pveproxy[1321]: worker 86614 finished
Jan 09 01:45:27 compute pveproxy[1321]: starting 1 worker(s)
Jan 09 01:45:27 compute pveproxy[1321]: worker 86619 started
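Every segfault line shown above reports the same fault address, instruction pointer, and stack pointer, and only the worker PID and CPU change. To check whether that also holds for the lines omitted from the excerpt, the kernel messages can be tallied with a small script. This is a sketch under assumptions: the helper names are hypothetical, the regular expression assumes the x86-64 kernel message format seen in this log, and the bit meanings are the standard x86 page-fault error code bits.

```python
import re
from collections import Counter

# Assumes the kernel segfault message format seen in the log above.
SEGV_RE = re.compile(
    r"\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?:.*? likely on CPU (?P<cpu>\d+))?"
)


def decode_page_fault_error(code):
    """Decode the low bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 0x1),       # 0 = page not mapped
        "write": bool(code & 0x2),              # 0 = read access
        "user_mode": bool(code & 0x4),          # 1 = fault came from user space
        "instruction_fetch": bool(code & 0x10),
    }


def summarize_segfaults(lines):
    """Count segfault reports by fault address, instruction pointer, and CPU."""
    faults = [m.groupdict() for m in map(SEGV_RE.search, lines) if m]
    return {
        "count": len(faults),
        "fault_addresses": Counter(f["addr"] for f in faults),
        "instruction_pointers": Counter(f["ip"] for f in faults),
        "cpus": Counter(f["cpu"] for f in faults if f["cpu"] is not None),
    }


sample = [
    "Jan 09 00:26:46 compute kernel: pveproxy worker[68122]: segfault at 220febc435c8 "
    "ip 0000610fda0ffc93 sp 00007ffc442d2590 error 4 in perl[188c93,610fd9fbb000+1ae000] "
    "likely on CPU 10 (core 12, socket 0)",
    "Jan 09 00:26:50 compute kernel: pveproxy worker[68126]: segfault at 220febc435c8 "
    "ip 0000610fda0ffc93 sp 00007ffc442d2590 error 4",
]
print(summarize_segfaults(sample))
print(decode_page_fault_error(4))
```

With `error 4`, the decoder reports a user-mode read of a page that is not mapped. Feeding the full journal output into `summarize_segfaults` shows whether the address and instruction pointer stay constant across all workers or vary from crash to crash.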
For the third type of crash I have neither a screenshot nor logs. The symptom is that the first line of the Proxmox console (on the HDMI output) turns purple and shows .--. _
Unlike the two crashes above, this one does not hang the system. It reboots automatically after around 10 seconds.
The system log before this happens shows nothing unusual (no segfaults or anything similar).
What I did:
1. Lenovo has a diagnostic tool that tests memory, CPU, motherboard, and storage. The machine passed all of them. (The test is very long; it took around 8 hours.)
2. Passed memtest86+
3. No VMs are running when these crashes happen
4. The graphics card does not seem to be the cause, as these issues persist after I removed it
What I suspect:
1. The drive: the Micron 7450 Pro is a secondhand device, so it might have stability issues. But that would not obviously explain the segfaults and the rest, so it doesn't seem to be connected to my issue.
2. The CPU is having issues? It is a pretty new CPU.
3. Anything else I missed?
Many thanks in advance!