CRASHING - 2 of my 3 machines ver 7 randomly crash

JerryOH

Member
Sep 7, 2021
HELP, I'm not sure what to do or what to send. TWO of my three machines crash randomly within 72 hours of being powered on.
When it happens I can still ping them, but there is no web interface, the VMs are not running, and SSH does not respond. A power reset brings everything back to normal.

CPU(s): 8 x AMD Ryzen 5 2400G with Radeon Vega Graphics (1 Socket)
Kernel Version: Linux 5.11.22-4-pve #1 SMP PVE 5.11.22-8 (Fri, 27 Aug 2021 11:51:34 +0200)
PVE Manager Version: pve-manager/7.0-11/63d82f4e
Repository Status: Proxmox VE updates (Production-ready Enterprise repository enabled)
 
From reading the above and other threads on the forum, it appears I'm running the latest kernel, yet I'm still having the issue.
FYI: I just installed my three machines last week.

Any other suggestions or anything I can send or post to assist in figuring it out?
 
Not sure if this helps, but this is what the syslog shows in the few minutes before the crash:



Sep 14 22:29:44 pve2 systemd[1]: Stopped User Runtime Directory /run/user/0.
Sep 14 22:29:44 pve2 systemd[1]: Removed slice User Slice of UID 0.
Sep 14 22:30:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Sep 14 22:30:01 pve2 systemd[1]: pvesr.service: Succeeded.
Sep 14 22:30:01 pve2 systemd[1]: Finished Proxmox VE replication runner.
Sep 14 22:31:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Sep 14 22:31:01 pve2 systemd[1]: pvesr.service: Succeeded.
Sep 14 22:31:01 pve2 systemd[1]: Finished Proxmox VE replication runner.
Sep 14 22:32:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Sep 14 22:32:01 pve2 systemd[1]: pvesr.service: Succeeded.
Sep 14 22:32:01 pve2 systemd[1]: Finished Proxmox VE replication runner.
Sep 14 22:33:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Sep 14 22:33:01 pve2 systemd[1]: pvesr.service: Succeeded.
Sep 14 22:33:01 pve2 systemd[1]: Finished Proxmox VE replication runner.
Sep 14 22:34:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Sep 14 22:34:01 pve2 systemd[1]: pvesr.service: Succeeded.
Sep 14 22:34:01 pve2 systemd[1]: Finished Proxmox VE replication runner.
Sep 14 22:34:39 pve2 kernel: BUG: stack guard page was hit at 000000004d9b54bd (stack is 00000000fcad5f84..0000000099f54ea6)
Sep 14 22:34:40 pve2 kernel: kernel stack overflow (page fault): 0000 [#1] SMP NOPTI
Sep 14 22:34:40 pve2 kernel: CPU: 4 PID: 1316 Comm: pvestatd Tainted: P O 5.11.22-4-pve #1
Sep 14 22:34:40 pve2 kernel: Hardware name: MicroElectronics B241/PRIME A320M-K, BIOS 9102 07/11/2018
Sep 14 22:34:40 pve2 kernel: RIP: 0010:__memcpy+0x12/0x20
Sep 14 22:34:40 pve2 kernel: Code: cc cc cc cc cc cc cc cc 48 8b 05 c1 3a 70 01 c3 cc cc cc cc cc cc cc cc 0f 1f 44 00 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 f3 a4
Sep 14 22:34:40 pve2 kernel: RSP: 0018:ffffa2fc816e3d90 EFLAGS: 00010246
Sep 14 22:34:40 pve2 kernel: RAX: ffff9337982f3080 RBX: ffffa2fc816e3e30 RCX: 00000000000002ce
Sep 14 22:34:40 pve2 kernel: RDX: 0000000000000000 RSI: ffffa2fc816e4000 RDI: ffff9337982f3250
Sep 14 22:34:40 pve2 kernel: RBP: ffffa2fc816e3da0 R08: 0000000000000000 R09: ffffa2fc8c4b0000
Sep 14 22:34:40 pve2 kernel: R10: 0000000000000004 R11: ffffa2fc80000000 R12: ffffa2fc816e3e30
Sep 14 22:34:40 pve2 kernel: R13: 0000000000000000 R14: ffffa2fc816e3eb0 R15: ffff9337982f3080
Sep 14 22:34:40 pve2 kernel: FS: 00007f5bf221b280(0000) GS:ffff933e3fd00000(0000) knlGS:0000000000000000
Sep 14 22:34:40 pve2 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 14 22:34:40 pve2 kernel: CR2: ffffa2fc816e4000 CR3: 000000011daec000 CR4: 00000000003506e0
Sep 14 22:34:40 pve2 kernel: Call Trace:
Sep 14 22:34:40 pve2 kernel: ? arch_dup_task_struct+0x1a/0x30
Sep 14 22:34:40 pve2 kernel: copy_process+0x2e6/0x1c10
Sep 14 22:34:40 pve2 kernel: ? handle_mm_fault+0x11b7/0x1a70
Sep 14 22:34:40 pve2 kernel: ? __alloc_file+0x90/0xe0
Sep 14 22:34:40 pve2 kernel: kernel_clone+0x9d/0x3e0
Sep 14 22:34:40 pve2 kernel: __do_sys_clone+0x5d/0x80
Sep 14 22:34:40 pve2 kernel: __x64_sys_clone+0x25/0x30
Sep 14 22:34:40 pve2 kernel: do_syscall_64+0x38/0x90
Sep 14 22:34:40 pve2 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Sep 14 22:34:40 pve2 kernel: RIP: 0033:0x7f5bf23224bb
Sep 14 22:34:40 pve2 kernel: Code: ed 0f 85 f8 00 00 00 64 4c 8b 0c 25 10 00 00 00 45 31 c0 4d 8d 91 d0 02 00 00 31 d2 31 f6 bf 11 00 20 01 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 91 00 00 00 41 89 c5 85 c0 0f 85 9e 00 00
Sep 14 22:34:40 pve2 kernel: RSP: 002b:00007ffe79506400 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
Sep 14 22:34:40 pve2 kernel: RAX: ffffffffffffffda RBX: 000055d4425b22a0 RCX: 00007f5bf23224bb
Sep 14 22:34:40 pve2 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
Sep 14 22:34:40 pve2 kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f5bf221b280
Sep 14 22:34:40 pve2 kernel: R10: 00007f5bf221b550 R11: 0000000000000246 R12: 0000000000000000
Sep 14 22:34:40 pve2 kernel: R13: 00007ffe79506430 R14: 0000000000000000 R15: 000055d446f4bf60
Sep 14 22:34:40 pve2 kernel: Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace nfs_ssc fscache ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables sctp ip6_udp_tunnel udp_tunnel iptable_filter bpfilter bonding tls softdog nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio edac_mce_amd snd_hda_codec_hdmi snd_hda_intel amdgpu snd_intel_dspcfg kvm_amd soundwire_intel iwlmvm soundwire_generic_allocation kvm soundwire_cadence irqbypass iommu_v2 snd_hda_codec gpu_sched crct10dif_pclmul mac80211 ghash_clmulni_intel drm_ttm_helper snd_hda_core libarc4 snd_hwdep aesni_intel soundwire_bus ttm crypto_simd cryptd snd_soc_core glue_helper snd_compress iwlwifi rapl ac97_bus drm_kms_helper snd_pcm_dmaengine cec snd_pcm eeepc_wmi snd_timer rc_core asus_wmi sparse_keymap i2c_algo_bit wmi_bmof efi_pstore pcspkr fb_sys_fops snd syscopyarea cfg80211 sysfillrect k10temp sysimgblt soundcore ccp mac_hid zfs(PO)
Sep 14 22:34:40 pve2 kernel: zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi drm sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c crc32_pclmul i2c_piix4 xhci_pci xhci_pci_renesas r8169 realtek ahci xhci_hcd libahci gpio_amdpt wmi video gpio_generic
Sep 14 22:34:40 pve2 kernel: ---[ end trace fb7b29bcd76f6c89 ]---
Sep 14 22:34:40 pve2 kernel: RIP: 0010:__memcpy+0x12/0x20
Sep 14 22:34:40 pve2 kernel: Code: cc cc cc cc cc cc cc cc 48 8b 05 c1 3a 70 01 c3 cc cc cc cc cc cc cc cc 0f 1f 44 00 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 f3 a4
Sep 14 22:34:40 pve2 kernel: RSP: 0018:ffffa2fc816e3d90 EFLAGS: 00010246
Sep 14 22:34:40 pve2 kernel: RAX: ffff9337982f3080 RBX: ffffa2fc816e3e30 RCX: 00000000000002ce
Sep 14 22:34:40 pve2 kernel: RDX: 0000000000000000 RSI: ffffa2fc816e4000 RDI: ffff9337982f3250
Sep 14 22:34:40 pve2 kernel: RBP: ffffa2fc816e3da0 R08: 0000000000000000 R09: ffffa2fc8c4b0000
Sep 14 22:34:40 pve2 kernel: R10: 0000000000000004 R11: ffffa2fc80000000 R12: ffffa2fc816e3e30
Sep 14 22:34:40 pve2 kernel: R13: 0000000000000000 R14: ffffa2fc816e3eb0 R15: ffff9337982f3080
Sep 14 22:34:40 pve2 kernel: FS: 00007f5bf221b280(0000) GS:ffff933e3fd00000(0000) knlGS:0000000000000000
Sep 14 22:34:40 pve2 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 14 22:34:40 pve2 kernel: CR2: ffffa2fc816e4000 CR3: 000000011daec000 CR4: 00000000003506e0
Sep 14 22:34:40 pve2 kernel: BUG: stack guard page was hit at 0000000095a14b0f (stack is 0000000021e9b3a1..000000001a239e64)
Sep 14 22:34:40 pve2 kernel: kernel stack overflow (page fault): 0000 [#2] SMP NOPTI
Sep 14 22:34:40 pve2 kernel: CPU: 2 PID: 407 Comm: systemd-journal Tainted: P D O 5.11.22-4-pve #1
Sep 14 22:34:40 pve2 kernel: Hardware name: MicroElectronics B241/PRIME A320M-K, BIOS 9102 07/11/2018
Sep 14 22:34:40 pve2 kernel: RIP: 0010:__memcpy+0x12/0x20
Sep 14 22:34:40 pve2 kernel: Code: cc cc cc cc cc cc cc cc 48 8b 05 c1 3a 70 01 c3 cc cc cc cc cc cc cc cc 0f 1f 44 00 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 f3 a4
Sep 14 22:34:40 pve2 kernel: RSP: 0018:ffffa2fc803bfd90 EFLAGS: 00010246
Sep 14 22:34:40 pve2 kernel: RAX: ffff93374d9b1840 RBX: ffffa2fc803bfe30 RCX: 00000000000002ce
Sep 14 22:34:40 pve2 kernel: RDX: 0000000000000000 RSI: ffffa2fc803c0000 RDI: ffff93374d9b1a10
Sep 14 22:34:40 pve2 kernel: RBP: ffffa2fc803bfda0 R08: 0000000000000000 R09: ffffa2fc8c3b0000
Sep 14 22:34:40 pve2 kernel: R10: 00000000000049d0 R11: 0000000000000000 R12: ffffa2fc803bfe30
Sep 14 22:34:40 pve2 kernel: R13: 0000000000000000 R14: ffffa2fc803bfeb0 R15: ffff93374d9b1840
Sep 14 22:34:40 pve2 kernel: FS: 00007fee8f401900(0000) GS:ffff933e3fc80000(0000) knlGS:0000000000000000
Sep 14 22:34:40 pve2 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 14 22:34:40 pve2 kernel: CR2: ffffa2fc803c0000 CR3: 0000000111344000 CR4: 00000000003506e0
Sep 14 22:34:40 pve2 kernel: Call Trace:
Sep 14 22:34:40 pve2 kernel: ? arch_dup_task_struct+0x1a/0x30
Sep 14 22:34:40 pve2 kernel: copy_process+0x2e6/0x1c10
Sep 14 22:34:40 pve2 kernel: kernel_clone+0x9d/0x3e0
Sep 14 22:34:40 pve2 kernel: __do_sys_clone+0x5d/0x80
Sep 14 22:34:40 pve2 kernel: __x64_sys_clone+0x25/0x30
Sep 14 22:34:40 pve2 kernel: do_syscall_64+0x38/0x90
Sep 14 22:34:40 pve2 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Sep 14 22:34:40 pve2 kernel: RIP: 0033:0x7fee8fcc9de1
Sep 14 22:34:40 pve2 kernel: Code: 48 85 ff 74 3d 48 85 f6 74 38 48 83 ee 10 48 89 4e 08 48 89 3e 48 89 d7 4c 89 c2 4d 89 c8 4c 8b 54 24 08 b8 38 00 00 00 0f 05 <48> 85 c0 7c 13 74 01 c3 31 ed 58 5f ff d0 48 89 c7 b8 3c 00 00 00
Sep 14 22:34:40 pve2 kernel: RSP: 002b:00007ffe7f4ef6f8 EFLAGS: 00000206 ORIG_RAX: 0000000000000038
Sep 14 22:34:40 pve2 kernel: RAX: ffffffffffffffda RBX: 00007fee8e400700 RCX: 00007fee8fcc9de1
Sep 14 22:34:40 pve2 kernel: RDX: 00007fee8e4009d0 RSI: 00007fee8e3ffdf0 RDI: 00000000003d0f00
Sep 14 22:34:40 pve2 kernel: RBP: 00007ffe7f4ef7b0 R08: 00007fee8e400700 R09: 00007fee8e400700
Sep 14 22:34:40 pve2 kernel: R10: 00007fee8e4009d0 R11: 0000000000000206 R12: 00007ffe7f4ef7ae
Sep 14 22:34:40 pve2 kernel: R13: 00007ffe7f4ef7af R14: 00007fee8e3ffe00 R15: 0000563aea560fc0
Sep 14 22:34:40 pve2 kernel: Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace nfs_ssc fscache ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables sctp ip6_udp_tunnel udp_tunnel iptable_filter bpfilter bonding tls softdog nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio edac_mce_amd snd_hda_codec_hdmi snd_hda_intel amdgpu snd_intel_dspcfg kvm_amd soundwire_intel iwlmvm soundwire_generic_allocation kvm soundwire_cadence irqbypass iommu_v2 snd_hda_codec gpu_sched crct10dif_pclmul mac80211 ghash_clmulni_intel drm_ttm_helper snd_hda_core libarc4 snd_hwdep aesni_intel soundwire_bus ttm crypto_simd cryptd snd_soc_core glue_helper snd_compress iwlwifi rapl ac97_bus drm_kms_helper snd_pcm_dmaengine cec snd_pcm eeepc_wmi snd_timer rc_core asus_wmi sparse_keymap i2c_algo_bit wmi_bmof efi_pstore pcspkr fb_sys_fops snd syscopyarea cfg80211 sysfillrect k10temp sysimgblt soundcore ccp mac_hid zfs(PO)
Sep 14 22:34:40 pve2 kernel: zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi drm sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c crc32_pclmul i2c_piix4 xhci_pci xhci_pci_renesas r8169 realtek ahci xhci_hcd libahci gpio_amdpt wmi video gpio_generic
Sep 14 22:34:40 pve2 kernel: ---[ end trace fb7b29bcd76f6c8a ]---
Sep 14 22:34:40 pve2 kernel: RIP: 0010:__memcpy+0x12/0x20
Sep 14 22:34:40 pve2 kernel: Code: cc cc cc cc cc cc cc cc 48 8b 05 c1 3a 70 01 c3 cc cc cc cc cc cc cc cc 0f 1f 44 00 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 f3 a4
Sep 14 22:34:40 pve2 kernel: RSP: 0018:ffffa2fc816e3d90 EFLAGS: 00010246
Sep 14 22:34:40 pve2 kernel: RAX: ffff9337982f3080 RBX: ffffa2fc816e3e30 RCX: 00000000000002ce
Sep 14 22:34:40 pve2 kernel: RDX: 0000000000000000 RSI: ffffa2fc816e4000 RDI: ffff9337982f3250
Sep 14 22:34:40 pve2 kernel: RBP: ffffa2fc816e3da0 R08: 0000000000000000 R09: ffffa2fc8c4b0000
Sep 14 22:34:40 pve2 kernel: R10: 0000000000000004 R11: ffffa2fc80000000 R12: ffffa2fc816e3e30
Sep 14 22:34:40 pve2 kernel: R13: 0000000000000000 R14: ffffa2fc816e3eb0 R15: ffff9337982f3080
Sep 14 22:34:40 pve2 kernel: FS: 00007fee8f401900(0000) GS:ffff933e3fc80000(0000) knlGS:0000000000000000
Sep 14 22:34:40 pve2 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 14 22:34:40 pve2 kernel: CR2: ffffa2fc803c0000 CR3: 0000000111344000 CR4: 00000000003506e0
Sep 14 22:34:40 pve2 systemd: systemd-journald.service: Scheduled restart job, restart counter is at 1.
Sep 14 22:34:40 pve2 systemd: Stopping Flush Journal to Persistent Storage...
Sep 14 22:34:40 pve2 systemd: systemd-journal-flush.service: Succeeded.
Sep 14 22:34:40 pve2 systemd: Stopped Flush Journal to Persistent Storage.
Sep 14 22:34:40 pve2 systemd: Stopped Journal Service.
Sep 14 22:34:40 pve2 systemd: Starting Journal Service...
Sep 14 22:34:40 pve2 systemd-journald[3564]: Journal started
Sep 14 22:34:40 pve2 systemd-journald[3564]: System Journal (/var/log/journal/ba99f72df67c4b88a4075dcc65a21d32) is 136.0M, max 4.0G, 3.8G free.
Sep 14 22:34:40 pve2 systemd[1]: pvestatd.service: Main process exited, code=killed, status=11/SEGV
Sep 14 22:34:40 pve2 systemd[1]: pvestatd.service: Failed with result 'signal'.
Sep 14 22:34:40 pve2 systemd[1]: pvestatd.service: Consumed 6.588s CPU time.
Sep 14 22:34:40 pve2 systemd[1]: systemd-journald.service: Main process exited, code=killed, status=11/SEGV
Sep 14 22:34:40 pve2 systemd[1]: systemd-journald.service: Failed with result 'signal'.
Sep 14 22:34:40 pve2 systemd[1]: Starting Flush Journal to Persistent Storage...
Sep 14 22:34:40 pve2 systemd: Started Journal Service.
Sep 14 22:34:40 pve2 systemd-journald[3564]: System Journal (/var/log/journal/ba99f72df67c4b88a4075dcc65a21d32) is 136.0M, max 4.0G, 3.8G free.
Sep 14 22:34:40 pve2 systemd[1]: Finished Flush Journal to Persistent Storage.
Sep 14 22:35:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Sep 14 22:35:01 pve2 systemd[1]: pvesr.service: Succeeded.
Sep 14 22:35:01 pve2 systemd[1]: Finished Proxmox VE replication runner.
-- Reboot --
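
For anyone triaging a log like the one above, here is a minimal Python sketch that pulls out one summary per kernel "BUG:" event (time, CPU, PID, process name, faulting kernel function). It only assumes the syslog line layout shown in this post; the names `summarize_oopses`, `LINE_RE`, `COMM_RE`, and `SAMPLE` are made up for illustration and are not part of any Proxmox tool.

```python
import re

# Syslog layout as seen in this post: "Sep 14 22:34:39 pve2 kernel: <message>"
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+[\d:]+)\s+(\S+)\s+kernel:\s+(.*)$")
# The oops header line: "CPU: 4 PID: 1316 Comm: pvestatd Tainted: ..."
COMM_RE = re.compile(r"CPU:\s+(\d+)\s+PID:\s+(\d+)\s+Comm:\s+(\S+)")


def summarize_oopses(lines):
    """Return one dict per kernel 'BUG:' line found in syslog-style lines."""
    events = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not a kernel message (e.g. systemd lines)
        timestamp, host, message = m.groups()
        if message.startswith("BUG:"):
            # A new event starts; details are filled in from the lines that follow.
            events.append({"time": timestamp, "host": host,
                           "bug": message[len("BUG:"):].strip(),
                           "cpu": None, "pid": None, "comm": None, "rip": None})
        elif events:
            current = events[-1]
            c = COMM_RE.search(message)
            if c and current["comm"] is None:
                current["cpu"] = int(c.group(1))
                current["pid"] = int(c.group(2))
                current["comm"] = c.group(3)
            elif message.startswith("RIP: 0010:") and current["rip"] is None:
                # Segment 0010 is the kernel-mode RIP line in the dumps above.
                current["rip"] = message.split("RIP: 0010:", 1)[1].strip()
    return events


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
Sep 14 22:34:01 pve2 systemd[1]: Finished Proxmox VE replication runner.
Sep 14 22:34:39 pve2 kernel: BUG: stack guard page was hit at 000000004d9b54bd (stack is 00000000fcad5f84..0000000099f54ea6)
Sep 14 22:34:40 pve2 kernel: CPU: 4 PID: 1316 Comm: pvestatd Tainted: P O 5.11.22-4-pve #1
Sep 14 22:34:40 pve2 kernel: RIP: 0010:__memcpy+0x12/0x20
Sep 14 22:34:40 pve2 kernel: BUG: stack guard page was hit at 0000000095a14b0f (stack is 0000000021e9b3a1..000000001a239e64)
Sep 14 22:34:40 pve2 kernel: CPU: 2 PID: 407 Comm: systemd-journal Tainted: P D O 5.11.22-4-pve #1
Sep 14 22:34:40 pve2 kernel: RIP: 0010:__memcpy+0x12/0x20
"""

if __name__ == "__main__":
    for event in summarize_oopses(SAMPLE.splitlines()):
        print(event["time"], event["comm"], event["pid"], event["rip"])
```

To run it against a real log, the kernel messages of the previous boot can usually be read with `journalctl -k -b -1` (this needs a persistent journal) and the output passed to `summarize_oopses` line by line.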
 
