PVE crashing

cotton

New Member
Jun 3, 2024
Does anyone have an idea of what may be going on here? I'm pretty new to PVE and to Linux. I'm setting up my own Plex server, and for now I was just going to transfer the files from my local PC to my Ubuntu Server VM. After transferring for around 15~30 seconds, it crashes not only the VM but the Proxmox host itself. Looking at the logs, this is what I can find right before the crash. From what I'm reading, it seems like my CPU may be bad, but as I said, I'm not 100% sure due to my inexperience. Any help would be appreciated!
I don't believe it is my RAM; I ran a memtest and everything passed.
If there's any other information I could provide to help diagnose this issue please let me know!

Jun 02 22:37:01 home kernel: mce: Uncorrected hardware memory error in user-access at 82b8d5a40
Jun 02 22:37:01 home kernel: mce: [Hardware Error]: Machine check events logged
Jun 02 22:37:01 home kernel: [Hardware Error]: Uncorrected, software restartable error.
Jun 02 22:37:01 home kernel: [Hardware Error]: CPU:17 (19:21:2) MC0_STATUS[-|UE|MiscV|AddrV|-|-|-|-|Poison|-]: 0xbc00080001010135
Jun 02 22:37:01 home kernel: [Hardware Error]: Error Addr: 0x000000082b8d5a40
Jun 02 22:37:01 home kernel: [Hardware Error]: IPID: 0x001000b000000000
Jun 02 22:37:01 home kernel: [Hardware Error]: Load Store Unit Ext. Error Code: 1
Jun 02 22:37:01 home kernel: [Hardware Error]: cache level: L1, tx: DATA, mem-tx: DRD
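For anyone wanting to cross-check the decoded lines above, here is a minimal Python sketch that unpacks the raw MC0_STATUS value from the log. The bit positions and the name tables are assumptions based on the AMD MCA status layout as the Linux kernel's AMD MCE decoder appears to interpret it; they are not taken from this thread and should be checked against AMD's documentation. `decode_status` is a hypothetical helper name.

```python
# Assumed bit positions in the AMD MCA status register (verify against AMD docs).
STATUS_FLAGS = {
    62: "Overflow",
    61: "UE",      # uncorrected error
    59: "MiscV",   # MISC register holds valid data
    58: "AddrV",   # error address is valid
    57: "PCC",     # processor context corrupt
    43: "Poison",  # poisoned data was consumed
}

# Assumed name tables for the low 16-bit error code of a memory-hierarchy error.
LEVELS = ["RESV", "L1", "L2", "L3/GEN"]
TX_TYPES = ["INSN", "DATA", "GEN", "RESV"]
MEM_TX = ["GEN", "RD", "WR", "DRD", "DWR", "IRD", "PRF", "EV", "SNP"]


def decode_status(status: int) -> dict:
    """Split a 64-bit MCA status value into flags and error-code fields."""
    flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
    code = status & 0xFFFF
    result = {
        "valid": bool((status >> 63) & 1),
        "flags": flags,
        "ext_error_code": (status >> 16) & 0x3F,
        "error_code": code,
    }
    # Memory-hierarchy errors follow the pattern 0000 0001 RRRR TTLL.
    if code >> 8 == 0x01:
        rrrr = (code >> 4) & 0xF
        result["mem_tx"] = MEM_TX[rrrr] if rrrr < len(MEM_TX) else "unknown"
        result["tx"] = TX_TYPES[(code >> 2) & 0x3]
        result["cache_level"] = LEVELS[code & 0x3]
    return result


if __name__ == "__main__":
    # Value copied from the MC0_STATUS line in the log above.
    print(decode_status(0xBC00080001010135))
```

Under these assumptions the fields come out the same as the kernel's own text (UE, MiscV, AddrV, Poison, extended code 1, L1 / DATA / DRD), so the sketch adds no new diagnosis; it only shows where each decoded word in the log comes from.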

PC information


description: Desktop Computer
product: B550I AORUS PRO AX (Default string)
vendor: Gigabyte Technology Co., Ltd.
version: Default string
serial: Default string
width: 64 bits
capabilities: smbios-3.3.0 dmi-3.3.0 smp vsyscall32
configuration: boot=normal chassis=desktop family=B550 MB sku=Default string uuid=03560274-043c-05cb-2a06-240700080009
*-core
description: Motherboard
product: B550I AORUS PRO AX
vendor: Gigabyte Technology Co., Ltd.
physical id: 0
version: x.x
serial: Default string
slot: Default string

*-memory
description: System Memory
physical id: b
slot: System board or motherboard
size: 64GiB
capabilities: ecc
configuration: errordetection=multi-bit-ecc
*-bank:1
description: DIMM DDR4 Synchronous Unbuffered (Unregistered) 2666 MHz (0.4 ns)
product: W724GU44J9266NA
vendor: Unknown
physical id: 1
serial: 00000000
slot: DIMM 1
size: 32GiB
width: 64 bits
clock: 2666MHz (0.4ns)
*-bank:3
description: DIMM DDR4 Synchronous Unbuffered (Unregistered) 2666 MHz (0.4 ns)
product: W724GU44J9266NA
vendor: Unknown
physical id: 3
serial: 00000000
slot: DIMM 1
size: 32GiB
width: 64 bits
clock: 2666MHz (0.4ns)

*-cpu
description: CPU
product: AMD Ryzen 9 5900X 12-Core Processor
vendor: Advanced Micro Devices [AMD]
physical id: 11
bus info: cpu@0
version: 25.33.2
serial: Unknown
slot: AM4
size: 3700MHz
capacity: 4950MHz
width: 64 bits
clock: 100MHz
capabilities: lm fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp x86-64 constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm debug_swap cpufreq
configuration: cores=12 enabledcores=12 microcode=169873930 threads=24
 
Since "Uncorrected hardware memory error" is being thrown, can you connect a screen and keyboard directly to the system?

During boot you should get a GRUB screen that defaults to loading Proxmox, but another entry is a memory tester. Use the down key to select it and let it run (for a long time).

This will hammer the memory and the CPU memory controller while excluding the Proxmox code base. If you are lucky it will also report memory issues so you can then try to reseat the current memory and/or install different modules and try again.
 
I did run a memtest on the machine around a week ago and it passed with no errors. I have heard that transferring files might not release cached memory on the VM, but I am unsure how to test for this.
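One way to test the cached-memory idea is to watch /proc/meminfo inside the VM (or on the host) while the transfer runs. The following is a rough Python sketch, not a Proxmox tool: `parse_meminfo` and `watch` are hypothetical helper names, and the sampling interval and count are arbitrary assumptions. Keep in mind that a growing page cache during a large copy is normal Linux behaviour and is reclaimed under memory pressure.

```python
import time


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field name: value in kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def watch(path="/proc/meminfo", interval=2.0, samples=30):
    """Print a few cache-related fields every `interval` seconds."""
    fields = ("MemAvailable", "Cached", "Dirty", "Writeback")
    for _ in range(samples):
        with open(path) as f:
            info = parse_meminfo(f.read())
        print({k: info.get(k) for k in fields})
        time.sleep(interval)


if __name__ == "__main__":
    watch()
```

Start it just before the file transfer. If MemAvailable stays healthy right up to the crash, running out of memory is an unlikely explanation.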
 
I do not know enough about the overall Proxmox environment to comment on alternative causes, but this is a kernel-level Machine Check Exception (MCE), which is hardware-related. Your system dump indicates that you are using ECC memory, so it is possible for the CPU to spot ECC errors.
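If the kernel's EDAC driver is loaded for this memory controller (an assumption; it may not be on this board), the corrected and uncorrected ECC error counters can be read from sysfs. Below is a small Python sketch along those lines. `read_edac_counts` is a hypothetical helper, and the default path is the usual EDAC sysfs location rather than anything confirmed in this thread.

```python
from pathlib import Path


def read_edac_counts(base="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": n, "ue_count": n}} for each EDAC controller found.

    Returns an empty dict when the EDAC directory does not exist,
    for example when no EDAC driver is loaded.
    """
    counts = {}
    root = Path(base)
    if not root.is_dir():
        return counts
    for mc in sorted(root.glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            counter = mc / name
            if counter.is_file():
                entry[name] = int(counter.read_text().strip())
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    result = read_edac_counts()
    if not result:
        print("No EDAC memory controllers found (driver may not be loaded).")
    for controller, entry in result.items():
        print(controller, entry)
```

Non-zero counters would point towards the memory path. Counters that stay at zero while the MCE keeps naming the same CPU core would be a hint, though not proof, that the problem sits on the CPU side.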
 
