System Crash Triggered by Update

pveuser42

New Member
Dec 15, 2021
Hello,

Periodically my Proxmox server crashes. The task log at the bottom shows that "Update package database" was started, and the next entries show the system booting and starting VMs. I can trigger a crash by navigating to Updates for the node and clicking "Refresh" a few times. The logs contain nothing useful because the system goes down immediately: they jump straight from the update-started entries to the boot entries.

Any ideas on how I can troubleshoot this or investigate further? I have tried troubleshooting the hardware, tweaking BIOS settings, following advice from a few forums about high I/O causing crashes, and even reinstalling the server. I am currently running Proxmox VE 7.1-8.

Thank you.
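
For anyone who wants to reproduce this outside the GUI, here is a minimal sketch in Python. It assumes that the "Refresh" button is roughly equivalent to running `apt-get update` on the node, which I have not verified for every version. The function name `run_with_durable_log` and the log path in the usage comment are made up for illustration. The idea is to fsync a log line before and after every attempt, so the last line survives even if the machine dies mid-run.

```python
import os
import subprocess
import time


def run_with_durable_log(cmd, attempts, log_path, pause_s=1.0):
    """Run cmd repeatedly and fsync a log line before and after each run.

    Returns the list of exit codes for the runs that completed.
    """
    exit_codes = []
    with open(log_path, "a", encoding="utf-8") as log:

        def durable(message):
            # Write, flush Python's buffer, then ask the kernel to push the
            # data to disk so it is not lost in a sudden power-off.
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write(f"{stamp} {message}\n")
            log.flush()
            os.fsync(log.fileno())

        for attempt in range(1, attempts + 1):
            durable(f"attempt {attempt} start: {' '.join(cmd)}")
            proc = subprocess.run(cmd, capture_output=True, text=True)
            durable(f"attempt {attempt} exit={proc.returncode}")
            exit_codes.append(proc.returncode)
            time.sleep(pause_s)
    return exit_codes


# Hypothetical usage on the node, run as root (assumed command and path):
# run_with_durable_log(["apt-get", "update"], attempts=20,
#                      log_path="/root/refresh-repro.log")
```

If the log ends with a "start" line that has no matching "exit" line, the crash happened during that attempt. That narrows down the timing even when the journal has nothing.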
 
Thanks for that, it's possible that it is related. I have had this issue on Proxmox v6 (latest) and now on v7 (latest). I went ahead and disabled my regularly scheduled update.

I also took a look at journalctl around the time of the restart, and none of the logs seem relevant. The last few entries before the crash and the first few after it are below.

Code:
Dec 14 04:58:20 vm01 pveproxy[1752]: starting 1 worker(s)
Dec 14 04:58:20 vm01 pveproxy[1752]: worker 2430303 started
Dec 14 05:02:24 vm01 pveproxy[2415378]: worker exit
Dec 14 05:02:24 vm01 pveproxy[1752]: worker 2415378 finished
Dec 14 05:02:24 vm01 pveproxy[1752]: starting 1 worker(s)
Dec 14 05:02:24 vm01 pveproxy[1752]: worker 2433179 started
-- Boot 5a046e35f4114da1af57ddb0e26d1b25 --
Dec 14 05:10:31 vm01 kernel: Linux version 5.13.19-2-pve (build@proxmox) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP PVE 5.13
.19-4 (Mon, 29 Nov 2021 12:10:09 +0100) ()
Dec 14 05:10:31 vm01 kernel: Command line: initrd=\EFI\proxmox\5.13.19-2-pve\initrd.img-5.13.19-2-pve root=ZFS=rpool/ROOT/pve-1 boot=zfs
Dec 14 05:10:31 vm01 kernel: KERNEL supported cpus:
Dec 14 05:10:31 vm01 kernel:   Intel GenuineIntel
Dec 14 05:10:31 vm01 kernel:   AMD AuthenticAMD
Dec 14 05:10:31 vm01 kernel:   Hygon HygonGenuine
Dec 14 05:10:31 vm01 kernel:   Centaur CentaurHauls
Dec 14 05:10:31 vm01 kernel:   zhaoxin   Shanghai 
Dec 14 05:10:31 vm01 kernel: x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
Dec 14 05:10:31 vm01 kernel: x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
Dec 14 05:10:31 vm01 kernel: x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
Dec 14 05:10:31 vm01 kernel: x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys User registers'
Dec 14 05:10:31 vm01 kernel: x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
Dec 14 05:10:31 vm01 kernel: x86/fpu: xstate_offset[9]:  832, xstate_sizes[9]:    8
Dec 14 05:10:31 vm01 kernel: x86/fpu: Enabled xstate features 0x207, context size is 840 bytes, using 'compacted' format.
Dec 14 05:10:31 vm01 kernel: BIOS-provided physical RAM map:
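
To check the same thing across a longer journal export, a rough Python sketch like the one below may help. It looks for the `-- Boot <id> --` markers, reports the gap between the last entry before each marker and the first entry after it, and flags whether the entries just before the marker look like an orderly shutdown. The helper names, the `year` argument (the short journal format carries no year), and the `CLEAN_HINTS` strings are my assumptions, not anything official. It also assumes English month abbreviations.

```python
import re
from datetime import datetime

BOOT_MARKER = re.compile(r"^-- Boot [0-9a-f]+ --$")
STAMP = re.compile(r"^([A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) ")
# Assumed hints of an orderly shutdown; adjust to what your journal shows.
CLEAN_HINTS = ("Shutting down", "Journal stopped")


def parse_stamp(line, year):
    """Return the timestamp of a journal line, or None if it has none."""
    match = STAMP.match(line)
    if match is None:
        return None
    return datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")


def find_boot_gaps(lines, year, tail=5):
    """Describe each boot boundary found in short-format journal lines."""
    gaps = []
    for index, line in enumerate(lines):
        if not BOOT_MARKER.match(line.strip()):
            continue
        before = [l for l in lines[:index] if parse_stamp(l, year)]
        after = [l for l in lines[index + 1:] if parse_stamp(l, year)]
        if not before or not after:
            continue
        gap = parse_stamp(after[0], year) - parse_stamp(before[-1], year)
        # A clean shutdown normally leaves shutdown messages right before
        # the marker; their absence suggests a hard reset or power loss.
        clean = any(hint in l for l in before[-tail:] for hint in CLEAN_HINTS)
        gaps.append({
            "last_entry": before[-1],
            "first_entry": after[0],
            "gap_seconds": int(gap.total_seconds()),
            "clean_shutdown": clean,
        })
    return gaps
```

Fed the excerpt above with `year=2021`, this should report one boundary with no shutdown messages before it. That fits a hard reset or power loss rather than a software-initiated reboot. Note that a journal spanning New Year would need the year handled per entry, which this sketch does not do.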
 
Just to update this: I may have tracked the problem down to the PSU. I installed a new one and haven't been able to reproduce the crash *yet*.

Who would have thought that the update process just happens to trigger a PSU issue, and otherwise the machine runs fine... Will report back if it happens again.
 
