Troubleshooting unexpected reboots

kernull

Member
Apr 11, 2022
I just swapped out a node in a 3-node cluster, thinking the reboots were hardware related...

I just had another unexpected reboot on 1/7 at 4:39 AM and don't see anything obvious in the logs...

The only errors I see are postfix related, plus some kvm_amd vcpu0-8 errors that, after a quick Google search, sound like they can be ignored:

Code:
Jan 05 19:18:09 pveb kernel: kvm_amd: kvm [2013772]: vcpu0, guest rIP: 0xfffff8181a0fc5d1 Unhandled WRMSR(0xc0010115) = 0x0

What can I do to try to track down the cause of this reboot?

Thanks for reading.

Edits:
- I'm going to look into whether a UEFI update is available...
- https://forum.proxmox.com/threads/proxmox-mystery-random-reboots.125001/post-664067 (going to try this grub edit)

Also, what does it mean if "last reboot" returns multiple lines that say "still running"?
 
Hi,
if you still have random reboots even though the hardware itself was replaced, could it be an external issue, like unstable power delivery or a power surge? Are you using a UPS?
 
Hi,
if you still have random reboots even though the hardware itself was replaced, could it be an external issue, like unstable power delivery or a power surge? Are you using a UPS?
There are multiple nodes on the same power source... I was on the system when this happened and I didn't see the lights dim or anything...

I tried the grub edit... maybe that will fix it... I haven't seen a reboot since, but then again I've only seen it once so far since the other host was retired to a Windows box a week ago.
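
On the earlier "last reboot" question: as far as I understand it, an older boot entry that still says "still running" usually means no shutdown record was written for that boot, which would fit a crash or power loss rather than a clean reboot. Below is a rough Python sketch of how such entries could be picked out. The function name and the sample text are made up for illustration (the sample is not output from this cluster), and it assumes the newest-first ordering that "last" normally uses.

```python
def boots_without_shutdown(last_output: str) -> list[str]:
    """Return older 'reboot' lines that still say 'still running'."""
    reboot_lines = [
        line for line in last_output.splitlines()
        if line.startswith("reboot")
    ]
    # `last` lists the newest entries first, so index 0 is the current boot,
    # which is expected to be "still running". Any later match is suspicious.
    return [line for line in reboot_lines[1:] if "still running" in line]


# Made-up sample text for illustration only; not taken from a real host.
SAMPLE = """\
reboot   system boot  x.y.z-pve   Wed Jan  3 09:00   still running
reboot   system boot  x.y.z-pve   Tue Jan  2 08:00   still running
reboot   system boot  x.y.z-pve   Mon Jan  1 07:00 - 23:00  (16:00)
"""

for line in boots_without_shutdown(SAMPLE):
    print("no shutdown recorded for:", line)
```

If the journal is persistent, "journalctl --list-boots" and "journalctl -b -1 -e" should show the last messages written before such a boot ended, which is probably the first place to look.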