Spontaneous reboots on Minisforum MS-A2 with 6.17 (and later 6.14)

VivienM

Member
Jun 8, 2023
Hi,

This is a weird one. I have a Minisforum MS-A2 with a Ryzen 9 9955HX, 128GB of RAM, and a Samsung SSD. On kernels up to 6.14.8-2 it is rock solid, so I don't think it's a hardware issue...

Newer kernels cause spontaneous reboots within 24 hours. That certainly includes every 6.17 I've tried, now including 6.17.9-1, and I believe some newer 6.14s as well.

I had "solved" this before Christmas by just going back to 6.14.8-2, but had a little power mishap yesterday, it booted back up to 6.17.9-1, and... less than 24 hours later, spontaneous reboot.

In the dmesg output, I note the following:
[ 0.892726] x86/amd: Previous system reset reason [0x00300800]: software wrote 0xE to reset control register 0xCF9
[ 0.892728] x86/amd: Previous system reset reason [0x00300800]: ACPI power state transition occurred
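
In case it helps anyone reading along, here is a minimal Python sketch for pulling the bracketed value out of such a line and listing which bits are set. The names `KNOWN_BITS`, `parse_reset_reason`, `set_bits` and `describe` are made up for illustration and are not kernel APIs. Only the two messages that appear in the log above are labelled, and pairing them with bits 20 and 21 is an assumption: it supposes the kernel prints one line per set bit, in ascending order.

```python
import re

# Assumption: the two dmesg lines above correspond to bits 20 and 21 of the
# value. Other bits are deliberately left unnamed in this sketch.
KNOWN_BITS = {
    20: "software wrote 0xE to reset control register 0xCF9",
    21: "ACPI power state transition occurred",
}

SAMPLE_LINE = (
    "[ 0.892726] x86/amd: Previous system reset reason [0x00300800]: "
    "software wrote 0xE to reset control register 0xCF9"
)


def parse_reset_reason(line):
    """Return the bracketed hex value as an int, or None if the line has none."""
    match = re.search(r"Previous system reset reason \[(0x[0-9a-fA-F]+)\]", line)
    return int(match.group(1), 16) if match else None


def set_bits(value):
    """List the positions (0-31) of the bits that are set in value."""
    return [bit for bit in range(32) if (value >> bit) & 1]


def describe(value):
    """Pair each set bit with its label, if this sketch knows one."""
    return [(bit, KNOWN_BITS.get(bit, "not named in this sketch"))
            for bit in set_bits(value)]


if __name__ == "__main__":
    reason = parse_reset_reason(SAMPLE_LINE)
    for bit, text in describe(reason):
        print(f"bit {bit}: {text}")
```

For 0x00300800 the set bits are 11, 20 and 21. The log only has text for two of them, so bit 11 shows up as unnamed here.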
I poked around in journalctl, but I'm not seeing any log entries that are particularly pertinent...

Happy to provide any further logs, etc.
 
Googling "software wrote 0xE to reset control register 0xC" leads to some interesting info.
I didn't find that much, but I did discover that that message is cut off. Should be "software wrote 0xE to reset control register 0xCF9"

When you google that, yes, it starts to get more interesting, but most of what I'm finding so far is about instability issues with older Zen chips back in 2017-18...
 
The stuff about setting a slightly higher voltage and/or lower frequency in BIOS seems relevant though. It might also pay to look at what c-states are enabled and whether you have the AMD microcode installed.
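
If you'd rather check from the running system before hunting for a monitor, something like this rough Python sketch should list the idle states the kernel exposes and the microcode revision it reports. It assumes the usual Linux `cpuidle` sysfs layout under `/sys/devices/system/cpu/cpu0/cpuidle` and the `microcode` field in `/proc/cpuinfo`. The function names are made up for illustration, and it only shows what the kernel sees, not what the BIOS has configured.

```python
from pathlib import Path


def list_cstates(cpu_dir="/sys/devices/system/cpu/cpu0/cpuidle"):
    """Return (name, disabled) pairs for each stateN directory, in numeric order."""
    base = Path(cpu_dir)
    if not base.is_dir():
        return []  # cpuidle not exposed on this system
    states = []
    for state in sorted(base.glob("state[0-9]*"), key=lambda p: int(p.name[5:])):
        name = (state / "name").read_text().strip()
        disable_file = state / "disable"
        # A non-zero "disable" file means the state has been switched off.
        disabled = disable_file.exists() and disable_file.read_text().strip() != "0"
        states.append((name, disabled))
    return states


def microcode_revisions(cpuinfo_text):
    """Collect the distinct 'microcode' values found in /proc/cpuinfo text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            revisions.add(value.strip())
    return sorted(revisions)


if __name__ == "__main__":
    for name, disabled in list_cstates():
        print(f"{name}: {'disabled' if disabled else 'enabled'}")
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print("microcode:", microcode_revisions(cpuinfo.read_text()))
```

Run it once on the stable kernel and once on the unstable one. If the list of states differs between the two, that would be worth posting.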

Other than that I got nuthin'.
 
AMD microcode is installed.

I guess I can find a keyboard/monitor to go poke at the BIOS, but if those things are set wrong, why doesn't 6.14.8-2 have a problem with it?

Found something else while googling: someone with similar issues on Arch Linux, which seemed to be related to kernels compiled with GCC 15.2. I wonder which GCC version is used to compile which Proxmox kernels...
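
One way to check might be to boot each kernel and look at `/proc/version`, which normally includes the compiler string the kernel was built with. Below is a minimal Python sketch of that, assuming the compiler shows up as `gcc` or `clang` followed by a dotted version number. The function name and the sample string are invented for illustration and are not real Proxmox output.

```python
import re
from pathlib import Path

# Invented sample in the general shape of /proc/version; not real output.
SAMPLE = ("Linux version 0.0.0-example (user@host) "
          "(gcc (Distro 1.2.3-4) 1.2.3, GNU ld 2.0) #1 SMP")


def compiler_from_proc_version(text):
    """Return (compiler, version) parsed from /proc/version text, or None."""
    match = re.search(r"\b(gcc|clang)\b([^,]*)", text)
    if not match:
        return None
    # Take the first dotted number after the compiler name, up to the next comma.
    version = re.search(r"\d+\.\d+(?:\.\d+)?", match.group(2))
    return (match.group(1), version.group(0) if version else None)


if __name__ == "__main__":
    proc_version = Path("/proc/version")
    text = proc_version.read_text() if proc_version.exists() else SAMPLE
    print(compiler_from_proc_version(text))
```

Comparing the result on 6.14.8-2 against 6.17.9-1 would at least show whether the compiler changed between the good and bad kernels.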