Random System Freeze

unholyhumorousunratedoat

Hi guys,
My system has been freezing randomly for a while, and I have no idea why. Sometimes it happens after 2 hours, sometimes after a few days. When it does, I can't ping the PVE host IP, and the power consumption reading freezes as well, e.g. at 28.1 W.

I'd like to tell you what I tested:

- The GA-AB350M-DS3H was repaired in July 2019 because of the same random system freezes. I'm running the most current BIOS: F50d
- 4x 8GB G.Skill Aegis DIMMs: MemTestX86 reported an error with all four installed, but testing each module by itself showed no errors (I ran every test 4 times)
- Ryzen 5 1600
- The Cougar A400 PSU was temporarily replaced by a BeQuiet PSU, but the freezes still occurred.
- Logs don't say anything :(

journalctl
Code:
May 24 04:20:01 pve1 systemd[1]: pvesr.service: Succeeded.
May 24 04:20:01 pve1 systemd[1]: Started Proxmox VE replication runner.
May 24 04:21:00 pve1 systemd[1]: Starting Proxmox VE replication runner...
May 24 04:21:01 pve1 systemd[1]: pvesr.service: Succeeded.
May 24 04:21:01 pve1 systemd[1]: Started Proxmox VE replication runner.
May 24 04:22:00 pve1 systemd[1]: Starting Proxmox VE replication runner...
May 24 04:22:01 pve1 systemd[1]: pvesr.service: Succeeded.
May 24 04:22:01 pve1 systemd[1]: Started Proxmox VE replication runner.
May 24 04:23:00 pve1 systemd[1]: Starting Proxmox VE replication runner...
May 24 04:23:01 pve1 systemd[1]: pvesr.service: Succeeded.
May 24 04:23:01 pve1 systemd[1]: Started Proxmox VE replication runner.
May 24 04:24:00 pve1 systemd[1]: Starting Proxmox VE replication runner...
May 24 04:24:01 pve1 systemd[1]: pvesr.service: Succeeded.
May 24 04:24:01 pve1 systemd[1]: Started Proxmox VE replication runner.
May 24 04:25:00 pve1 systemd[1]: Starting Proxmox VE replication runner...
May 24 04:25:01 pve1 systemd[1]: pvesr.service: Succeeded.
May 24 04:25:01 pve1 systemd[1]: Started Proxmox VE replication runner.
May 24 04:26:00 pve1 systemd[1]: Starting Proxmox VE replication runner...
May 24 04:26:01 pve1 systemd[1]: pvesr.service: Succeeded.
May 24 04:26:01 pve1 systemd[1]: Started Proxmox VE replication runner.
-- Reboot --
May 24 07:34:10 pve1 kernel: Linux version 5.4.114-1-pve (build@proxmox) (gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP PVE 5.4.114-1 (Sun, 09 May 2021 17:13:05 +0200) ()
May 24 07:34:10 pve1 kernel: Command line: BOOT_IMAGE=/vmlinuz-5.4.114-1-pve root=/dev/mapper/pve1--vg-root ro quiet
May 24 07:34:10 pve1 kernel: KERNEL supported cpus:
May 24 07:34:10 pve1 kernel:   Intel GenuineIntel
May 24 07:34:10 pve1 kernel:   AMD AuthenticAMD
May 24 07:34:10 pve1 kernel:   Hygon HygonGenuine
May 24 07:34:10 pve1 kernel:   Centaur CentaurHauls
May 24 07:34:10 pve1 kernel:   zhaoxin   Shanghai
May 24 07:34:10 pve1 kernel: x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
May 24 07:34:10 pve1 kernel: x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
May 24 07:34:10 pve1 kernel: x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
May 24 07:34:10 pve1 kernel: x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
May 24 07:34:10 pve1 kernel: x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'compacted' format.
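
To narrow down when a freeze happened, the gap around each `-- Reboot --` marker can be measured. This is only a minimal sketch in Python: it assumes the journal was saved as plain text in the default short format shown above, and the function name `freeze_windows` and the `year` parameter are made up for illustration (the short format does not print the year).

```python
import re
from datetime import datetime

# Matches the leading "May 24 04:26:01 " style timestamp of a short-format line.
STAMP = re.compile(r"^([A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) ")


def freeze_windows(journal_text, year=2021):
    """Return (last_entry_before_reboot, first_entry_after_reboot) pairs."""
    windows = []
    last_seen = None  # timestamp of the most recent log line
    pending = None    # last timestamp seen before a reboot marker
    for line in journal_text.splitlines():
        if line.strip() == "-- Reboot --":
            pending = last_seen
            continue
        match = STAMP.match(line)
        if not match:
            continue
        # The year is an assumption supplied by the caller.
        stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        if pending is not None:
            windows.append((pending, stamp))
            pending = None
        last_seen = stamp
    return windows
```

For the excerpt above, the last entry before the marker is 04:26:01 and the first one after it is 07:34:10. Since the replication runner logs once a minute and there is no 04:27:00 entry, the freeze probably happened within about a minute of 04:26:01; the rest of the gap is just the time until the manual reboot.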

Any ideas how and what I should test?
Thank you guys :)
 
Here you can see my BIOS configuration.
Power consumption rose to 38 W at idle ^^
Sorry, I can't see it, but an increase in power consumption from the workarounds is not unexpected. The original Zen 1 chips are known to drop into too low a power state and then freeze or reset. Maybe a BIOS update will help? Maybe one of the other workarounds can save you some power? If this helps, at least you know what to search for.
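
If you want to see which idle states the kernel actually exposes and how often they are entered, something like the following could help. It is only a sketch: it assumes the standard Linux cpuidle sysfs layout under `/sys/devices/system/cpu/cpu0/cpuidle`, the function name `list_idle_states` is made up, and it only reads values. It does not apply any of the workarounds.

```python
from pathlib import Path


def list_idle_states(cpu_dir="/sys/devices/system/cpu/cpu0/cpuidle"):
    """Return (state, name, usage, disabled) tuples for one CPU's idle states."""
    base = Path(cpu_dir)
    if not base.is_dir():
        return []  # cpuidle not exposed (or not running on Linux)
    states = []
    # Sort numerically so state10 would come after state9.
    for state in sorted(base.glob("state[0-9]*"), key=lambda p: int(p.name[5:])):
        name = (state / "name").read_text().strip()
        usage = int((state / "usage").read_text())  # times this state was entered
        disable_file = state / "disable"
        disabled = disable_file.exists() and disable_file.read_text().strip() == "1"
        states.append((state.name, name, usage, disabled))
    return states
```

Calling `list_idle_states()` on the host before and after changing a BIOS option shows whether the deeper states are still being entered.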
 
Pro move: forgetting the attachment :D
Now you should see it. It is the most current BIOS. Current uptime is 13 hours; I'll let you know :)
 

Attachments

  • 20210524_195116.jpg