Previously I was running an i7-3770, and it worked 24/7, 365 days a year, with no uptime issues unless I decided to update or reboot it. I use this host for my lab, and its maximum of 32 GB of memory was not enough for me, so I upgraded the hardware to a Ryzen 5 PRO 4650G with 128 GB ECC (4x32 GB Kingston). It should be even more stable... but it is not.
The system can run for a week or even longer (the best streak was 20 days), but then it suddenly stops. On average it runs for about 5 days.
How it looks when it freezes:
The system HDD LED stops blinking (no I/O activity).
The server still responds to ping.
All services/VMs stop responding.
I cannot log in via ssh.
On the local console the keyboard still works (the Caps Lock light toggles). I can type a username but I never get a password prompt. I can switch between terminals.
journalctl -b -1 -xe
Code:
Sep 16 05:24:56 ryzen-vtn-proxmox pvestatd[2099]: status update time (13.866 seconds)
Sep 16 05:25:01 ryzen-vtn-proxmox CRON[3375803]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Sep 16 05:25:01 ryzen-vtn-proxmox CRON[3375804]: (root) CMD (for i in `lsblk | grep disk |grep -v "230" | awk {'print $1>
Sep 16 05:25:01 ryzen-vtn-proxmox CRON[3375803]: pam_unix(cron:session): session closed for user root
Sep 16 05:25:05 ryzen-vtn-proxmox pvestatd[2099]: status update time (8.859 seconds)
Sep 16 05:25:13 ryzen-vtn-proxmox pvestatd[2099]: status update time (7.792 seconds)
Sep 16 05:25:24 ryzen-vtn-proxmox dockerd[1967]: time="2024-09-16T05:25:24.656926330+03:00" level=error msg="[resolver] >
Sep 16 05:25:24 ryzen-vtn-proxmox dockerd[1967]: time="2024-09-16T05:25:24.656929286+03:00" level=error msg="[resolver] >
Sep 16 05:25:29 ryzen-vtn-proxmox pvestatd[2099]: status update time (12.891 seconds)
Sep 16 05:25:36 ryzen-vtn-proxmox pvestatd[2099]: status update time (6.647 seconds)
Sep 16 05:25:55 ryzen-vtn-proxmox pvestatd[2099]: status update time (15.784 seconds)
Sep 16 05:26:07 ryzen-vtn-proxmox pvestatd[2099]: status update time (12.479 seconds)
Sep 16 05:26:24 ryzen-vtn-proxmox pvestatd[2099]: status update time (17.012 seconds)
Sep 16 05:26:32 ryzen-vtn-proxmox pvestatd[2099]: status update time (8.215 seconds)
Sep 16 05:26:50 ryzen-vtn-proxmox pvestatd[2099]: status update time (15.544 seconds)
Sep 16 05:27:01 ryzen-vtn-proxmox pvestatd[2099]: status update time (10.782 seconds)
Sep 16 05:27:23 ryzen-vtn-proxmox pvestatd[2099]: status update time (11.602 seconds)
Sep 16 05:27:31 ryzen-vtn-proxmox pvestatd[2099]: status update time (8.365 seconds)
Sep 16 05:27:50 ryzen-vtn-proxmox pvestatd[2099]: status update time (7.394 seconds)
Sep 16 05:28:01 ryzen-vtn-proxmox pvestatd[2099]: status update time (7.872 seconds)
lines 1059-1107/1107 (END)
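The only pattern I can see in this log is that every pvestatd status update takes between roughly 6 and 17 seconds before the journal goes silent. To compare those durations between boots, something like the following could pull them out of the journal. This is only a minimal sketch: the regular expression is an assumption based on the line format shown above, and the function names and the script name `pvestatd_times.py` are made up for illustration.

```python
import re
import sys
from statistics import mean

# Assumed line format, taken from the journal excerpt above:
# "Sep 16 05:25:29 host pvestatd[2099]: status update time (12.891 seconds)"
PATTERN = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+pvestatd\[\d+\]:\s+"
    r"status update time \((?P<secs>\d+(?:\.\d+)?) seconds\)"
)


def parse_update_times(lines):
    """Return (timestamp, seconds) pairs for every pvestatd status line."""
    samples = []
    for line in lines:
        match = PATTERN.match(line.strip())
        if match:
            samples.append((match.group("stamp"), float(match.group("secs"))))
    return samples


def summarize(samples):
    """Return count/min/max/mean of the durations, or None if there are none."""
    secs = [duration for _, duration in samples]
    if not secs:
        return None
    return {
        "count": len(secs),
        "min": min(secs),
        "max": max(secs),
        "mean": round(mean(secs), 3),
    }


if __name__ == "__main__":
    # Reads journal text from standard input and prints a one-line summary.
    print(summarize(parse_update_times(sys.stdin)))
```

It could be fed from the previous boot's journal, for example with `journalctl -b -1 -u pvestatd --no-pager | python3 pvestatd_times.py`. Lines that do not match the assumed format (CRON, dockerd, truncated lines) are simply skipped.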
And nothing more after that. Maybe there is somewhere else I can look.
What I have tried:
Even though this is ECC RAM, I tested it with memtest86: no issues.
I swapped motherboards (both have the latest BIOS and both failed the same way):
- ASUS PRIME B450M-A II AMD B450
- ASUS PRO B550M-C/CSM AMD B550
- Ryzen 5 PRO 4650G -> Ryzen 5 2600 + GeForce GT 710
HDD temperatures are below 50 °C. I also installed the latest Proxmox updates, but the freezes continue. More about the config below.