Proxmox Node crashes randomly

r0g

Well-Known Member
May 31, 2016
Hi,

Over the last two months my server has been crashing randomly. I have not been able to determine the cause from the log files. Could someone have a look at these log files and help me?

I've already contacted my hosting provider, and they ran a complete hardware test that took about 13 hours, but they didn't find anything.
 


Can you post your syslog from around the time the server crashed?
Yes. See attached file. The server crashed at 12:31, and at 12:50 I rebooted the system manually.

To me it looks like the server restarted normally at 12:31, doesn't it? But unfortunately I have no idea why... Any ideas?
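
One rough way to check whether a restart was a normal shutdown or a hard crash is to look at what the syslog contains right before each boot. The sketch below is only an illustration: `classify_boots` is a hypothetical helper name, and it assumes a Debian-style rsyslog setup where rsyslogd writes an "exiting on signal" line when it is stopped normally and the kernel's first message at boot contains "Linux version". If your log format differs, the patterns need adjusting.

```shell
#!/bin/sh
# classify_boots FILE (hypothetical helper): for every boot found in a
# syslog file, report whether rsyslogd logged a normal exit before it.
# Assumes Debian-style rsyslog lines; adjust the patterns if yours differ.
classify_boots() {
    awk '
        # rsyslogd writes this line when it is stopped normally
        /rsyslogd.*exiting on signal/ { clean = 1 }
        # first kernel message of every boot
        /kernel:.*Linux version/ {
            if (NR > 1) {
                state = clean ? "clean" : "unclean"
                print state " shutdown before boot at " $1 " " $2 " " $3
            }
            clean = 0
        }
    ' "$1"
}
```

Usage would be something like `classify_boots /var/log/syslog`. An "unclean" result only means no normal rsyslogd exit was logged before that boot, which fits a power loss, hard reset, or kernel panic, but it does not say which one.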
 

Strange... I didn't see anything helpful at a quick glance. Is the server up to date? Maybe it's a hardware fault. Was anything changed on the server?
 
A few days ago I created a new VM and restarted the node after 48 days of uptime, and now, one week later, the node crashed...

My Version:

Kernel Version
Linux 4.4.83-1-pve #1 SMP PVE 4.4.83-96 (Tue, 19 Sep 2017 10:30:12 +0200)

PVE Manager Version
pve-manager/4.4-18/ef2610e8

Do you think replacing the whole hardware except HDDs will help?
 
This is really hard to say... What HW do you have? CPU/MEM/HDD/SSD/RAID
What does
Code:
pveperf /your/storagepath/
say?
 

Hardware:
CPU: Intel Core i7-3770
2x HDD 3.0 TB SATA Enterprise
4x RAM 8192 MB DDR3
LSI MegaRAID SAS 9260-4i

root@node01 /var/lib/vz # pveperf /var/lib/vz
CPU BOGOMIPS: 54405.68
REGEX/SECOND: 1837641
HD SIZE: 2677.19 GB (/dev/mapper/vg0-data)
BUFFERED READS: 158.14 MB/sec
AVERAGE SEEK TIME: 11.06 ms
FSYNCS/SECOND: 5636.60
DNS EXT: 18.24 ms
DNS INT: 14.69 ms (mydomain.co)
 
OK. Your hardware looks fine. It looks to me like a self-built server, so there is no vendor support. What can you do? Update the firmware (BIOS, RAID controller, ...) to the latest version, reset the BIOS to defaults, and then reapply your own settings. Disable all power-saving options in the BIOS.

Later, update to 5.1. But to me it looks like a hardware problem... and/or... we will see.
 
In fact, the BIOS version is very outdated. I will ask my hoster to update the BIOS version and then continue to monitor the stability. Thank you very much!
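
For anyone who wants to check their own BIOS version before asking for an update, here is a minimal sketch. It reads the DMI values the Linux kernel exports under `/sys/class/dmi/id`. `bios_info` is a hypothetical helper name, and the optional directory argument exists only so the function can be pointed somewhere else for testing.

```shell
#!/bin/sh
# bios_info [DIR] (hypothetical helper): print BIOS vendor, version and
# date as exported by the kernel. DIR defaults to /sys/class/dmi/id.
bios_info() {
    dir="${1:-/sys/class/dmi/id}"
    for f in bios_vendor bios_version bios_date; do
        if [ -r "$dir/$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$dir/$f")"
        else
            # value not exported on this system
            printf '%s: unavailable\n' "$f"
        fi
    done
}
```

Running `bios_info` and comparing the reported version and date with the board vendor's download page shows how far behind the firmware is.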