Proxmox Server randomly crashes/restarts

AlexanderR

Well-Known Member
Jan 19, 2019
Hello Everybody,

We are seeing strange behaviour on one Proxmox server running version 6.0.7: the server randomly restarts or crashes. So far we haven't found anything unusual in the log files.

The crashes appear to be about 3-4 weeks apart.

Is there a way to investigate the root cause of the crashes?

I would really appreciate any help and can of course provide any additional information or log files needed.


Best regards,

Alex
 
Hi,
As a first step you could check the RAM by running an extended memory test. How does the system crash? Are you still able to connect via SSH? Do you get a kernel dump?
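Whether a kernel dump can be captured at all depends on a crash kernel being set up beforehand. The following is only a minimal sketch for checking that precondition; it assumes a Debian-style kdump setup (the `kdump-tools` package, a `crashkernel=` boot parameter, and dumps written to `/var/crash`), none of which is confirmed for this server. The function names are made up for illustration.

```python
from pathlib import Path
from typing import List


def has_crashkernel(cmdline: str) -> bool:
    """Return True if the kernel command line reserves memory for a crash kernel."""
    return any(arg.startswith("crashkernel=") for arg in cmdline.split())


def list_crash_dumps(crash_dir: str = "/var/crash") -> List[str]:
    """List entries in the crash directory (assumed default location), sorted by name."""
    path = Path(crash_dir)
    if not path.is_dir():
        return []
    return sorted(entry.name for entry in path.iterdir())


def kdump_report(cmdline_file: str = "/proc/cmdline",
                 crash_dir: str = "/var/crash") -> str:
    """Summarise whether a crash kernel is reserved and which dumps exist."""
    cmdline = Path(cmdline_file).read_text()
    reserved = "yes" if has_crashkernel(cmdline) else "no"
    dumps = list_crash_dumps(crash_dir)
    return f"crashkernel reserved: {reserved}; dumps found: {dumps}"
```

Calling `print(kdump_report())` on the host shows whether a dump could have been written during the last crash. If no crash kernel is reserved, an absent dump says nothing about the cause.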
 
How could I get a kernel dump?

In fact the server is not freezing, it is just restarting. I don't know whether the restart is graceful or not, because it is a remote system.

We plan to run a RAM test this coming weekend.
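Whether a restart was graceful can usually be checked remotely from the login records. Below is a rough sketch that reads the output of `last -x` and marks each boot as following a clean shutdown or not. It assumes that a clean shutdown leaves a `shutdown` pseudo-entry directly before the next `reboot` entry, and that `last` prints the newest entry first. The function names are made up for illustration.

```python
import subprocess
from typing import List, Tuple


def classify_boots(last_output: str) -> List[Tuple[str, str]]:
    """Label each 'reboot' line as 'clean', 'unclean' or 'unknown'."""
    # Keep only boot and shutdown pseudo-entries; `last` lists the newest first.
    events = [
        line for line in last_output.splitlines()
        if line.startswith(("reboot ", "shutdown "))
    ]
    results = []
    for index, line in enumerate(events):
        if not line.startswith("reboot "):
            continue
        if index + 1 >= len(events):
            status = "unknown"   # no older record left to compare against
        elif events[index + 1].startswith("shutdown "):
            status = "clean"     # a shutdown was logged right before this boot
        else:
            status = "unclean"   # two boots in a row with no shutdown between
        results.append((line, status))
    return results


def read_last_output() -> str:
    """Run `last -x` on the local machine and return its text output."""
    completed = subprocess.run(
        ["last", "-x"], capture_output=True, text=True, check=True
    )
    return completed.stdout
```

Running `classify_boots(read_last_output())` on the server and looking for `unclean` entries around the crash dates would indicate a hard reset or power loss rather than an orderly reboot.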
 
Hi,

I had a similar problem on a system with AMD Ryzen 7, but it was not CPU related.

The machine has now been running for 5 days without restarting.

In my case, there were several hardware issues. One was memory related and it took about 3 hours running memtest to find the first error.

The second seems to be related to one of the SATA disks.

I am still investigating and waiting at least 10 days to close this case.

You can run "memtest" and a long SMART self-test ("smartctl -t long") just to double-check.
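As a rough illustration of the disk check, the sketch below starts a long self-test with `smartctl -t long` and reads the overall health verdict from `smartctl -H`. It assumes an ATA/SATA disk, where the verdict appears on a line beginning with "SMART overall-health self-assessment test result:"; SCSI and NVMe devices report it differently. The device path and function names are placeholders.

```python
import subprocess
from typing import List, Optional

# Line prefix printed by `smartctl -H` for ATA/SATA disks.
HEALTH_PREFIX = "SMART overall-health self-assessment test result:"


def long_test_command(device: str) -> List[str]:
    """Build the command that starts a long (extended) SMART self-test."""
    return ["smartctl", "-t", "long", device]


def parse_overall_health(smartctl_output: str) -> Optional[str]:
    """Extract the verdict (e.g. PASSED) from `smartctl -H` output, if present."""
    for line in smartctl_output.splitlines():
        if line.startswith(HEALTH_PREFIX):
            return line[len(HEALTH_PREFIX):].strip()
    return None


def start_long_test(device: str) -> None:
    """Start the self-test; it runs in the background on the drive itself."""
    subprocess.run(long_test_command(device), check=True)


def read_overall_health(device: str) -> Optional[str]:
    """Run `smartctl -H` and return the verdict, or None if it is not found."""
    # smartctl uses its exit code as a status bitmask, so a non-zero exit
    # is not treated as a failure here.
    result = subprocess.run(
        ["smartctl", "-H", device], capture_output=True, text=True
    )
    return parse_overall_health(result.stdout)
```

For example, `start_long_test("/dev/sda")` followed some hours later by `read_overall_health("/dev/sda")`, run as root. The detailed self-test log is available separately via `smartctl -l selftest`.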

Regards,

Ricardo Jorge
 
For a fast answer see last paragraph.

I had similar issues with my home lab server:

2 Socket Intel Xeon CPU E5645 - total 12 cores 24 threads
64GB ECC ram (4x16GB)
SuperMicro X8DTI-F
2x WD RED 1TB
TOSHIBA 4GB N300
Samsung SSD
Qualcomm Atheros AR93xx Wireless Network Adapter
Marvell 88SE9120 SATA Controller
VIA VL805 USB 3.0 Host Controller

I normally run 4-7 VMs on this machine. The main ones are:

pfSense router/firewall/access point (with pci pass-through)
FreeNAS with mirroring (with onboard Intel 82801JI SATA Controller pass through - WD Reds)
Virtualmin (CentOS) web server
Windows multipurpose vm

After trying many things I narrowed it down to power issues. It was either an unstable motherboard (which didn't seem likely, given the sudden onset after years of it being rock stable) or a bad PSU. I changed the PSU to a Corsair RM850x 850W, and the issues went away for many months. I didn't have any problems with high loads or with starting and stopping VMs after that. Then one day the issues suddenly came back while executing a CPU-demanding task. After that I could not start all my VMs. Starting one or two was OK, but if I tried to do a normal boot with only my main 4 VMs I would get an instant reboot. I could not bring myself to blame the recently bought PSU.

You see, what I failed to mention so far is that I am a moron :p (slapping myself).

I also failed to mention that all this equipment was being powered by a slightly undersized and maybe failing EATON 5E 850VA UPS (I know, I was just being cheap). Changing the power supply to the RM850x may have masked the UPS issue in the first place. The power supply I had before the RM850x still works in other computers without issues.

Removing the UPS from the equation solved everything. I hope the exposure of my stupidity makes someone else act smarter and save some time :)
 