Proxmox Server randomly crashes/restarts

AlexanderR · Oct 28, 2019

Hello Everybody,

We are seeing strange behaviour on one Proxmox server running version 6.0.7: the server randomly restarts or crashes. So far we haven't found anything unusual in the log files.

The crashes occur about 3-4 weeks apart.

Is there a way to investigate the root cause of the crashes?

I would really appreciate any help and can of course provide any additional information or log files needed.

Best regards,

Alex

Chris · Oct 28, 2019

Hi,
as a first step you could check the RAM by running an extended memory test. How does the system crash? Are you still able to connect via SSH? Do you get a kernel dump?

AlexanderR · Oct 28, 2019

How could I get a kernel dump?

In fact the server is not freezing; it is just restarting. I don't know whether the restart is graceful or not, because it is a remote system.

We plan to run a RAM test this coming weekend.

Chris · Oct 28, 2019

See here for reference https://forum.proxmox.com/threads/random-proxmox-server-hang-no-vms-no-web-gui.58823/
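
On the question of whether a restart was graceful: one rough way to check is to look at the last log lines of the previous boot. The sketch below assumes the journal is persistent (otherwise `journalctl -b -1` has nothing to show) and that an orderly shutdown leaves one of the listed marker strings behind. The marker list and the function names are illustrative assumptions; the exact wording differs between systemd versions.

```python
import subprocess

# Assumed markers of an orderly shutdown; the wording varies between systemd versions.
CLEAN_SHUTDOWN_MARKERS = ("Journal stopped", "Reached target Shutdown", "systemd-shutdown")


def looks_graceful(tail_lines):
    """Return True if any of the given log lines mentions an orderly shutdown."""
    return any(
        marker in line
        for line in tail_lines
        for marker in CLEAN_SHUTDOWN_MARKERS
    )


def previous_boot_tail(count=50):
    """Fetch the last log lines of the previous boot (needs a persistent journal)."""
    result = subprocess.run(
        ["journalctl", "-b", "-1", "-n", str(count), "--no-pager"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

Calling `looks_graceful(previous_boot_tail())` on the host after it comes back up gives a first hint. If the log simply stops mid-activity with no shutdown messages, a power loss or hard reset is more likely than a clean reboot.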

guletz · Oct 28, 2019

Hi,

Do you use an AMD CPU?

AlexanderR · Oct 28, 2019

No, it is a dual-CPU Intel setup.

ricardoj · Oct 28, 2019

Hi,

I had a similar problem on a system with AMD Ryzen 7, but it was not CPU related.

The machine has now been running for 5 days without restarting.

In my case, there were several hardware issues. One was memory related and it took about 3 hours running memtest to find the first error.

The second seems to be related to one of the SATA disks.

I am still investigating and waiting at least 10 days to close this case.

You can run "memtest" and a long SMART self-test ("smartctl -t long") just to double-check.
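
As a minimal sketch of the disk check, the helper below builds the smartctl calls for one disk: start the long self-test, then read the self-test log and the overall health verdict once the test has finished. The function names are made up for illustration, and the device path is something you have to fill in for your own system.

```python
import subprocess


def smart_long_test_commands(device):
    """Build the smartctl calls for one disk: start the test, then read the results."""
    return {
        "start": ["smartctl", "-t", "long", device],      # start the extended self-test
        "log": ["smartctl", "-l", "selftest", device],    # show the self-test log
        "health": ["smartctl", "-H", device],             # overall health assessment
    }


def run(command):
    """Run one command and return its output; smartctl needs root privileges."""
    return subprocess.run(command, capture_output=True, text=True).stdout
```

For example, `run(smart_long_test_commands("/dev/sda")["start"])` starts the test. The long test runs in the background on the drive and can take hours, so the "log" command is only meaningful afterwards.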

Regards,

Ricardo Jorge

hlucid · Dec 23, 2019

For a quick answer, see the last paragraph.

I had similar issues with my home lab server:

2 Socket Intel Xeon CPU E5645 - total 12 cores 24 threads
64GB ECC ram (4x16GB)
SuperMicro X8DTI-F
2x WD RED 1TB
TOSHIBA 4GB N300
Samsung SSD
Qualcomm Atheros AR93xx Wireless Network Adapter
Marvell 88SE9120 SATA Controller
VIA VL805 USB 3.0 Host Controller

I normally run 4-7 VMs on this machine. The main ones are:

pfSense router/firewall/access point (with PCI passthrough)
FreeNAS with mirroring (with onboard Intel 82801JI SATA Controller passthrough - WD Reds)
Virtualmin (CentOS) web server
Windows multipurpose vm

After trying many things I narrowed it down to power issues. It was either an unstable motherboard (which didn't seem likely, given the sudden onset after years of it being rock stable) or a bad PSU. I changed the PSU to a Corsair RM850x 850W. The issues went away for many months. I didn't have any issues with high loads or starting and stopping VMs after that. Then the issues came back all of a sudden one day while executing a CPU-demanding task. After that I could not start all my VMs. Starting one or two was OK, but if I tried to do a normal boot with only my main 4 VMs I would get an instant reboot. I could not bring myself to blame the recently bought PSU.

You see, what I failed to mention so far is that I am a moron (slapping myself).

I also failed to mention that all this equipment was being powered by a slightly undersized and maybe failing EATON 5E 850VA UPS (I know, I was just being cheap). Changing the power supply to the RM850x may have masked the UPS issue in the first place. The power supply I had before the RM850x still works on other computers without issues.
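
To make "undersized" concrete: a UPS rated in VA can deliver fewer watts than that number suggests. Here is a rough sketch of the arithmetic. The power factor of 0.6 and the 600 W load are assumed, illustrative values, not measurements from my setup; the real watt rating is on the UPS datasheet and the real load has to be measured at the wall.

```python
def ups_watt_capacity(va_rating, power_factor):
    """Approximate real power (W) a UPS can deliver from its VA rating."""
    return va_rating * power_factor


def ups_headroom(va_rating, power_factor, load_watts):
    """Watts left over at the given load; a negative value means overloaded."""
    return ups_watt_capacity(va_rating, power_factor) - load_watts


# Assumed values for illustration only: power factor 0.6, peak load 600 W.
capacity = ups_watt_capacity(850, 0.6)
headroom = ups_headroom(850, 0.6, 600)
```

With these assumed numbers the UPS delivers roughly 510 W, so a 600 W peak leaves negative headroom. That would fit reboots that only show up under CPU load or when several VMs start at once.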

Removing the UPS from the equation solved everything. I hope exposing my stupidity helps someone else act smarter and save some time.
