All VMs on node shut down overnight

jdepa

New Member
Sep 28, 2020
Hello all!
For some reason, all of the VMs on one of my nodes are shutting down overnight. Or at least, when I'm not looking.
I can start them up and work for many hours, but when I return about 12 hours later I find that all of the VMs have shut down.

Is there somewhere I can look to find out what might be calling the shutdown? The task log shows a bunch of entries with "Status: stopped: OK" and task type "vncproxy", one for each VM that was previously running. The final task log entry is "Start all VMs and Containers" with "Status: stopped: OK" and task type "startall".

I've had many of these VMs on this node for over a year with no problems. The only recent change is that I spun up an Ubuntu18 server with iRedMail.

Any thoughts of where to find a reason for the shutdown would be very much appreciated!

Proxmox Virtual Environment 5.4-15
 
Hi,

Have you seen any errors in syslog or journalctl?

Also, do you have enough memory for your VMs? If the host doesn't have enough RAM, the kernel's OOM killer might be stopping them.
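One way to follow up on the OOM suggestion is to scan the system logs for the messages the Linux kernel writes when the OOM killer runs. The snippet below is only a minimal sketch: the function names are made up for illustration, and the matched phrases ("invoked oom-killer", "Out of memory: Kill process", "Killed process") vary between kernel versions, so a miss does not prove that no OOM kill happened.

```python
import re
from typing import Iterable, List

# Phrases the kernel logs when the OOM killer runs. The wording differs
# between kernel versions, so several variants are matched (an assumption,
# not an exhaustive list).
OOM_PATTERN = re.compile(
    r"invoked oom-killer|Out of memory: Kill(ed)? process|Killed process \d+",
    re.IGNORECASE,
)


def find_oom_events(lines: Iterable[str]) -> List[str]:
    """Return the log lines that look like kernel OOM-killer messages."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def scan_log_file(path: str) -> List[str]:
    """Scan one plain-text log file, e.g. /var/log/syslog or /var/log/syslog.1."""
    with open(path, errors="replace") as handle:
        return find_oom_events(handle)
```

Calling `scan_log_file("/var/log/syslog")` and then the rotated `/var/log/syslog.1` covers the case where the relevant entries have already rolled out of the current log. Compressed rotations (`.gz`) would need to be opened with `gzip.open` instead.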
 
Aha, I knew someone would say to upgrade! I only upgraded to 5.4 a few months ago, but I will upgrade to 6.x eventually.

Scrolling through syslog and journalctl (and getting past the hundreds of "Started Proxmox VE replication runner" entries), I found this:

Dec 9 01:00:01 hm CRON[30286]: (root) CMD (#vzdump 100 101 103 104 105 106 107 108 109 111 113 --compress gzip --mode stop --mailnotification failure --storage local --node hm --quiet 1)

I must have something somewhere telling the VMs to back up at 1am. I checked, and sure enough I had an entry in /etc/cron.d/vzdump that kicks off a backup in stop mode. I thought I had checked all the cron locations, but I guess I missed the main one.

0 1 * * 3 root #vzdump 100 101 103 104 105 106 107 108 109 111 113 --compress gzip --mode stop --mailnotification failure --storage local --node hm --quiet 1

Thanks for asking me to check the logs. I had checked journalctl, but the older entries kept getting pushed out by the sheer volume of messages. I ended up going through /var/log/syslog.1 to find this.
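One detail worth checking in that cron entry: the command field begins with `#`. Cron hands the command to the shell, and a shell command line that starts with `#` is a comment, which would mean this job is logged by CRON but never actually runs vzdump. The sketch below splits an /etc/cron.d line into its parts and flags that case. The function name and the returned keys are hypothetical, and the parser ignores cron features such as environment assignments and `@daily`-style schedules.

```python
from typing import Dict, Union


def parse_cron_d_line(line: str) -> Dict[str, Union[str, bool]]:
    """Split an /etc/cron.d entry into schedule, user, and command.

    System crontab lines have five time fields, then a user name,
    then the command that cron passes to the shell.
    """
    fields = line.split(None, 6)
    if len(fields) < 7:
        raise ValueError("expected 5 time fields, a user, and a command")
    command = fields[6].strip()
    return {
        "schedule": " ".join(fields[:5]),
        "user": fields[5],
        "command": command,
        # A command that starts with '#' is a shell comment: nothing runs.
        "is_commented_out": command.startswith("#"),
        "uses_stop_mode": "--mode stop" in command,
    }


entry = parse_cron_d_line(
    "0 1 * * 3 root #vzdump 100 101 103 104 105 106 107 108 109 111 113 "
    "--compress gzip --mode stop --mailnotification failure "
    "--storage local --node hm --quiet 1"
)
print(entry["schedule"], entry["is_commented_out"], entry["uses_stop_mode"])
```

The schedule `0 1 * * 3` means 01:00 every Wednesday, so even an active copy of this job would stop the VMs once a week at most, not every night.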
 
The question is why the VMs aren't starting again after that backup. I'm also running backups in stop mode, because that is the safest method, but my VMs are started again afterwards.
 
So it turns out that was merely a coincidence. I was actually working on some of the VMs when the whole node shut down.

I haven't yet seen anything in the logs to indicate why this occurred.

So, any thoughts on what might have caused the Proxmox service to restart?
 
After a week of keeping a close eye on it, I've concluded that it must have been resource starvation. It only seems to happen when I have all of the VMs running.
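A rough way to sanity-check the resource-starvation theory is to add up the memory assigned to the VMs and compare it with the host's RAM. The sketch below assumes the usual Proxmox VE layout, where each VM has a config file under /etc/pve/qemu-server/ containing a `memory: <MiB>` line, and it assumes a 512 MiB default when that line is absent. All function names are made up for illustration. It ignores ballooning, containers, and the host's own memory needs, so treat the result as an estimate only.

```python
import re
from typing import Dict

# Assumed default when a VM config has no "memory:" line.
DEFAULT_VM_MEMORY_MIB = 512


def vm_memory_mib(config_text: str) -> int:
    """Read the 'memory: <MiB>' setting from the main section of a VM config."""
    for line in config_text.splitlines():
        if line.startswith("["):
            # Snapshot sections start with "[name]"; only the main section counts.
            break
        match = re.match(r"memory:\s*(\d+)", line)
        if match:
            return int(match.group(1))
    return DEFAULT_VM_MEMORY_MIB


def read_host_total_mib(meminfo_text: str) -> int:
    """Extract MemTotal (reported in kB) from the contents of /proc/meminfo."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal not found")
    return int(match.group(1)) // 1024


def memory_commitment(configs: Dict[str, str], host_total_mib: int) -> Dict[str, float]:
    """Sum the memory assigned to the given VMs and compare it with host RAM."""
    allocated = sum(vm_memory_mib(text) for text in configs.values())
    return {
        "allocated_mib": allocated,
        "host_total_mib": host_total_mib,
        "ratio": allocated / host_total_mib,
    }
```

Passing in only the configs of the VMs that are running at the same time gives the relevant figure. A ratio near or above 1.0 would be consistent with the node running out of memory only when every VM is up.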
 
