Proxmox shuts down (almost) every night

JD_EV9

New Member
Oct 24, 2024
Hey guys,

Almost every night my Proxmox host kills every process on the machine. By that I mean the computer appears to shut down, but the power button stays lit and I have to force a shutdown before I can boot it up again.

I am using a:
Dell OptiPlex 3060
Intel Core i5-8500
32 GB RAM
Intel UHD Graphics 630

This is my log output before shutting down:

Oct 19 04:05:44 prxmx kernel: EXT4-fs (loop1): unmounting filesystem e9b38ee5-b13c-4423-b57b-09666e4a7b11.
Oct 19 04:05:44 prxmx pvescheduler[602473]: INFO: Finished Backup of VM 107 (00:01:22)
Oct 19 04:05:44 prxmx pvescheduler[602473]: INFO: Starting Backup of VM 112 (lxc)
Oct 19 04:06:35 prxmx pvescheduler[602473]: INFO: Finished Backup of VM 112 (00:00:51)
Oct 19 04:06:35 prxmx pvescheduler[602473]: INFO: Backup job finished successfully
Oct 19 04:17:01 prxmx CRON[616304]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Oct 19 04:17:01 prxmx CRON[616305]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Oct 19 04:17:01 prxmx CRON[616304]: pam_unix(cron:session): session closed for user root
Oct 19 04:28:56 prxmx smartd[649]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 59 to 63
Oct 19 05:17:01 prxmx CRON[633866]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Oct 19 05:17:01 prxmx CRON[633867]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Oct 19 05:17:01 prxmx CRON[633866]: pam_unix(cron:session): session closed for user root
Oct 19 05:58:56 prxmx smartd[649]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 104 to 105
Oct 19 06:17:01 prxmx CRON[651507]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Oct 19 06:17:01 prxmx CRON[651508]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Oct 19 06:17:01 prxmx CRON[651507]: pam_unix(cron:session): session closed for user root
Oct 19 06:22:01 prxmx systemd[1]: Starting apt-daily-upgrade.service - Daily apt upgrade and clean activities...
Oct 19 06:22:01 prxmx systemd[1]: apt-daily-upgrade.service: Deactivated successfully.
Oct 19 06:22:01 prxmx systemd[1]: Finished apt-daily-upgrade.service - Daily apt upgrade and clean activities.
Oct 19 06:25:01 prxmx CRON[653931]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Oct 19 06:25:01 prxmx CRON[653932]: (root) CMD (test -x /usr/sbin/anacron || { cd / && run-parts --report /etc/cron.daily; })
Oct 19 06:25:01 prxmx CRON[653931]: pam_unix(cron:session): session closed for user root
Oct 19 07:17:01 prxmx CRON[669190]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Oct 19 07:17:01 prxmx CRON[669191]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Oct 19 07:17:01 prxmx CRON[669190]: pam_unix(cron:session): session closed for user root
Oct 19 08:17:01 prxmx CRON[686776]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Oct 19 08:17:01 prxmx CRON[686777]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Oct 19 08:17:01 prxmx CRON[686776]: pam_unix(cron:session): session closed for user root
Oct 19 08:28:56 prxmx smartd[649]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 63 to 62
Oct 19 09:17:01 prxmx CRON[704360]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Oct 19 09:17:01 prxmx CRON[704361]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Oct 19 09:17:01 prxmx CRON[704360]: pam_unix(cron:session): session closed for user root
Oct 19 09:28:56 prxmx smartd[649]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 62 to 63
Oct 19 09:40:15 prxmx systemd[1]: Starting man-db.service - Daily man-db regeneration...
Oct 19 09:40:15 prxmx systemd[1]: man-db.service: Deactivated successfully.
Oct 19 09:40:15 prxmx systemd[1]: Finished man-db.service - Daily man-db regeneration.
Oct 19 10:17:01 prxmx CRON[721889]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Oct 19 10:17:01 prxmx CRON[721890]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Oct 19 10:17:01 prxmx CRON[721889]: pam_unix(cron:session): session closed for user root
Oct 19 11:17:01 prxmx CRON[739422]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Oct 19 11:17:01 prxmx CRON[739423]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Oct 19 11:17:01 prxmx CRON[739422]: pam_unix(cron:session): session closed for user root
-- Reboot --
 
Oct 19 05:58:56 prxmx smartd[649]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 104 to 105
105 °C ??
Check your cooling.
At such temperatures the BIOS might shut something down, if it monitors the disk temperatures. Or other parts of the mainboard may be too hot.
 
Thanks for your reply!
I thought this was a strange value as well, but the log is printing the normalised value instead of the actual (raw) value.
See the SMART values:
Attachment: smart.PNG
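For anyone who wants to pull these smartd messages out of a longer journal, here is a minimal Python sketch. The regular expression and the function name `parse_smartd_change` are my own assumptions based only on the log lines quoted in this thread, not an official smartd format definition. As noted above, the numbers are normalised attribute values, so they should not be read as degrees Celsius without checking the raw value.

```python
import re

# Pattern derived from the smartd lines quoted in this thread (an assumption,
# not an official format specification).
SMARTD_CHANGE = re.compile(
    r"Device: (?P<device>\S+) .*?SMART Usage Attribute: "
    r"(?P<attr_id>\d+) (?P<name>\S+) changed from (?P<old>\d+) to (?P<new>\d+)"
)


def parse_smartd_change(line):
    """Return the fields of a smartd 'attribute changed' line, or None."""
    match = SMARTD_CHANGE.search(line)
    if match is None:
        return None
    return {
        "device": match.group("device"),
        "attr_id": int(match.group("attr_id")),
        "name": match.group("name"),
        # Normalised values as logged by smartd, not necessarily raw readings.
        "old": int(match.group("old")),
        "new": int(match.group("new")),
    }


sample = (
    "Oct 19 05:58:56 prxmx smartd[649]: Device: /dev/sdb [SAT], "
    "SMART Usage Attribute: 194 Temperature_Celsius changed from 104 to 105"
)
print(parse_smartd_change(sample))
```

Lines that are not smartd attribute changes (CRON entries, for example) simply return `None`, so the function can be mapped over a whole log.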
Looking at the log, the shutdown is indistinguishable from a (wall) power failure and gives no clue. Overclocking or a weak/old PSU can cause a power dip and a freeze like the one you describe. Make sure the BIOS settings are conservative/defaults and maybe try a different/newer PSU?
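To check this on other nights, one option is to look at the last few entries before each `-- Reboot --` marker in the journal output. Below is a rough Python sketch; the function name `entries_before_reboot` and the `count` parameter are hypothetical, and the sample lines are copied from the log in the first post.

```python
def entries_before_reboot(lines, count=3):
    """For each '-- Reboot --' marker, return up to `count` preceding lines."""
    windows = []
    for index, line in enumerate(lines):
        if line.strip() == "-- Reboot --":
            # Slice the lines directly before the marker, clamped at the start.
            windows.append(lines[max(0, index - count):index])
    return windows


# Tail of the journal posted at the top of this thread.
journal_tail = [
    "Oct 19 11:17:01 prxmx CRON[739422]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)",
    "Oct 19 11:17:01 prxmx CRON[739423]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)",
    "Oct 19 11:17:01 prxmx CRON[739422]: pam_unix(cron:session): session closed for user root",
    "-- Reboot --",
]

for window in entries_before_reboot(journal_tail):
    for entry in window:
        print(entry)
```

Here the entries before the marker are just the routine hourly cron job, with nothing pointing at a shutdown being started, which matches the "looks like a power failure" reading.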
 
@leesteken, thanks for your response.
I could look into that.
I am going to swap the HDD for an SSD one of these days to rule out the system getting too hot, and upgrade the BIOS (I checked and saw it wasn't up to date).

The weird thing is that I had this issue before, but not for a few months. And now it is crashing almost every night.
 
Yeah, good question, I didn't change anything.
I will try whether a different power circuit makes any difference, and maybe go looking for a newer PSU.
 
Update:
Eventually I found a log entry that keeps recurring around the time of the crashes:
Oct 30 07:34:31 prxmx kernel: x86/cpu: SGX disabled by BIOS.

So I enabled this option (instead of "software defined") in the BIOS, and for now I will have to wait and see whether that was the solution.
 
SGX has lots of vulnerabilities and might be best left disabled. I cannot imagine that it's related to the crashes or will change anything, but please let us know if it helps.
I would test with less and/or different RAM, a different PSU, a different motherboard and a different CPU, in that order. Do you see anything on the display just before it crashes (which might not be in the logs)?
 
