Proxmox reboot every 2 weeks

GCadmin

New Member
Jun 21, 2023
Hi,
I have a specific problem with an unwanted Proxmox reboot. It happens every 2 weeks, give or take an hour. Here are 3 cases:

Code:
Nov 11 07:17:01 gc CRON[3021777]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Nov 11 07:17:01 gc CRON[3021778]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Nov 11 07:17:01 gc CRON[3021777]: pam_unix(cron:session): session closed for user root
Nov 11 07:35:32 gc pvestatd[5411]: auth key pair too old, rotating..
Nov 11 08:17:01 gc CRON[3203650]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Nov 11 08:17:01 gc CRON[3203651]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Nov 11 08:17:01 gc CRON[3203650]: pam_unix(cron:session): session closed for user root
Nov 11 09:17:01 gc CRON[3370198]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Nov 11 09:17:01 gc CRON[3370199]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Nov 11 09:17:01 gc CRON[3370198]: pam_unix(cron:session): session closed for user root
Nov 11 10:17:01 gc CRON[3537087]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Nov 11 10:17:01 gc CRON[3537088]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Nov 11 10:17:01 gc CRON[3537087]: pam_unix(cron:session): session closed for user root
Nov 11 11:17:01 gc CRON[3719097]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Nov 11 11:17:01 gc CRON[3719098]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Nov 11 11:17:01 gc CRON[3719097]: pam_unix(cron:session): session closed for user root
Nov 11 11:49:34 gc systemd[1]: Starting systemd-tmpfiles-clean.service - Cleanup of Temporary Directories...
Nov 11 11:49:34 gc systemd[1]: systemd-tmpfiles-clean.service: Deactivated successfully.
Nov 11 11:49:34 gc systemd[1]: Finished systemd-tmpfiles-clean.service - Cleanup of Temporary Directories.
Nov 11 11:49:34 gc systemd[1]: run-credentials-systemd\x2dtmpfiles\x2dclean.service.mount: Deactivated successfully.
-- Reboot --



Oct 28 09:17:01 gc CRON[46967]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Oct 28 09:17:01 gc CRON[46968]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Oct 28 09:17:01 gc CRON[46967]: pam_unix(cron:session): session closed for user root
Oct 28 10:17:01 gc CRON[232203]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Oct 28 10:17:01 gc CRON[232204]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Oct 28 10:17:01 gc CRON[232203]: pam_unix(cron:session): session closed for user root
Oct 28 11:17:01 gc CRON[444916]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Oct 28 11:17:01 gc CRON[444917]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Oct 28 11:17:01 gc CRON[444916]: pam_unix(cron:session): session closed for user root
Oct 28 12:04:19 gc systemd[1]: Starting systemd-tmpfiles-clean.service - Cleanup of Temporary Directories...
Oct 28 12:04:19 gc systemd[1]: systemd-tmpfiles-clean.service: Deactivated successfully.
Oct 28 12:04:19 gc systemd[1]: Finished systemd-tmpfiles-clean.service - Cleanup of Temporary Directories.
Oct 28 12:04:19 gc systemd[1]: run-credentials-systemd\x2dtmpfiles\x2dclean.service.mount: Deactivated successfully.
Oct 28 12:17:01 gc CRON[631354]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Oct 28 12:17:01 gc CRON[631355]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Oct 28 12:17:01 gc CRON[631354]: pam_unix(cron:session): session closed for user root
-- Reboot --


Oct 14 11:00:00 gc postfix/qmgr[5207]: 738891B589: from=<>, size=2620, nrcpt=1 (queue active)
Oct 14 11:00:30 gc postfix/smtp[4160248]: connect to _dc-mx.fdc37e72f63a.gckosice.sk[45.13.137.7]:25: Connection timed out
Oct 14 11:01:00 gc postfix/smtp[4160248]: connect to _dc-mx.4622b4ac8a95.gckosice.sk[45.13.137.7]:25: Connection timed out
Oct 14 11:01:30 gc postfix/smtp[4160248]: connect to _dc-mx.4622b4ac8a95.gckosice.sk[45.13.137.8]:25: Connection timed out
Oct 14 11:01:30 gc postfix/smtp[4160248]: 738891B589: to=<caco@gckosice.sk>, relay=none, delay=200143, delays=200053/0.01/90/0, dsn=4.4.1, status=deferred (connect to _dc-mx.4622b4ac8a95.gckosice.sk[45.13.137.8]:25: Connection timed out)
Oct 14 11:10:00 gc postfix/qmgr[5207]: 02FAA1BA1C: from=<>, size=4878, nrcpt=1 (queue active)
Oct 14 11:10:30 gc postfix/smtp[4186937]: connect to _dc-mx.fdc37e72f63a.gckosice.sk[45.13.137.7]:25: Connection timed out
Oct 14 11:11:00 gc postfix/smtp[4186937]: connect to _dc-mx.4622b4ac8a95.gckosice.sk[45.13.137.8]:25: Connection timed out
Oct 14 11:11:00 gc postfix/smtp[4186937]: 02FAA1BA1C: to=<caco@gckosice.sk>, relay=none, delay=103505, delays=103445/0.01/60/0, dsn=4.4.1, status=deferred (connect to _dc-mx.4622b4ac8a95.gckosice.sk[45.13.137.8]:25: Connection timed out)
Oct 14 11:14:39 gc systemd[1]: Starting systemd-tmpfiles-clean.service - Cleanup of Temporary Directories...
Oct 14 11:14:39 gc systemd[1]: systemd-tmpfiles-clean.service: Deactivated successfully.
Oct 14 11:14:39 gc systemd[1]: Finished systemd-tmpfiles-clean.service - Cleanup of Temporary Directories.
Oct 14 11:14:39 gc systemd[1]: run-credentials-systemd\x2dtmpfiles\x2dclean.service.mount: Deactivated successfully.
Oct 14 11:17:01 gc CRON[11467]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Oct 14 11:17:01 gc CRON[11468]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Oct 14 11:17:01 gc CRON[11467]: pam_unix(cron:session): session closed for user root
-- Reboot --
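
As a sanity check on the "every 2 weeks" pattern, the last timestamp before each `-- Reboot --` marker above can be compared. This is only a minimal Python sketch: the year 2023 is an assumption (the logs do not show one), the timestamps are treated as naive local time, and the names `LAST_LINES`, `parse` and `intervals` are made up for illustration. Note that the last log line is not the moment of the reboot itself. In the two October cases the last line is the hourly cron job, so the actual outage could be some time later.

```python
from datetime import datetime, timedelta

# Last journal timestamp before each "-- Reboot --" marker, copied from the logs above.
LAST_LINES = ["Oct 14 11:17:01", "Oct 28 12:17:01", "Nov 11 11:49:34"]
ASSUMED_YEAR = 2023  # assumption: the log lines do not include a year
FORTNIGHT = timedelta(days=14)


def parse(stamp, year=ASSUMED_YEAR):
    """Turn a syslog-style 'Mon DD HH:MM:SS' stamp into a naive datetime."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def intervals(stamps):
    """Return the time differences between consecutive stamps, oldest first."""
    times = sorted(parse(s) for s in stamps)
    return [later - earlier for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    for delta in intervals(LAST_LINES):
        # Deviation from exactly 14 days, in minutes (positive means later).
        off_by = (delta - FORTNIGHT).total_seconds() / 60
        print(f"{delta}  (deviation from 14 days: {off_by:+.1f} min)")
```

With the three timestamps above, the intervals come out as 14 days 1:00:00 and 13 days 23:32:33, so the last log lines are indeed about a fortnight apart, within roughly an hour.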
 
There are no causes in the logs. It's indistinguishable from someone flipping the breaker or otherwise interrupting the power.
Does the system start automatically again? Is it set to start automatically on power restore?
What else happens every fortnight around noon? Do the cleaners unplug the system to power the vacuum, maybe?
Sometimes a clue or error message is not flushed to disk but is shown on a physical display. Can you connect a display and look at it next time?
 
Thanks for the advice. I'll ask the building manager about the outages and also replace the batteries in the UPS. I'll set aside time to be present at the next expected reboot, see what is shown on the display, and post it here.
 
Might be worth checking the UPS. If it is faulty, it could be the cause. Since it's so regular, having a look at various displays and LED-indicators next time might help a lot.
The "reboot" in the logs does not mean that it was a reboot; it could be a power loss and a restart later. Do the logs show much time between the last log line and the first one of the next start (which you unfortunately did not show)? Is the system set to start automatically when power comes back on? Did you find the system powered off and did you (or someone else) power it on manually?
 
Yes, I will look at it and also at the UPS display. The big advantage is that the server and Proxmox start up properly, and all the VMs start as well. However, the VMs are not shut down at all before the restart, so a power failure makes the most sense.
 
It depends on the answers to the questions that you have not answered yet:
Do the logs show much time between the last log line and the first one of the next start (which you unfortunately did not show)?
Is the system set to start automatically when power comes back on?
Did the system start automatically again, or was there some time between the unexpected power off and a manual restart?
 
Yes, the system starts up by itself, even after a blackout. The time between restart and startup is sometimes 3, sometimes 7, and sometimes up to 15 minutes.
 
Just to be clear: the system restarted by itself three times? And is the motherboard BIOS set to automatically start after power comes back? If not, then maybe it's not a power failure but a CPU reset issue (like an internal power/PSU issue or memory corruption).
 
Yes, it rebooted 3 times, but over the course of 7 weeks. As I wrote in the introduction, almost exactly every 2 weeks. The motherboard and CPU settings should be correct; they follow the manufacturer's recommendations, and the server itself is 3 months old. So I will try to find out whether there were any power outages and check the condition of the batteries.
 
