Found out this morning that pve had rebooted overnight

chevybeef

When looking at the logs I found this:

Code:
Nov 13 00:00:29 a520mk systemd[1]: Starting dpkg-db-backup.service - Daily dpkg database backup service...
Nov 13 00:00:29 a520mk systemd[1]: Starting logrotate.service - Rotate log files...
Nov 13 00:00:29 a520mk systemd[1]: Reloading pveproxy.service - PVE API Proxy Server...
Nov 13 00:00:29 a520mk systemd[1]: dpkg-db-backup.service: Deactivated successfully.
Nov 13 00:00:29 a520mk systemd[1]: Finished dpkg-db-backup.service - Daily dpkg database backup service.
Nov 13 00:00:30 a520mk pveproxy[193253]: send HUP to 1433
Nov 13 00:00:30 a520mk pveproxy[1433]: received signal HUP
Nov 13 00:00:30 a520mk pveproxy[1433]: server closing
Nov 13 00:00:30 a520mk pveproxy[1433]: server shutdown (restart)
Nov 13 00:00:30 a520mk systemd[1]: Reloaded pveproxy.service - PVE API Proxy Server.
Nov 13 00:00:30 a520mk systemd[1]: Reloading spiceproxy.service - PVE SPICE Proxy Server...
Nov 13 00:00:30 a520mk spiceproxy[193287]: send HUP to 1439
Nov 13 00:00:30 a520mk spiceproxy[1439]: received signal HUP
Nov 13 00:00:30 a520mk spiceproxy[1439]: server closing
Nov 13 00:00:30 a520mk spiceproxy[1439]: server shutdown (restart)
Nov 13 00:00:30 a520mk systemd[1]: Reloaded spiceproxy.service - PVE SPICE Proxy Server.
Nov 13 00:00:30 a520mk pvefw-logger[1013]: received terminate request (signal)
Nov 13 00:00:30 a520mk pvefw-logger[1013]: stopping pvefw logger
Nov 13 00:00:30 a520mk systemd[1]: Stopping pvefw-logger.service - Proxmox VE firewall logger...
Nov 13 00:00:30 a520mk spiceproxy[1439]: restarting server
Nov 13 00:00:30 a520mk spiceproxy[1439]: starting 1 worker(s)
Nov 13 00:00:30 a520mk spiceproxy[1439]: worker 193312 started
Nov 13 00:00:30 a520mk systemd[1]: pvefw-logger.service: Deactivated successfully.
Nov 13 00:00:30 a520mk systemd[1]: Stopped pvefw-logger.service - Proxmox VE firewall logger.
Nov 13 00:00:30 a520mk systemd[1]: pvefw-logger.service: Consumed 4.201s CPU time.
Nov 13 00:00:30 a520mk systemd[1]: Starting pvefw-logger.service - Proxmox VE firewall logger...
Nov 13 00:00:30 a520mk pvefw-logger[193316]: starting pvefw logger
Nov 13 00:00:30 a520mk systemd[1]: Started pvefw-logger.service - Proxmox VE firewall logger.
Nov 13 00:00:30 a520mk systemd[1]: logrotate.service: Deactivated successfully.
Nov 13 00:00:30 a520mk systemd[1]: Finished logrotate.service - Rotate log files.
Nov 13 00:00:30 a520mk pveproxy[1433]: restarting server
Nov 13 00:00:30 a520mk pveproxy[1433]: starting 3 worker(s)
Nov 13 00:00:30 a520mk pveproxy[1433]: worker 193321 started
Nov 13 00:00:30 a520mk pveproxy[1433]: worker 193322 started
Nov 13 00:00:30 a520mk pveproxy[1433]: worker 193323 started
Nov 13 00:00:35 a520mk spiceproxy[1440]: worker exit
Nov 13 00:00:35 a520mk spiceproxy[1439]: worker 1440 finished
Nov 13 00:00:35 a520mk pveproxy[1436]: worker exit
Nov 13 00:00:35 a520mk pveproxy[1434]: worker exit
Nov 13 00:00:35 a520mk pveproxy[1435]: worker exit
Nov 13 00:00:35 a520mk pveproxy[1433]: worker 1436 finished
Nov 13 00:00:35 a520mk pveproxy[1433]: worker 1435 finished
Nov 13 00:00:35 a520mk pveproxy[1433]: worker 1434 finished
Nov 13 00:17:01 a520mk CRON[197052]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Nov 13 00:17:01 a520mk CRON[197053]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Nov 13 00:17:01 a520mk CRON[197052]: pam_unix(cron:session): session closed for user root
Nov 13 00:20:27 a520mk smartd[1032]: Device: /dev/sdc [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 71 to 70
Nov 13 00:24:01 a520mk CRON[198595]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Nov 13 00:24:01 a520mk CRON[198596]: (root) CMD (if [ $(date +%w) -eq 0 ] && [ -x /usr/lib/zfs-linux/scrub ]; then /usr/lib/zfs-linux/scrub; fi)
Nov 13 00:24:01 a520mk CRON[198595]: pam_unix(cron:session): session closed for user root
Nov 13 00:50:27 a520mk smartd[1032]: Device: /dev/sdc [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 70 to 71
Nov 13 01:17:01 a520mk CRON[210395]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Nov 13 01:17:01 a520mk CRON[210396]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Nov 13 01:17:01 a520mk CRON[210395]: pam_unix(cron:session): session closed for user root
Nov 13 02:17:01 a520mk CRON[223741]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Nov 13 02:17:01 a520mk CRON[223742]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Nov 13 02:17:01 a520mk CRON[223741]: pam_unix(cron:session): session closed for user root
Nov 13 02:20:27 a520mk smartd[1032]: Device: /dev/sdc [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 71 to 70
Nov 13 02:50:27 a520mk smartd[1032]: Device: /dev/sdc [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 70 to 71
Nov 13 03:10:01 a520mk CRON[235533]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Nov 13 03:10:01 a520mk CRON[235534]: (root) CMD (test -e /run/systemd/system || SERVICE_MODE=1 /sbin/e2scrub_all -A -r)
Nov 13 03:10:01 a520mk CRON[235533]: pam_unix(cron:session): session closed for user root
Nov 13 03:17:01 a520mk CRON[237083]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Nov 13 03:17:01 a520mk CRON[237084]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Nov 13 03:17:01 a520mk CRON[237083]: pam_unix(cron:session): session closed for user root
-- Reboot --
Nov 13 04:15:24 a520mk kernel: Linux version 6.8.12-4-pve (build@proxmox) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC PMX 6.8.12-4 (2024-11-06T15:04Z) ()
Nov 13 04:15:24 a520mk kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.8.12-4-pve root=/dev/mapper/pve-root ro quiet
Nov 13 04:15:24 a520mk kernel: KERNEL supported cpus:
Nov 13 04:15:24 a520mk kernel:   Intel GenuineIntel
Nov 13 04:15:24 a520mk kernel:   AMD AuthenticAMD
...

The system had been stable for months, but yesterday I ran an update that also moved the kernel to Linux version 6.8.12-4-pve.

The system specs are as follows:
[Attached screenshot: SCR-20241113-jtze.png]

If it happens again, what could I do to narrow down the problem?
 
Hmm, I don't see anything peculiar in the logs.

What special activity (backups, etc.) was going on around the time of the reboot?

How many VMs/LXCs were running at the time?

Do you have any GPU passthrough going on? I remember reading once (a long time ago) about problems with 4600G passthrough.

If you encounter this again, you may have to pin a previous working kernel.
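
To narrow it down if it happens again, it may help to look at how the previous boot ended. The excerpt above stops at an ordinary cron entry and then jumps straight to `-- Reboot --` with no shutdown messages, which usually points to a hard reset, power loss, or a crash that never reached the disk rather than an orderly reboot. A minimal sketch, assuming a persistent journal and root access on the host; the function name `inspect_previous_boot` is only an illustrative wrapper, while the `journalctl` options are standard:

```shell
# Hypothetical helper: show how the boot before the current one ended.
inspect_previous_boot() {
    if ! command -v journalctl >/dev/null 2>&1; then
        echo "journalctl not found" >&2
        return 1
    fi
    # Boots known to the journal; offset -1 is the boot before the current one.
    journalctl --list-boots --no-pager
    # Final 50 lines of the previous boot. A clean reboot normally ends with
    # systemd stopping services; an abrupt end suggests a reset or power loss.
    journalctl -b -1 -n 50 --no-pager
    # Kernel messages of priority warning or worse from the previous boot
    # (hardware complaints such as I/O errors would show up here, if logged).
    journalctl -k -b -1 -p warning --no-pager
}
```

Running `inspect_previous_boot` as root soon after the next unexpected reboot keeps the relevant boot at offset -1. If the kernel output is empty and the log simply stops, hardware or power is worth checking alongside the kernel version.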
 

There weren't any backup jobs or special activity happening at that time.

4 VMs and 1 CT were running at the time, with load around 4%.

No GPU passthrough, but I do have an HBA passed through. Sometimes when rebooting TrueNAS it doesn't find the pool, and I have to reboot pve to get it working again.

Thanks, I have noted below which kernel was running before the update so that I can pin that one if it happens again.

Nov 10 11:20:44 a520mk kernel: Linux version 6.8.12-3-pve (build@proxmox) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC PMX 6.8.12-3 (2024-10-23T11:41Z) ()
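
For reference, pinning that kernel could look like the sketch below. It assumes the `kernel pin` subcommand of `proxmox-boot-tool` is available on this release and that the 6.8.12-3-pve kernel package is still installed; `pin_previous_kernel` is only an illustrative wrapper name, not part of Proxmox.

```shell
# Hypothetical helper: pin a known-good kernel as the default boot entry.
# Defaults to 6.8.12-3-pve, the version that was running before the update.
pin_previous_kernel() {
    version="${1:-6.8.12-3-pve}"
    if ! command -v proxmox-boot-tool >/dev/null 2>&1; then
        echo "proxmox-boot-tool not found; run this as root on the PVE host" >&2
        return 1
    fi
    # Show the kernels the tool knows about, so the version string can be checked.
    proxmox-boot-tool kernel list
    # Make the chosen version the default for this and later boots.
    proxmox-boot-tool kernel pin "$version"
}
```

`proxmox-boot-tool kernel unpin` removes the pin again once a fixed kernel is out. On hosts whose ESPs are synced by proxmox-boot-tool, a `proxmox-boot-tool refresh` may also be needed after pinning or unpinning.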