Why did my Proxmox server reboot?

KirikParty

New Member
Apr 8, 2024
I have a single node running 3 VMs and 2 CTs. It's been mostly stable so far. However, this morning the server rebooted itself.
PVE version: 8.2.2
Kernel Version: 6.8.4-3

I have pasted the syslog from around the time it rebooted. Why did it reboot?

Code:
May 09 04:17:01 pve-prod2 CRON[1118133]: pam_unix(cron:session): session closed for user root
May 09 04:27:48 pve-prod2 smartd[1509]: Device: /dev/sdg [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 70 to 71
May 09 05:17:01 pve-prod2 CRON[1130434]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
May 09 05:17:01 pve-prod2 CRON[1130435]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
May 09 05:17:01 pve-prod2 CRON[1130434]: pam_unix(cron:session): session closed for user root
May 09 06:17:01 pve-prod2 CRON[1142598]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
May 09 06:17:01 pve-prod2 CRON[1142599]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
May 09 06:17:01 pve-prod2 CRON[1142598]: pam_unix(cron:session): session closed for user root
May 09 06:20:10 pve-prod2 systemd[1]: Starting apt-daily-upgrade.service - Daily apt upgrade and clean activities...
May 09 06:20:10 pve-prod2 systemd[1]: apt-daily-upgrade.service: Deactivated successfully.
May 09 06:20:10 pve-prod2 systemd[1]: Finished apt-daily-upgrade.service - Daily apt upgrade and clean activities.
May 09 06:25:01 pve-prod2 CRON[1144192]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
May 09 06:25:01 pve-prod2 CRON[1144193]: (root) CMD (test -x /usr/sbin/anacron || { cd / && run-parts --report /etc/cron.daily; })
May 09 06:25:01 pve-prod2 CRON[1144192]: pam_unix(cron:session): session closed for user root
May 09 06:27:49 pve-prod2 smartd[1509]: Device: /dev/sdg [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 71 to 67
May 09 06:57:48 pve-prod2 smartd[1509]: Device: /dev/sdg [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 67 to 70
May 09 07:17:01 pve-prod2 CRON[1154644]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
May 09 07:17:01 pve-prod2 CRON[1154645]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
May 09 07:17:01 pve-prod2 CRON[1154644]: pam_unix(cron:session): session closed for user root
May 09 07:57:48 pve-prod2 smartd[1509]: Device: /dev/sdg [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 70 to 69
May 09 08:17:01 pve-prod2 CRON[1166734]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
May 09 08:17:01 pve-prod2 CRON[1166735]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
May 09 08:17:01 pve-prod2 CRON[1166734]: pam_unix(cron:session): session closed for user root
May 09 08:27:49 pve-prod2 smartd[1509]: Device: /dev/sdg [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 69 to 67
May 09 08:57:48 pve-prod2 systemd[1]: Starting apt-daily.service - Daily apt download activities...
May 09 08:57:48 pve-prod2 systemd[1]: apt-daily.service: Deactivated successfully.
May 09 08:57:48 pve-prod2 systemd[1]: Finished apt-daily.service - Daily apt download activities.
May 09 09:17:01 pve-prod2 CRON[1178892]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
May 09 09:17:01 pve-prod2 CRON[1178893]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
May 09 09:17:01 pve-prod2 CRON[1178892]: pam_unix(cron:session): session closed for user root
May 09 09:57:48 pve-prod2 smartd[1509]: Device: /dev/sdg [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 67 to 69
-- Reboot --
May 09 10:06:03 pve-prod2 kernel: Linux version 6.8.4-3-pve (build@proxmox) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC PMX 6.8.4-3 (2024-05-02T11:55Z) ()
May 09 10:06:03 pve-prod2 kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.8.4-3-pve root=/dev/mapper/pve-root ro quiet amd_iommu=on iommu=pt pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1
May 09 10:06:03 pve-prod2 kernel: KERNEL supported cpus:
May 09 10:06:03 pve-prod2 kernel:   Intel GenuineIntel
May 09 10:06:03 pve-prod2 kernel:   AMD AuthenticAMD
May 09 10:06:03 pve-prod2 kernel:   Hygon HygonGenuine
May 09 10:06:03 pve-prod2 kernel:   Centaur CentaurHauls
May 09 10:06:03 pve-prod2 kernel:   zhaoxin   Shanghai 
May 09 10:06:03 pve-prod2 kernel: BIOS-provided physical RAM map:
May 09 10:06:03 pve-prod2 kernel: BIOS-e820: [mem 0x0000000000000000-0x000000000009ffff] usable
May 09 10:06:03 pve-prod2 kernel: BIOS-e820: [mem 0x00000000000a0000-0x00000000000fffff] reserved
May 09 10:06:03 pve-prod2 kernel: BIOS-e820: [mem 0x0000000000100000-0x0000000009bfefff] usable
May 09 10:06:03 pve-prod2 kernel: BIOS-e820: [mem 0x0000000009bff000-0x0000000009ffffff] reserved
May 09 10:06:03 pve-prod2 kernel: BIOS-e820: [mem 0x000000000a000000-0x000000000a1fffff] usable
 
There is no clue in the log, which is not uncommon when the power fails or when the disk that holds the log (temporarily) disconnects.
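One way to confirm that the log really ends abruptly is to look at what precedes the "-- Reboot --" marker. Below is a minimal Python sketch, not an official tool. It assumes the journal was saved as text in the format pasted above; the function names are made up for illustration, the shutdown hint strings are assumptions about typical systemd messages, and the year defaults to 2024 because syslog-style timestamps omit it.

```python
from datetime import datetime

REBOOT_MARKER = "-- Reboot --"
# Assumption: substrings that typically show up during an orderly shutdown.
CLEAN_SHUTDOWN_HINTS = (
    "Shutting down",
    "Journal stopped",
    "System is rebooting",
    "System is powering down",
)


def parse_stamp(line, year=2024):
    """Parse the leading 'Mon DD HH:MM:SS' stamp; the year is an assumption."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def find_reboots(lines, context=20):
    """Report, for each reboot marker, the log gap and any shutdown hints."""
    reports = []
    for i, line in enumerate(lines):
        if line.strip() != REBOOT_MARKER or i == 0 or i + 1 >= len(lines):
            continue
        before = lines[max(0, i - context):i]
        clean = any(hint in entry for entry in before for hint in CLEAN_SHUTDOWN_HINTS)
        try:
            gap = int((parse_stamp(lines[i + 1]) - parse_stamp(lines[i - 1])).total_seconds())
        except ValueError:
            gap = None  # neighbouring line did not start with a timestamp
        reports.append({
            "last_entry": lines[i - 1][:15],
            "first_entry": lines[i + 1][:15],
            "gap_seconds": gap,
            "clean_shutdown_logged": clean,
        })
    return reports


# Three lines taken from the log pasted above.
sample = [
    "May 09 09:57:48 pve-prod2 smartd[1509]: Device: /dev/sdg [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 67 to 69",
    "-- Reboot --",
    "May 09 10:06:03 pve-prod2 kernel: BIOS-provided physical RAM map:",
]
print(find_reboots(sample))
```

For the excerpt above this finds no shutdown messages and a gap of 495 seconds between the last entry and the first kernel line, which fits an abrupt reset or power loss rather than an orderly reboot. The gap only bounds the time of the reset, since the machine may simply have had nothing to log in between.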
Often it's a (temporary) hardware issue or rare memory corruption or overheating or otherwise stressing the hardware a little too much.
Since you use PCI passthrough, it could also be caused by a VM with passthrough, so check their logs.
You could start replacing hardware parts to see if it makes a difference, but this usually only works when the issue is (easily) reproducible.
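The overheating possibility can be checked roughly against the same log, since smartd records changes to attribute 194 of /dev/sdg. The sketch below is only an illustration with a made-up function name. It assumes the smartd line format shown in the pasted log, and note that smartd tracks the normalized attribute value by default, which may or may not equal degrees Celsius depending on the drive vendor; "smartctl -A /dev/sdg" shows the raw value.

```python
import re

# Matches smartd lines like the ones in the pasted log (format assumed from that excerpt).
ATTR_194_RE = re.compile(
    r"Device: (\S+) .*194 Temperature_Celsius changed from (\d+) to (\d+)"
)


def peak_attribute_194(lines):
    """Return the highest logged attribute-194 value per device."""
    peaks = {}
    for line in lines:
        match = ATTR_194_RE.search(line)
        if match:
            device = match.group(1)
            old, new = int(match.group(2)), int(match.group(3))
            peaks[device] = max(peaks.get(device, 0), old, new)
    return peaks


# Two lines taken from the log pasted above.
smartd_sample = [
    "May 09 04:27:48 pve-prod2 smartd[1509]: Device: /dev/sdg [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 70 to 71",
    "May 09 06:27:49 pve-prod2 smartd[1509]: Device: /dev/sdg [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 71 to 67",
]
print(peak_attribute_194(smartd_sample))  # {'/dev/sdg': 71}
```

In the pasted log the value for /dev/sdg stays between 67 and 71. Whether that is within spec depends on the drive model and on whether the value is actually a temperature, so compare it with the raw value and the drive's datasheet.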
 
Thank you. It's not the first two reasons for sure.

I have been trying to lower the power consumption and have played around with powertop a bit on this machine. Would that cause a reboot without any logs as well?
 
Undervolting the CPU could easily cause a reset because it then has less margin. As with overclocking, you might have made the system unstable under some conditions (which might have occurred and triggered the reboot). Maybe stress-test your setup with a Linux Live USB that specializes in this (which I assume exists) to make sure the hardware is stable?

Please be aware that Proxmox is intended for enterprise hardware, not small form-factor, energy-efficient systems, but there are some threads from people having success in this area.
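As a rough illustration of what such a stability test does, the Python sketch below keeps several threads busy hashing the same buffer and flags any digest that differs from a reference. It is a crude stand-in under stated assumptions, not a substitute for a dedicated stress-testing tool: the function names, buffer size, and duration are arbitrary choices, and it relies on hashlib releasing the interpreter lock for large inputs so that the threads can run on several cores.

```python
import hashlib
import os
import threading
import time


def _hash_loop(buf, expected, seconds, errors):
    """Hash the buffer repeatedly until the deadline; record any wrong digest."""
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        if hashlib.sha256(buf).hexdigest() != expected:
            errors.append(threading.current_thread().name)
            return


def consistency_check(seconds=60.0, threads=None, buf_size=8 * 1024 * 1024):
    """Load the CPU with identical work and return the names of threads that
    computed a wrong result (an empty list means no mismatch was seen)."""
    threads = threads or os.cpu_count() or 1
    buf = os.urandom(buf_size)
    expected = hashlib.sha256(buf).hexdigest()
    errors = []
    pool = [
        threading.Thread(target=_hash_loop, args=(buf, expected, seconds, errors))
        for _ in range(threads)
    ]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return errors


if __name__ == "__main__":
    bad = consistency_check(seconds=10.0)
    print("mismatches:", bad if bad else "none")
```

An unstable system may reset or hang before the function returns, and a clean run of a short test like this does not prove stability; it only fails to show a problem.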
 
Thank you.
I was testing transcoding a video and it crashed as well, so the system may well be unstable under stress. I will look into the Live USB.
I haven't really undervolted the CPU/iGPU. I use an AMD system and have just used the Curve Optimizer in PBO.
However, I had Deep Sleep enabled in the BIOS. I have disabled it and will see if it stays stable.

Transcoding a video worked after this. Will need to stress test the CPU and GPU and see if it works.
I will report back.
 
"I haven't really undervolted the CPU/iGPU. I use an AMD system and have just used the Curve Optimizer in PBO."
Curve Optimizer gives more performance at the same power, which is similar to undervolting or overclocking w.r.t. stability.
"However, I had Deep Sleep enabled in the BIOS. I have disabled it and will see if it stays stable."
I would not expect that to make any difference, but maybe Deep Sleep is not what I think it is.
"I will report back."
None of this is Proxmox or even Linux specific, but other people might be interested.
 