Random Reboots - what to try next?

koyaan134

New Member
Nov 20, 2023
Hoping someone could point me in the right direction with some issues I've been having with Proxmox (I'm a beginner).

I recently moved my server and updated Proxmox to the latest kernel. The next day, I noticed that Proxmox was rebooting at random intervals (anywhere from 10 minutes to 1 hour apart).

At the same time, the thread co-processor I had attached via USB was no longer being recognized, and I noticed an error about USB power. I unplugged the USB device, but the reboots persisted.

I also tried:

  • Running memtest
  • Trying new PSU / different outlet on power strip
  • Downgrading to use an older kernel
  • Updating BIOS
  • Checking system resources / temperatures
Everything checked out as normal, and none of these steps stopped the reboots.

I tried digging through syslog, but I'm not really sure what to look for. What I can say is that I'm not seeing any critical errors before the reboot occurs. I do see this during startup:

Code:
ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [CAP1] at bit offset/length 64/32 exceeds size of target Buffer (64 bits)
ACPI Error: Aborting method \_SB._OSC due to previous error (AE_AML_BUFFER_LIMIT)

But I'm not sure if that has any bearing on this problem. Any idea where I should start? Happy to do another syslog dump if needed.
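
For anyone else unsure what to look for in syslog, here is a rough Python sketch that filters a saved log for lines that often accompany kernel or hardware trouble. The keyword list and the function name `suspicious_lines` are my own assumptions, not an official Proxmox checklist. A clean result does not rule out a hardware fault, since a sudden reset may leave nothing in the log at all.

```python
import re

# Hypothetical keyword list chosen for illustration; extend or trim as needed.
KEYWORDS = (
    "mce", "machine check", "panic", "oops", "watchdog",
    "thermal", "over-current", "acpi error", "acpi bios error",
)

# Word boundaries keep short keywords such as "mce" from matching inside other words.
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)

def suspicious_lines(log_text):
    """Return the lines of log_text that contain any of the keywords."""
    return [line for line in log_text.splitlines() if PATTERN.search(line)]
```

Run against a saved dump, e.g. `suspicious_lines(open("syslog.txt").read())` (the file name is just a placeholder), it would pick up the two ACPI lines quoted above and skip the routine cron and systemd entries.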
 
I'm also experiencing random reboots, but not as often as you. I have 3 identical servers: so far one has been up for 28 days, and the other two last between 5 days and 2 weeks before they reboot (-- Reboot --). I'm following along with your thread because it's driving us crazy trying to figure it out. I've done all the tests you've done, even swapped RAM between the servers, and I'm still no closer to finding out what is triggering it.
 

How about sharing the last log entries before the new boot log starts?
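
If the journal has already been saved to a text file, something like the following Python sketch could pull out the last few lines before each boot separator. The function name `tails_before_reboot` is made up for illustration, and the separator pattern is an assumption: it accepts both `-- Reboot --` and a `-- Boot <id> --` style line.

```python
import re
from collections import deque

# Assumed separator formats: "-- Reboot --" or "-- Boot <id> --".
BOOT_MARKER = re.compile(r"^-- ?(Reboot|Boot\b.*?) ?--$")

def tails_before_reboot(log_text, n=20):
    """Return one list per boot separator with the n lines that preceded it."""
    tails = []
    recent = deque(maxlen=n)  # keeps only the most recent n lines
    for line in log_text.splitlines():
        if BOOT_MARKER.match(line.strip()):
            tails.append(list(recent))
            recent.clear()  # start fresh for the next boot
        else:
            recent.append(line)
    return tails
```

On the node itself, `journalctl -b -1 -n 50` should show roughly the same thing for the previous boot, provided the journal is persistent.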
 

Nothing really stands out before the --Reboot--

Code:
Sep 18 00:00:46 pmx01 systemd[1]: Finished logrotate.service - Rotate log files.
Sep 18 00:17:01 pmx01 CRON[837996]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Sep 18 00:17:01 pmx01 CRON[837997]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Sep 18 00:17:01 pmx01 CRON[837996]: pam_unix(cron:session): session closed for user root
Sep 18 00:43:10 pmx01 pmxcfs[1188]: [dcdb] notice: data verification successful
Sep 18 01:17:01 pmx01 CRON[858759]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Sep 18 01:17:01 pmx01 CRON[858760]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Sep 18 01:17:01 pmx01 CRON[858759]: pam_unix(cron:session): session closed for user root
Sep 18 01:26:46 pmx01 systemd[1]: Starting pve-daily-update.service - Daily PVE download activities...
Sep 18 01:26:48 pmx01 pveupdate[862120]: <root@pam> starting task UPID:pmx01:000D27AD:00DFBC7D:66E99FB8:aptupdate::root@pam:
Sep 18 01:26:49 pmx01 pveupdate[862125]: update new package list: /var/lib/pve-manager/pkgupdates
Sep 18 01:26:50 pmx01 pveupdate[862120]: <root@pam> end task UPID:pmx01:000D27AD:00DFBC7D:66E99FB8:aptupdate::root@pam: OK
Sep 18 01:26:50 pmx01 pveupdate[862120]: Custom certificate does not expire soon, skipping ACME renewal.
Sep 18 01:26:50 pmx01 systemd[1]: pve-daily-update.service: Deactivated successfully.
Sep 18 01:26:50 pmx01 systemd[1]: Finished pve-daily-update.service - Daily PVE download activities.
Sep 18 01:26:50 pmx01 systemd[1]: pve-daily-update.service: Consumed 2.074s CPU time.
Sep 18 01:28:57 pmx01 pmxcfs[1188]: [status] notice: received log
Sep 18 01:28:59 pmx01 pmxcfs[1188]: [status] notice: received log
Sep 18 01:43:10 pmx01 pmxcfs[1188]: [dcdb] notice: data verification successful
Sep 18 02:00:04 pmx01 pvescheduler[873958]: <root@pam> starting task UPID:pmx01:000D55E7:00E2C813:66E9A784:vzdump:1200:root@pam:
Sep 18 02:00:04 pmx01 pvescheduler[873959]: INFO: starting new backup job: vzdump 1200 --quiet 1 --notes-template '{{guestname}}' --fleecing 0 --mailnotification failure --node pmx01 --mode snapshot --prune-backups 'keep-daily=7,keep-monthly=3,keep-weekly=4' --storage PBS-BACKUPS --mailto xxx@xxx
Sep 18 02:00:04 pmx01 pvescheduler[873959]: INFO: Starting Backup of VM 1200 (lxc)
Sep 18 02:00:04 pmx01 dmeventd[626]: No longer monitoring thin pool fast--vm-fast--vm-tpool.
Sep 18 02:00:04 pmx01 dmeventd[626]: Monitoring thin pool fast--vm-fast--vm-tpool.
Sep 18 02:00:05 pmx01 kernel: EXT4-fs (dm-17): mounted filesystem 43d50acd-a6b1-4b33-ae48-ddaeafff1782 ro without journal. Quota mode: none.
Sep 18 02:00:12 pmx01 kernel: EXT4-fs (dm-17): unmounting filesystem 43d50acd-a6b1-4b33-ae48-ddaeafff1782.
Sep 18 02:00:13 pmx01 pvescheduler[873959]: INFO: Finished Backup of VM 1200 (00:00:09)
Sep 18 02:00:13 pmx01 pvescheduler[873959]: INFO: Backup job finished successfully
Sep 18 02:17:01 pmx01 CRON[880038]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Sep 18 02:17:01 pmx01 CRON[880039]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Sep 18 02:17:01 pmx01 CRON[880038]: pam_unix(cron:session): session closed for user root
Sep 18 02:43:10 pmx01 pmxcfs[1188]: [dcdb] notice: data verification successful
Sep 18 02:55:27 pmx01 pmxcfs[1188]: [status] notice: received log
Sep 18 02:55:29 pmx01 pmxcfs[1188]: [status] notice: received log
Sep 18 03:00:00 pmx01 pmxcfs[1188]: [status] notice: received log
Sep 18 03:10:01 pmx01 CRON[898363]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Sep 18 03:10:01 pmx01 CRON[898364]: (root) CMD (test -e /run/systemd/system || SERVICE_MODE=1 /sbin/e2scrub_all -A -r)
Sep 18 03:10:01 pmx01 CRON[898363]: pam_unix(cron:session): session closed for user root
Sep 18 03:17:01 pmx01 CRON[900778]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Sep 18 03:17:01 pmx01 CRON[900779]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Sep 18 03:17:01 pmx01 CRON[900778]: pam_unix(cron:session): session closed for user root
Sep 18 03:34:29 pmx01 kernel: vmbr0: left promiscuous mode
Sep 18 03:34:29 pmx01 wol_hack.sh[1086]: Captured magic packet for address: "00:00:00:00:03:70"
Sep 18 03:34:29 pmx01 wol_hack.sh[1086]: Looking for existing VM: 0 found
Sep 18 03:34:29 pmx01 wol_hack.sh[1086]: Looking for existing LXC: 0 found
Sep 18 03:34:34 pmx01 kernel: vmbr0: entered promiscuous mode
Sep 18 03:43:10 pmx01 pmxcfs[1188]: [dcdb] notice: data verification successful
Sep 18 04:00:03 pmx01 pmxcfs[1188]: [status] notice: received log
Sep 18 04:17:01 pmx01 CRON[921520]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Sep 18 04:17:01 pmx01 CRON[921521]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Sep 18 04:17:01 pmx01 CRON[921520]: pam_unix(cron:session): session closed for user root
Sep 18 04:43:10 pmx01 pmxcfs[1188]: [dcdb] notice: data verification successful
Sep 18 04:48:46 pmx01 systemd[1]: Starting apt-daily.service - Daily apt download activities...
Sep 18 04:48:46 pmx01 systemd[1]: apt-daily.service: Deactivated successfully.
Sep 18 04:48:46 pmx01 systemd[1]: Finished apt-daily.service - Daily apt download activities.
Sep 18 05:17:01 pmx01 CRON[942292]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Sep 18 05:17:01 pmx01 CRON[942293]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Sep 18 05:17:01 pmx01 CRON[942292]: pam_unix(cron:session): session closed for user root
Sep 18 05:43:10 pmx01 pmxcfs[1188]: [dcdb] notice: data verification successful
Sep 18 06:17:01 pmx01 CRON[963005]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Sep 18 06:17:01 pmx01 CRON[963006]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Sep 18 06:17:01 pmx01 CRON[963005]: pam_unix(cron:session): session closed for user root
Sep 18 06:25:01 pmx01 CRON[965769]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Sep 18 06:25:01 pmx01 CRON[965770]: (root) CMD (test -x /usr/sbin/anacron || { cd / && run-parts --report /etc/cron.daily; })
Sep 18 06:25:01 pmx01 CRON[965769]: pam_unix(cron:session): session closed for user root
Sep 18 06:43:10 pmx01 pmxcfs[1188]: [dcdb] notice: data verification successful
Sep 18 06:51:46 pmx01 systemd[1]: Starting apt-daily-upgrade.service - Daily apt upgrade and clean activities...
Sep 18 06:51:46 pmx01 systemd[1]: apt-daily-upgrade.service: Deactivated successfully.
Sep 18 06:51:46 pmx01 systemd[1]: Finished apt-daily-upgrade.service - Daily apt upgrade and clean activities.
Sep 18 07:17:01 pmx01 CRON[983794]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Sep 18 07:17:01 pmx01 CRON[983795]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Sep 18 07:17:01 pmx01 CRON[983794]: pam_unix(cron:session): session closed for user root
Sep 18 07:43:10 pmx01 pmxcfs[1188]: [dcdb] notice: data verification successful
Sep 18 08:17:01 pmx01 CRON[1004520]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Sep 18 08:17:01 pmx01 CRON[1004521]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Sep 18 08:17:01 pmx01 CRON[1004520]: pam_unix(cron:session): session closed for user root
Sep 18 08:43:10 pmx01 pmxcfs[1188]: [dcdb] notice: data verification successful
Sep 18 09:00:47 pmx01 systemd[1]: Starting systemd-tmpfiles-clean.service - Cleanup of Temporary Directories...
Sep 18 09:00:47 pmx01 systemd[1]: systemd-tmpfiles-clean.service: Deactivated successfully.
Sep 18 09:00:47 pmx01 systemd[1]: Finished systemd-tmpfiles-clean.service - Cleanup of Temporary Directories.
Sep 18 09:00:47 pmx01 systemd[1]: run-credentials-systemd\x2dtmpfiles\x2dclean.service.mount: Deactivated successfully.
Sep 18 09:17:01 pmx01 CRON[1026320]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Sep 18 09:17:01 pmx01 CRON[1026321]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Sep 18 09:17:01 pmx01 CRON[1026320]: pam_unix(cron:session): session closed for user root
Sep 18 09:43:10 pmx01 pmxcfs[1188]: [dcdb] notice: data verification successful
Sep 18 09:43:46 pmx01 systemd[1]: Starting man-db.service - Daily man-db regeneration...
Sep 18 09:43:46 pmx01 systemd[1]: man-db.service: Deactivated successfully.
Sep 18 09:43:46 pmx01 systemd[1]: Finished man-db.service - Daily man-db regeneration.
Sep 18 10:07:46 pmx01 systemd[1]: Starting apt-daily.service - Daily apt download activities...
Sep 18 10:07:46 pmx01 systemd[1]: apt-daily.service: Deactivated successfully.
Sep 18 10:07:46 pmx01 systemd[1]: Finished apt-daily.service - Daily apt download activities.
-- Reboot --
Sep 18 10:09:45 pmx01 kernel: Linux version 6.8.12-1-pve (build@proxmox) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC PMX 6.8.12-1 (2024-08-05T16:17Z) ()
Sep 18 10:09:45 pmx01 kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.8.12-1-pve root=/dev/mapper/pve-root ro quiet iommu=pt pci=assign-busses apicmaintimer idle=poll reboot=cold,hard
Sep 18 10:09:45 pmx01 kernel: KERNEL supported cpus:
Sep 18 10:09:45 pmx01 kernel:   Intel GenuineIntel
Sep 18 10:09:45 pmx01 kernel:   AMD AuthenticAMD
Sep 18 10:09:45 pmx01 kernel:   Hygon HygonGenuine
Sep 18 10:09:45 pmx01 kernel:   Centaur CentaurHauls
Sep 18 10:09:45 pmx01 kernel:   zhaoxin   Shanghai
Sep 18 10:09:45 pmx01 kernel: BIOS-provided physical RAM map:
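
To put a number on the gap around the reboot, here is a small Python sketch that measures the time between the last entry before the separator and the first entry after it, and checks whether the preceding lines contain anything resembling a shutdown sequence. The helper names are hypothetical, the year has to be supplied because the log lines do not carry one, the month abbreviation is parsed assuming an English locale, and the "stopping"/"shutdown" substring check is only a crude heuristic.

```python
from datetime import datetime

def parse_stamp(line, year):
    """Parse the 'Sep 18 10:07:46' prefix of a syslog-style line."""
    # The first 15 characters hold month, day and time; the year is an assumption.
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")

def reboot_gap(last_line_before, first_line_after, year=2024):
    """Seconds between the last entry before a reboot and the first one after."""
    delta = parse_stamp(first_line_after, year) - parse_stamp(last_line_before, year)
    return delta.total_seconds()

def looks_like_clean_shutdown(lines):
    """Heuristic: True if any line mentions stopping units or a shutdown."""
    return any("stopping " in l.lower() or "shutdown" in l.lower() for l in lines)

before = "Sep 18 10:07:46 pmx01 systemd[1]: Finished apt-daily.service - Daily apt download activities."
after = "Sep 18 10:09:45 pmx01 kernel: Linux version 6.8.12-1-pve"
print(reboot_gap(before, after))            # 119.0
print(looks_like_clean_shutdown([before]))  # False
```

For the excerpt above this gives a gap of about two minutes with no shutdown messages in between, which would be consistent with an abrupt reset rather than an orderly shutdown, though that is only a guess from this one excerpt.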
 

May I ask where this log is from? I am a bit surprised by the "-- Reboot --" line; I think it has not been used by systemd-journald for a while (it now appears as "-- Boot $ID" in the log delimiter). Are these recently updated nodes? If so, how were they updated?
 
