Automatic reboots since upgrade from VE7 to VE8

Jacco

New Member
Jul 4, 2023
On July 2nd I upgraded my Proxmox VE7 cluster to VE8.
Since that upgrade, one of the nodes has been rebooting on its own, sometimes multiple times a day.
The system is fully patched.
There are no errors in /var/log/messages, just informational kernel boot messages from when the system starts again.
Memtest86 does not show any errors.
The system is an HP EliteDesk G4 mini PC (65W) with an i3-8300 CPU and 2 x 16GB memory. It had no issues before the upgrade, and the other nodes in the cluster (same configuration, same BIOS, same BIOS setup) are not rebooting on their own.
Is anyone experiencing similar issues?

Code:
root@proxmox02:~# last | grep 'system boot'
reboot   system boot  6.2.16-3-pve     Mon Jul 10 20:20   still running
reboot   system boot  6.2.16-3-pve     Mon Jul 10 18:45 - 19:21  (00:36)
reboot   system boot  6.2.16-3-pve     Mon Jul 10 10:54 - 19:21  (08:26)
reboot   system boot  6.2.16-3-pve     Mon Jul 10 05:51 - 19:21  (13:30)
reboot   system boot  6.2.16-3-pve     Sun Jul  9 18:40 - 19:21 (1+00:40)
reboot   system boot  6.2.16-3-pve     Sun Jul  9 17:45 - 19:21 (1+01:35)
reboot   system boot  6.2.16-3-pve     Sun Jul  9 15:39 - 19:21 (1+03:42)
reboot   system boot  6.2.16-3-pve     Sun Jul  9 15:15 - 15:38  (00:22)
reboot   system boot  6.2.16-3-pve     Sun Jul  9 10:13 - 15:38  (05:25)
reboot   system boot  6.2.16-3-pve     Sun Jul  9 09:59 - 15:38  (05:39)
reboot   system boot  6.2.16-3-pve     Sun Jul  9 08:53 - 15:38  (06:45)
reboot   system boot  6.2.16-3-pve     Sun Jul  9 03:47 - 15:38  (11:50)
reboot   system boot  6.2.16-3-pve     Sun Jul  9 00:46 - 15:38  (14:52)
reboot   system boot  6.2.16-3-pve     Sun Jul  9 00:03 - 15:38  (15:35)
reboot   system boot  6.2.16-3-pve     Sat Jul  8 23:27 - 15:38  (16:10)
reboot   system boot  6.2.16-3-pve     Sat Jul  8 20:41 - 15:38  (18:57)
reboot   system boot  6.2.16-3-pve     Sat Jul  8 18:37 - 15:38  (21:01)
reboot   system boot  6.2.16-3-pve     Sat Jul  8 17:03 - 15:38  (22:35)
reboot   system boot  6.2.16-3-pve     Sat Jul  8 04:54 - 15:38 (1+10:43)
reboot   system boot  6.2.16-3-pve     Wed Jul  5 21:55 - 15:38 (3+17:42)
reboot   system boot  6.2.16-3-pve     Wed Jul  5 21:50 - 15:38 (3+17:48)
reboot   system boot  6.2.16-3-pve     Sun Jul  2 19:13 - 15:38 (6+20:25)
reboot   system boot  5.15.108-1-pve   Sun Jul  2 17:15 - 19:12  (01:57)
reboot   system boot  5.15.104-1-pve   Wed Apr 19 11:42 - 17:14 (74+05:31)
reboot   system boot  5.15.104-1-pve   Wed Apr 19 14:36 - 11:38  (-2:57)
reboot   system boot  5.15.104-1-pve   Wed Apr  5 09:48 - 11:17 (14+01:29)
reboot   system boot  5.15.83-1-pve    Sun Jan 29 12:07 - 09:47 (65+20:40)
reboot   system boot  5.13.19-6-pve    Tue Jun 28 11:57 - 12:05 (215+01:07)
reboot   system boot  5.13.19-6-pve    Tue May  3 17:59 - 11:49 (55+17:50)
 
Hi,
can you share the system journal (via journalctl or /var/log/syslog) from the time before such reboots happen?
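A minimal sketch of one way to pull those lines, assuming the journal is persistent so that earlier boots are still available. The function name `prev_boot_tail` and the `JOURNALCTL` override variable are hypothetical helpers for illustration, not part of any Proxmox tooling:

```shell
# Print the last lines the journal recorded for an earlier boot.
#   $1: boot offset (default -1 = the boot before the current one)
#   $2: number of lines to show (default 50)
# JOURNALCTL can be overridden for a dry run (assumption: illustrative only).
prev_boot_tail() {
    "${JOURNALCTL:-journalctl}" -b "${1:--1}" -n "${2:-50}" --no-pager
}
```

Running `journalctl --list-boots` first shows which offsets exist; `prev_boot_tail -2 100` would then print the final 100 lines of the boot before the previous one.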
 
No relevant messages, I guess.
The reboots seem to be completely random.
Two examples:

Code:
Jul 08 03:55:54 proxmox02 pmxcfs[938]: [dcdb] notice: data verification successful
Jul 08 04:07:45 proxmox02 pmxcfs[938]: [status] notice: received log
Jul 08 04:07:48 proxmox02 pmxcfs[938]: [status] notice: received log
Jul 08 04:17:01 proxmox02 CRON[665375]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Jul 08 04:17:01 proxmox02 CRON[665376]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Jul 08 04:17:01 proxmox02 CRON[665375]: pam_unix(cron:session): session closed for user root
-- Boot f019f5339c7644f28883052ab5a3559f --
Jul 08 04:54:51 proxmox02 kernel: Linux version 6.2.16-3-pve (tom@sbuild) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC PVE >
Jul 08 04:54:51 proxmox02 kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.2.16-3-pve root=/dev/mapper/pve-root ro quiet
Jul 08 04:54:51 proxmox02 kernel: KERNEL supported cpus:

Code:
Jul 09 00:00:02 proxmox02 rsyslogd[667]: [origin software="rsyslogd" swVersion="8.2302.0" x-pid="667" x-info="https://www.rsyslog.com"] rsyslogd was HUPed
Jul 09 00:00:02 proxmox02 systemd[1]: logrotate.service: Deactivated successfully.
Jul 09 00:00:02 proxmox02 systemd[1]: Finished logrotate.service - Rotate log files.
-- Boot f76df93cfdf1410bb920f4f2f60c52b1 --
Jul 09 00:03:25 proxmox02 kernel: Linux version 6.2.16-3-pve (tom@sbuild) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC PVE >
Jul 09 00:03:25 proxmox02 kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.2.16-3-pve root=/dev/mapper/pve-root ro quiet
 
Maybe you will have luck getting a log via netconsole: https://pve.proxmox.com/wiki/Kernel_Crash_Trace_Log
You could also try booting an older kernel to see whether the issue is related to the new kernel. Is there anything different about the hardware compared to the other nodes, or are they exactly the same?
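To get a rough picture of whether the reboots line up with a particular kernel, the `last` output can be tallied per kernel version. This is only a sketch: it assumes the `last` column layout shown earlier in the thread (kernel version in the fourth field), and `boots_per_kernel` is a hypothetical helper name:

```shell
# Count "system boot" records per kernel version, most frequent first.
# Reads `last` output on stdin; field 4 is assumed to hold the kernel version.
boots_per_kernel() {
    awk '$2 == "system" && $3 == "boot" { n[$4]++ }
         END { for (k in n) printf "%d %s\n", n[k], k }' | sort -rn
}
```

Usage would be `last | boots_per_kernel`. A count alone does not prove the kernel is at fault, since it ignores how long each kernel was in use.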
 
Are all these kernels compatible with VE8?

Code:
root@proxmox02:~# uname -a
Linux proxmox02 6.2.16-3-pve #1 SMP PREEMPT_DYNAMIC PVE 6.2.16-3 (2023-06-17T05:58Z) x86_64 GNU/Linux
root@proxmox02:~# apt-cache search pve-kernel | grep Latest
pve-kernel-6.1 - Latest Proxmox VE Kernel Image
pve-kernel-6.2 - Latest Proxmox VE Kernel Image
pve-kernel-5.13 - Latest Proxmox VE Kernel Image
pve-kernel-5.15 - Latest Proxmox VE Kernel Image

By the way: nothing was changed in the hardware (or setup) since VE7 was running fine.
 
Yes, but I'd only run them for verifying that the issue lies with the kernel and not for long-term production use, because you won't get any (security) updates for these kernels anymore. See also https://forum.proxmox.com/threads/proxmox-shutting-down-for-no-apparent-reason.130422/post-572078 which gives an easier way to get logs than setting up netconsole.
 
The issue is not that the server hangs, but that it resets itself without a reason (or at least without a reason I'm aware of). So accessing the logs via a second server is not relevant; I can access the problem server itself once it's up and running again.
I have now selected kernel '5.15.108-1-pve'; let's see if the behaviour changes.
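For anyone following along, one possible way to select a specific installed kernel is sketched below. It assumes a Proxmox VE release whose `proxmox-boot-tool` provides the `kernel pin` and `kernel unpin` subcommands; the wrapper names and the `BOOT_TOOL` override variable are hypothetical and only there to allow a dry run:

```shell
# Pin an installed kernel so it is used on subsequent boots.
#   $1: kernel version string, e.g. 5.15.108-1-pve
# BOOT_TOOL can be set to `echo` to print the command instead of running it.
pin_kernel() {
    "${BOOT_TOOL:-proxmox-boot-tool}" kernel pin "$1"
}

# Remove the pin again so the default (newest) kernel is booted.
unpin_kernel() {
    "${BOOT_TOOL:-proxmox-boot-tool}" kernel unpin
}
```

`BOOT_TOOL=echo pin_kernel 5.15.108-1-pve` shows what would be executed. As noted above, an older kernel is meant for narrowing down the problem, not for long-term use.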
 
Yes, but if you are connected via network you might see parts of the log you wouldn't see otherwise. The crash could mean those log entries never get synced to the disk.
 

I now have `journalctl -f` running via SSH from another system.
No strange messages so far, and the system is running fine with the 5.15.108-1 kernel.
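A rough sketch of such a setup, which also keeps a local copy so the final lines survive even if the node resets before syncing them to disk. The function name `stream_journal`, the `SSH` override variable, and the example file name are assumptions for illustration:

```shell
# Follow a remote node's journal over SSH and append it to a local file.
#   $1: SSH target, e.g. root@proxmox02
#   $2: local file that receives the copy
# SSH can be overridden for a dry run (assumption: illustrative only).
stream_journal() {
    "${SSH:-ssh}" "$1" journalctl -f -o short-precise | tee -a "$2"
}
```

Run from a second machine, for example `stream_journal root@proxmox02 proxmox02-journal.log`. When the node resets, the SSH session drops and the local file holds whatever was received up to that point.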
 
Okay, so the kernel does seem to be the culprit (or at least the real culprit only causes problems in combination with the new kernel). But if you want to try your luck getting more logs for the issue via SSH, you will of course need to run the problematic kernel and wait for the crash.
 
Two days ago I updated to kernel 6.2.16-4 and booted into it.
No issues so far.
I'm not sure what caused the resets.

EDIT: Running without problems now for 6 days.
 