Proxmox Installation rebooting every couple minutes

DidiSkywalker

New Member
Oct 5, 2024
Hi everyone,
I recently bought a secondhand HP ProDesk 600 G4 DM with an i5-8500T, 16GB RAM and a 256GB M.2 SSD as a beginner homelab. Installed is Proxmox VE 8.2.2.
I am very new to this and just want to play around a bit, maybe host a couple game servers. Ideally in the future have a media server running. That's why I decided to use Proxmox to let me try out things and just delete a VM when I break it and try again.
However, from the beginning my installation has been rebooting at intervals of anywhere from a couple of minutes to maybe an hour. So far all I've set up are 2 LXCs: one running Homarr and one running Crafty Controller. But even without these running it keeps rebooting.
Screenshot 2024-10-11 103436.png
For one stretch of about 10 days things were fine and I thought it had somehow fixed itself, but out of nowhere the reboots came back.
From one forum post here I learned to look at journalctl -p err which yields:
1728638934966.png
But from what I've read SGX is supposed to be disabled? Also the long jump here is from me shutting the PC down and trying again today.
And attached is the output of journalctl -b -2 which is one of those random boots where I did nothing.

Now I'm wondering whether I configured something wrong or whether the secondhand hardware is just busted.
Any ideas what I could try would be much appreciated, thanks!
 

Attachments

  • proxmox-lastboot.txt
    119.2 KB · Views: 2
SGX is deprecated, so better leave it off.

from the beginning my installation has been rebooting every couple minutes to maybe an hour between.
Is it really rebooting, or is it resetting? The last messages logged right before the reboot/reset would be interesting!

  • Memory and M.2 SSD are ok? -> Memtest?
  • What type of storage do you have in use?
  • You can also filter with "journalctl -p 3" or "journalctl -p 4"
  • What is the output of dmesg -l 3 and dmesg -l 4?
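
For reference, the priority filters above could be run roughly like this. It is only a sketch: the function names are made up for illustration, and it assumes a journal that keeps previous boots plus the util-linux dmesg, whose documentation lists level names (err, warn) for --level.

```shell
# Map a syslog priority number to the level name that dmesg documents.
dmesg_level() {
    case "$1" in
        0) echo emerg ;;
        1) echo alert ;;
        2) echo crit ;;
        3) echo err ;;
        4) echo warn ;;
        5) echo notice ;;
        6) echo info ;;
        7) echo debug ;;
        *) echo "unknown priority: $1" >&2; return 1 ;;
    esac
}

# Journal entries of priority 3 (err) and more severe, previous boot only.
journal_errors_prev_boot() {
    journalctl -b -1 -p 3 --no-pager
}

# Kernel ring buffer messages at the given priority, current boot.
kernel_messages_at() {
    dmesg --level="$(dmesg_level "$1")"
}
```

Calling journal_errors_prev_boot right after an unexpected restart shows what the boot that died managed to log; kernel_messages_at 3 and kernel_messages_at 4 correspond to the two dmesg calls asked for above.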
 
I read somewhere to grep "Shutting down" in journalctl and that only found a couple lines which matched with manual shutdowns, so I think it's crashing.
  • I just ran Memtest which passed and smartctl says the M.2 is healthy.
  • I kept storage the way it was set up by default. "local" of type Directory and "local-lvm" of type LVM-Thin.
  • p3 lists the same SGX errors, p4 similar to dmesg output:
  • 1728659116553.png
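
The "Shutting down" check described above can be repeated per boot. A minimal sketch, assuming (as that grep does) that a clean shutdown leaves a line containing "Shutting down" in that boot's journal; classify_boot and check_recent_boots are hypothetical helper names.

```shell
# Read one boot's journal text on stdin.
# Print "clean" if a shutdown marker is present, "abrupt" otherwise.
classify_boot() {
    if grep -q 'Shutting down'; then
        echo clean
    else
        echo abrupt
    fi
}

# Walk the last N previous boots (default 5) and classify each one.
check_recent_boots() {
    n="${1:-5}"
    i=1
    while [ "$i" -le "$n" ]; do
        printf 'boot -%s: ' "$i"
        journalctl -b "-$i" --no-pager 2>/dev/null | classify_boot
        i=$((i + 1))
    done
}
```

A run of "abrupt" results would support the crash/reset theory; a boot that ends abruptly has simply stopped logging, which is also why its last lines look unremarkable.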
 
Update: yesterday at around 4pm I disabled Wake-on-LAN in the BIOS and I thought that might've been it, because it kept running while I was looking after that. But today I woke up to see it only ran until around 9pm and then went back to 5-30min resets throughout the night.
Here are the last couple lines of one of these boots:
Code:
root@torbenprox:~# journalctl -b -3 -e
Oct 12 07:03:02 torbenprox systemd[1]: Started pvescheduler.service - Proxmox VE scheduler.
Oct 12 07:03:02 torbenprox systemd[1]: Reached target multi-user.target - Multi-User System.
Oct 12 07:03:02 torbenprox systemd[1]: Reached target graphical.target - Graphical Interface.
Oct 12 07:03:02 torbenprox systemd[1]: Starting systemd-update-utmp-runlevel.service - Record Runlevel Change in UTMP...
Oct 12 07:03:02 torbenprox systemd[1]: systemd-update-utmp-runlevel.service: Deactivated successfully.
Oct 12 07:03:02 torbenprox systemd[1]: Finished systemd-update-utmp-runlevel.service - Record Runlevel Change in UTMP.
Oct 12 07:03:02 torbenprox systemd[1]: Startup finished in 4.747s (firmware) + 7.442s (loader) + 2.161s (kernel) + 54.788s (userspace) = 1min 9.140s.
Oct 12 07:03:05 torbenprox kernel: cgroup: Setting release_agent not allowed
Oct 12 07:03:06 torbenprox kernel: overlayfs: fs on '/var/lib/docker/overlay2/check-overlayfs-support196795192/lower2' does not support file handles, falling back to xino=off.
Oct 12 07:03:06 torbenprox kernel: overlayfs: fs on '/var/lib/docker/overlay2/metacopy-check920010019/l1' does not support file handles, falling back to xino=off.
Oct 12 07:03:06 torbenprox kernel: evm: overlay not supported
Oct 12 07:03:06 torbenprox kernel: overlayfs: fs on '/var/lib/docker/overlay2/l/FP6N44FKLJNOXFOLZ2Y4RA2DNC' does not support file handles, falling back to xino=off.
Oct 12 07:03:06 torbenprox kernel: Initializing XFRM netlink socket
Oct 12 07:03:07 torbenprox kernel: overlayfs: fs on '/var/lib/docker/overlay2/l/FP6N44FKLJNOXFOLZ2Y4RA2DNC' does not support file handles, falling back to xino=off.
Oct 12 07:03:07 torbenprox kernel: br-b65bbbef9de3: port 1(vetha4e31bc) entered blocking state
Oct 12 07:03:07 torbenprox kernel: br-b65bbbef9de3: port 1(vetha4e31bc) entered disabled state
Oct 12 07:03:07 torbenprox kernel: vetha4e31bc: entered allmulticast mode
Oct 12 07:03:07 torbenprox kernel: vetha4e31bc: entered promiscuous mode
Oct 12 07:03:07 torbenprox kernel: eth0: renamed from vethbf402b9
Oct 12 07:03:07 torbenprox kernel: br-b65bbbef9de3: port 1(vetha4e31bc) entered blocking state
Oct 12 07:03:07 torbenprox kernel: br-b65bbbef9de3: port 1(vetha4e31bc) entered forwarding state
Oct 12 07:03:26 torbenprox chronyd[738]: Selected source 185.248.188.98 (2.debian.pool.ntp.org)
Oct 12 07:17:01 torbenprox CRON[6028]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Oct 12 07:17:01 torbenprox CRON[6029]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Oct 12 07:17:01 torbenprox CRON[6028]: pam_unix(cron:session): session closed for user root
Oct 12 07:17:15 torbenprox systemd[1]: Starting systemd-tmpfiles-clean.service - Cleanup of Temporary Directories...
Oct 12 07:17:15 torbenprox systemd[1]: systemd-tmpfiles-clean.service: Deactivated successfully.
Oct 12 07:17:15 torbenprox systemd[1]: Finished systemd-tmpfiles-clean.service - Cleanup of Temporary Directories.
Oct 12 07:17:15 torbenprox systemd[1]: run-credentials-systemd\x2dtmpfiles\x2dclean.service.mount: Deactivated successfully.
Oct 12 07:20:15 torbenprox systemd[1]: Starting apt-daily.service - Daily apt download activities...
Oct 12 07:20:16 torbenprox systemd[1]: apt-daily.service: Deactivated successfully.
Oct 12 07:20:16 torbenprox systemd[1]: Finished apt-daily.service - Daily apt download activities.
 
Unfortunately there really isn't anything distinctive there.

However, from the beginning my installation has been rebooting every couple minutes to maybe an hour between.

But this is really strange. I have to wonder: have you tried, for example, an older kernel, or even just plain Debian, to see whether you get the same behaviour? It's hard to hunt for a hardware issue and a software issue at the same time.
 
NB This is a poor reason to single out Proxmox VE as your primary option. You can do what you described even on plain Ubuntu or Fedora.
Fair enough.

I'll back up what I have so far and try installing plain Debian tomorrow probably.
 
Don't get me wrong, it might not be the kernel, but if you install e.g. regular Debian, you get, I believe, something like 6.1. If you search other threads here about the one currently shipped with PVE, it's been hit and miss. It is possible to pin an older one, but all in all it is simpler to run e.g. Debian. If you want a more usable system, then Ubuntu, which is Debian-based. PVE basically uses Ubuntu's kernel (because of ZFS and LXC support). Fedora is different, but probably nicer for a new user.

What is important is that in any of the cases above, you can have your cake and eat it too: you will be testing a different OS/kernel and you can still have VMs. Either get libvirt (look for "virt-manager" for the distro you choose; it will give you QEMU VMs just as PVE does), or go for Incus (which gives you containers as well; those are the same LXCs that PVE provides, a bit of work to migrate, but possible). If you go with libvirt and want a GUI (do not forget you will have an entire desktop distro, e.g. already with GNOME), for a web-style one look for Cockpit (for libvirt) or Canonical's Web UI for LXD (the project Incus was forked from).

These are all just different wrappers for exactly the same technologies PVE ships. You will lack some features like clustering, and it will be less point and click, but more learning. All whilst testing your hardware.
 
Just installed a fresh Debian 12 which came with kernel 6.1 as you said.
And it reset after 5 minutes.
But only after I actually logged into Debian. I spent a good 30-45 minutes in setup and another 30 minutes running the HP storage and memory checks - which passed - and it didn't reset then.
Very weird. I guess I'll have to start asking elsewhere for help.
 
Just installed a fresh Debian 12 which came with kernel 6.1 as you said.
And it reset after 5 minutes.

That's a bummer indeed.

But only after I actually logged into Debian. I spent a good 30-45 minutes in setup and another 30 minutes running the HP storage and memory checks - which passed - and it didn't reset then.

The last thing I would try is running Debian or Ubuntu LIVE, i.e. off USB stick. If that holds well you know it's the storage (also would explain why there's nothing flushed into the logs).
 
BTW, it could even be just the PSU. Not sure if you have another one to test with, but HPs used a fairly standard round 19 V DC plug (do NOT take my word for it though, check ;))...
 
Ubuntu LIVE ran for 9 hours through the night now. Thanks for the tip to try that.
So I understand this correctly: replacing the M.2 should do it then? I know there's never a guarantee, but is this the likely implication?
 
If you also gave it some load, e.g. playing a YouTube video all night long, then I would guess it's the SSD. :) Since you are not running off it, you can also start moving data around on it and watch journalctl/dmesg. Have you ever checked its SMART values? If it's NVMe, you can check nvme smart-log:
https://manpages.ubuntu.com/manpages/noble/man1/nvme.1.html

BTW, you can totally apt install things on live Ubuntu. I don't remember whether there was some glitch with sources.list (some packages might need add-apt-repository universe), but it's possible, or you can even just download a package and install it with dpkg.
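
One way to put load on the SSD from the live session without writing to it is a plain sequential read. This is a rough sketch, not a proper burn-in: read_load is a hypothetical helper, and the device name in the comments is the one that appears in this thread and may differ on other machines.

```shell
# Read a file or block device end to end and discard the data.
# Read-only: nothing on the disk is modified.
read_load() {
    dd if="$1" of=/dev/null bs=1M 2>/dev/null
}

# Hypothetical session on the live system (needs root):
#   dmesg --follow --level=err,warn &   # watch for kernel errors while the disk is busy
#   read_load /dev/nvme0n1              # device name assumed from this thread
#   nvme smart-log /dev/nvme0n1         # from the nvme-cli package
```

If the machine resets or the kernel log fills with I/O errors only while the read is running, that points towards the drive or its slot; if it stays up, the result is inconclusive rather than a clean bill of health.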
 
It ran a cron job every minute to update an HTML file that I can check its uptime with, but that's not really a load. I can put some load on it throughout today to make sure.
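
A once-a-minute heartbeat like that can also be used to pin down when the resets happened. A minimal sketch with made-up names (heartbeat, find_gaps, the log path); it assumes the log file lives somewhere that survives a reset, which is not the case for the RAM-backed filesystem of a live session.

```shell
# Append the current Unix time to a log file. Meant to be run from cron
# once a minute, for example with a crontab line such as:
#   * * * * * /usr/local/bin/heartbeat.sh    (hypothetical script path)
heartbeat() {
    date +%s >> "${1:-/var/tmp/heartbeat.log}"
}

# Read Unix timestamps (one per line) on stdin and print every gap
# longer than the given number of seconds (default 90).
find_gaps() {
    awk -v max="${1:-90}" '
        NR > 1 && $1 - prev > max {
            printf "gap of %d s after %d\n", $1 - prev, prev
        }
        { prev = $1 }
    '
}
```

Running find_gaps < /var/tmp/heartbeat.log then lists each stretch where the machine was not ticking, which can be lined up against the boot list in the journal.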
I checked smartctl before which said it's healthy and the builtin HP storage check didn't find anything either. Here's nvme smart-log output:
Code:
Smart Log for NVME device:nvme0n1 namespace-id:ffffffff
critical_warning                        : 0
temperature                             : 29 C (302 Kelvin)
available_spare                         : 100%
available_spare_threshold               : 5%
percentage_used                         : 9%
endurance group critical warning summary: 0
data_units_read                         : 13.821.356
data_units_written                      : 9.094.856
host_read_commands                      : 194.041.406
host_write_commands                     : 213.089.969
controller_busy_time                    : 1.472
power_cycles                            : 1.649
power_on_hours                          : 1.018
unsafe_shutdowns                        : 1.007
media_errors                            : 0
num_err_log_entries                     : 0
Warning Temperature Time                : 0
Critical Composite Temperature Time     : 0
Temperature Sensor 1           : 29 C (302 Kelvin)
Temperature Sensor 2           : 28 C (301 Kelvin)
Thermal Management T1 Trans Count       : 0
Thermal Management T2 Trans Count       : 0
Thermal Management T1 Total Time        : 0
Thermal Management T2 Total Time        : 0
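
One thing that stands out in this output is unsafe_shutdowns compared with power_cycles. A small sketch for putting the two in relation; unsafe_shutdown_share is a hypothetical helper that reads smart-log text on stdin and strips thousands separators like the dots above.

```shell
# Read `nvme smart-log` text on stdin and report how many power cycles
# were counted as unsafe shutdowns.
unsafe_shutdown_share() {
    awk -F: '
        /^power_cycles/     { gsub(/[^0-9]/, "", $2); cycles = $2 + 0 }
        /^unsafe_shutdowns/ { gsub(/[^0-9]/, "", $2); unsafe = $2 + 0 }
        END {
            if (cycles > 0)
                printf "%d of %d power cycles ended without a clean shutdown (%.0f%%)\n",
                       unsafe, cycles, 100 * unsafe / cycles
        }
    '
}

# Hypothetical usage (needs root, device name assumed from this thread):
#   nvme smart-log /dev/nvme0n1 | unsafe_shutdown_share
```

With the values above this comes to 1007 of 1649, roughly 61%. That fits a machine that keeps losing power or resetting, but the counter only records that the drive was not told to shut down; it does not say which component caused it, and part of the count may predate the current owner.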

I've been meaning to upgrade storage eventually anyway, so I'll do that then see if it helps.
Thank you again for your quick replies and your help so far!
 
