Random Proxmox crashes

frenk970

Member
Jan 20, 2020
Hello,
I am getting random reboots of Proxmox and I don't understand what causes them. This is the log from just before the crash: https://pastebin.com/k8mHhtTa. Can you help me understand it?

Proxmox always reboots with this kind of log lines written right before the reboot.
 
I don't see anything special. It just runs the hourly crons before the reboot. You could check if there is anything in "/etc/cron.hourly" that might cause troubles.
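To see what the hourly cron run would actually execute, a minimal Python sketch along these lines could help. It assumes the default Debian run-parts behaviour (only executable files whose names consist of ASCII letters, digits, underscores and hyphens are run); the function name hourly_jobs is made up for this example.

```python
import os
import re

# By default, run-parts only runs files whose names consist entirely of
# ASCII letters, digits, underscores and hyphens (so "notes.txt" is skipped).
NAME_RE = re.compile(r"^[A-Za-z0-9_-]+$")


def hourly_jobs(directory="/etc/cron.hourly"):
    """Return the paths run-parts would most likely execute, in sorted order.

    Hypothetical helper: it approximates the default run-parts filter and
    does not execute anything.
    """
    if not os.path.isdir(directory):
        return []
    jobs = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if (
            NAME_RE.match(name)
            and os.path.isfile(path)
            and os.access(path, os.X_OK)
        ):
            jobs.append(path)
    return jobs


if __name__ == "__main__":
    for job in hourly_jobs():
        print(job)
```

An empty result would mean the hourly cron line in the log is just the scheduler firing with nothing to run, which makes it an unlikely cause of the reboot.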
 
But the reboot always occurs in conjunction with those log strings.
 
I have a VM with GPU passthrough. Could it be that something requests the GPU while it is busy and that crashes the system?
However, I don't see problems of this type in the log, nor any sign of Proxmox running low on RAM (18 GB of 32 GB is in use).
 
Did you check with memtest86+ whether your RAM is healthy?
Did you check whether the PSU is healthy? For example, does running a GPU + CPU benchmark, which draws a lot of power, cause reboots?
Most of the time when a system is unstable, it's one of those two.
 
No, I hadn't thought about it. I'll run the recommended tests right away.
 
You could also run a long SMART self-test (smartctl -t long /dev/yourDisk) to check whether your disks are fine. Updating the BIOS/UEFI might help too if there was a known firmware problem that has been fixed in the meantime.
In case you are using ZFS, you could start a scrub (zpool scrub rpool) to see whether some OS files got corrupted.
You could also try the 5.15 kernel instead of 5.13 if you are using very modern hardware. That sometimes fixes problems caused by new hardware.
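The checks suggested above could be collected in a small Python sketch like the one below. The smartctl and zpool scrub commands are the ones quoted in the post; reading the results with smartctl -l selftest and zpool status is an addition here, and the helper names, the /dev/sda device and the example kernel string are only illustrative assumptions.

```python
import platform
import shutil
import subprocess


def start_commands(disk, pool):
    """Commands that kick off the long SMART self-test and the ZFS scrub."""
    return [
        ["smartctl", "-t", "long", disk],
        ["zpool", "scrub", pool],
    ]


def result_commands(disk, pool):
    """Commands to read the results later; both tests run in the background."""
    return [
        ["smartctl", "-l", "selftest", disk],
        ["zpool", "status", pool],
    ]


def kernel_series(release=None):
    """Return 'major.minor' of a kernel release string, e.g. '5.13'."""
    release = release or platform.release()
    major, minor = release.split(".")[:2]
    return f"{major}.{minor}"


def run(cmd):
    """Run one command if its binary is installed; return None otherwise."""
    if shutil.which(cmd[0]) is None:
        print(f"skipping {' '.join(cmd)}: {cmd[0]} not found")
        return None
    return subprocess.run(cmd, capture_output=True, text=True, check=False)


# Usage (as root on the host; device and pool names are assumptions):
#   for cmd in start_commands("/dev/sda", "rpool"):
#       run(cmd)
#   ...and hours later...
#   for cmd in result_commands("/dev/sda", "rpool"):
#       print(run(cmd).stdout)
#   print(kernel_series())  # tells you whether the host is still on 5.13
```

Starting the tests and reading the results are kept separate because a long self-test and a scrub can take hours to finish.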
 
I am running the disk tests now. I have checked for BIOS/UEFI updates and there are none; the firmware is already at the latest version.
 
zpool scrub rpool reported no errors.
[Attachment: Schermata 2022-03-21 alle 07.45.40.png]
 
The memtest86+ test passed, so I can rule out the RAM. Hopefully it's not the power supply.
The disks passed all their tests.
The server looks perfect.
 
[Attachments: screenshots of the hardware configuration, Schermata 2022-03-21 alle 15.07.44.png through Schermata 2022-03-21 alle 15.10.03.png]
This is the configuration. It is my old gaming PC and I only use it for home tests. Could it be that the power supply can't keep up?
 
I have this exact same issue. The log looks exactly the same as well.
My machine crashes every night at the same time; the last entries are the cron entries mentioned above.
I will try clearing my hourly cron jobs and see whether that brings any improvement.
 
I believe I have the same issue. The log looks very similar. I never get any error messages, but the syslog before the crash/reboot looks the same every time.


Code:
Mar  4 03:03:46 pve pvedaemon[468000]: <root@pam> successful auth for user 'root@pam'
Mar  4 03:05:54 pve pveproxy[481032]: worker exit
Mar  4 03:05:54 pve pveproxy[2988]: worker 481032 finished
Mar  4 03:05:54 pve pveproxy[2988]: starting 1 worker(s)
Mar  4 03:05:54 pve pveproxy[2988]: worker 489205 started
Mar  4 03:06:07 pve pveproxy[2988]: worker 480114 finished
Mar  4 03:06:07 pve pveproxy[2988]: starting 1 worker(s)
Mar  4 03:06:07 pve pveproxy[2988]: worker 489239 started
Mar  4 03:06:10 pve pveproxy[489238]: got inotify poll request in wrong process - disabling inotify
Mar  4 03:06:10 pve pveproxy[489238]: worker exit
Mar  4 03:10:01 pve CRON[489947]: (root) CMD (test -e /run/systemd/system || SERVICE_MODE=1 /sbin/e2scrub_all -A -r)
Mar  4 03:12:17 pve pvedaemon[462470]: <root@pam> successful auth for user 'root@pam'
Mar  4 03:14:34 pve pveproxy[482304]: worker exit
Mar  4 03:14:34 pve pveproxy[2988]: worker 482304 finished
Mar  4 03:14:34 pve pveproxy[2988]: starting 1 worker(s)
Mar  4 03:14:34 pve pveproxy[2988]: worker 490743 started
Mar  4 03:17:01 pve CRON[491174]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Mar  4 03:18:30 pve postfix/qmgr[2939]: 425A91F2E3: from=<root@pve.>, size=1223, nrcpt=1 (queue active)
Mar  4 03:18:30 pve postfix/qmgr[2939]: 440A3B0081: from=<root@pve.>, size=1223, nrcpt=1 (queue active)
Mar  4 03:18:31 pve postfix/smtp[491440]: connect to gmail-smtp-in.l.google.com[2404:6800:4003:c05::1a]:25: Network is unreachable
Mar  4 03:18:46 pve pvedaemon[468454]: <root@pam> successful auth for user 'root@pam'
Mar  4 03:19:01 pve postfix/smtp[491440]: connect to gmail-smtp-in.l.google.com[172.253.118.27]:25: Connection timed out
Mar  4 03:19:01 pve postfix/smtp[491439]: connect to gmail-smtp-in.l.google.com[172.253.118.27]:25: Connection timed out
Mar  4 03:19:01 pve postfix/smtp[491439]: connect to gmail-smtp-in.l.google.com[2404:6800:4003:c05::1a]:25: Network is unreachable
Mar  4 03:19:01 pve postfix/smtp[491439]: connect to alt1.gmail-smtp-in.l.google.com[2607:f8b0:400e:c00::1b]:25: Network is unreachable
Mar  4 03:19:31 pve postfix/smtp[491440]: connect to alt1.gmail-smtp-in.l.google.com[173.194.202.26]:25: Connection timed out
Mar  4 03:19:31 pve postfix/smtp[491440]: connect to alt1.gmail-smtp-in.l.google.com[2607:f8b0:400e:c00::1b]:25: Network is unreachable
Mar  4 03:19:31 pve postfix/smtp[491440]: connect to alt2.gmail-smtp-in.l.google.com[2607:f8b0:4023:c0b::1a]:25: Network is unreachable
Mar  4 03:19:31 pve postfix/smtp[491439]: connect to alt1.gmail-smtp-in.l.google.com[173.194.202.26]:25: Connection timed out
Mar  4 03:20:01 pve postfix/smtp[491439]: connect to alt2.gmail-smtp-in.l.google.com[142.250.141.27]:25: Connection timed out
<Crash/Reboot>
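To compare several crashes, it may help to summarise which programs logged in the last few minutes before each reboot. Below is a minimal Python sketch, assuming the classic syslog line format shown in the log above. The function names and the five-minute window are arbitrary choices for illustration, and since syslog lines carry no year, one is assumed.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Classic syslog prefix: "Mar  4 03:17:01 pve CRON[491174]: message"
LINE_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<prog>[^\s\[:]+)(?:\[\d+\])?: (?P<msg>.*)$"
)


def parse_line(line, year=2022):
    """Return (timestamp, program, message), or None for non-syslog lines.

    The year is an assumption: the log format does not include one.
    """
    m = LINE_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
    return ts, m["prog"], m["msg"]


def programs_before_crash(lines, window_minutes=5):
    """Count log lines per program in the final window before the log ends."""
    parsed = [p for p in map(parse_line, lines) if p is not None]
    if not parsed:
        return Counter()
    cutoff = parsed[-1][0] - timedelta(minutes=window_minutes)
    return Counter(prog for ts, prog, _ in parsed if ts >= cutoff)


# A few lines copied from the log above.
SAMPLE = [
    "Mar  4 03:10:01 pve CRON[489947]: (root) CMD (test -e /run/systemd/system || SERVICE_MODE=1 /sbin/e2scrub_all -A -r)",
    "Mar  4 03:17:01 pve CRON[491174]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)",
    "Mar  4 03:18:30 pve postfix/qmgr[2939]: 425A91F2E3: from=<root@pve.>, size=1223, nrcpt=1 (queue active)",
    "Mar  4 03:18:46 pve pvedaemon[468454]: <root@pam> successful auth for user 'root@pam'",
    "Mar  4 03:20:01 pve postfix/smtp[491439]: connect to alt2.gmail-smtp-in.l.google.com[142.250.141.27]:25: Connection timed out",
    "<Crash/Reboot>",
]

if __name__ == "__main__":
    for prog, count in programs_before_crash(SAMPLE).most_common():
        print(prog, count)
```

If the same routine programs (cron, postfix, pveproxy) show up before every crash with no errors, they are more likely just the last things logged than the cause, which points back at hardware, power or kernel issues.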
 
