Proxmox locks up after a few hours

sij4153

New Member
Jun 30, 2022
Machine specs:
Dual Xeon X5670
24 GB of ECC RAM
ASUS Z8NA-D6 motherboard



syslog:
Jul 1 00:37:02 Odin pveproxy[1381]: worker 98206 finished
Jul 1 00:37:02 Odin pveproxy[1381]: starting 1 worker(s)
Jul 1 00:37:02 Odin pveproxy[1381]: worker 107472 started
Jul 1 00:50:22 Odin pveproxy[98205]: worker exit
Jul 1 00:50:22 Odin pveproxy[1381]: worker 98205 finished
Jul 1 00:50:22 Odin pveproxy[1381]: starting 1 worker(s)
Jul 1 00:50:22 Odin pveproxy[1381]: worker 110211 started
Jul 1 00:51:30 Odin pveproxy[98204]: worker exit
Jul 1 00:51:30 Odin pveproxy[1381]: worker 98204 finished
Jul 1 00:51:30 Odin pveproxy[1381]: starting 1 worker(s)
Jul 1 00:51:30 Odin pveproxy[1381]: worker 110478 started
Jul 1 00:52:41 Odin pvedaemon[1375]: <root@pam> successful auth for user 'root@pam'
Jul 1 01:17:01 Odin CRON[114943]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Jul 1 02:17:01 Odin CRON[125806]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Jul 1 02:18:29 Odin smartd[1014]: Device: /dev/sda [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 64 to 63
Jul 1 02:18:29 Odin smartd[1014]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 36 to 37
Jul 1 02:34:55 Odin pvedaemon[1375]: <root@pam> successful auth for user 'root@pam'
Jul 1 02:48:29 Odin smartd[1014]: Device: /dev/sda [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 63 to 64
Jul 1 02:48:29 Odin smartd[1014]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 37 to 36
Jul 1 03:05:57 Odin systemd[1]: Starting Daily apt download activities...
Jul 1 03:05:58 Odin systemd[1]: apt-daily.service: Succeeded.
Jul 1 03:05:58 Odin systemd[1]: Finished Daily apt download activities.
Jul 1 03:08:24 Odin systemd[1]: session-1.scope: Succeeded.
Jul 1 03:08:34 Odin systemd[1]: Stopping User Manager for UID 0...
Jul 1 03:08:34 Odin systemd[2181]: Stopped target Main User Target.
Jul 1 03:08:34 Odin systemd[2181]: Stopped target Basic System.
Jul 1 03:08:34 Odin systemd[2181]: Stopped target Paths.
Jul 1 03:08:34 Odin systemd[2181]: Stopped target Sockets.
Jul 1 03:08:34 Odin systemd[2181]: Stopped target Timers.
Jul 1 03:08:34 Odin systemd[2181]: dirmngr.socket: Succeeded.
Jul 1 03:08:34 Odin systemd[2181]: Closed GnuPG network certificate management daemon.
Jul 1 03:08:34 Odin systemd[2181]: gpg-agent-browser.socket: Succeeded.
Jul 1 03:08:34 Odin systemd[2181]: Closed GnuPG cryptographic agent and passphrase cache (access for web browsers).
Jul 1 03:08:34 Odin systemd[2181]: gpg-agent-extra.socket: Succeeded.
Jul 1 03:08:34 Odin systemd[2181]: Closed GnuPG cryptographic agent and passphrase cache (restricted).
Jul 1 03:08:34 Odin systemd[2181]: gpg-agent-ssh.socket: Succeeded.
Jul 1 03:08:34 Odin systemd[2181]: Closed GnuPG cryptographic agent (ssh-agent emulation).
Jul 1 03:08:34 Odin systemd[2181]: gpg-agent.socket: Succeeded.
Jul 1 03:08:34 Odin systemd[2181]: Closed GnuPG cryptographic agent and passphrase cache.
Jul 1 03:08:34 Odin systemd[2181]: Removed slice User Application Slice.
Jul 1 03:08:34 Odin systemd[2181]: Reached target Shutdown.
Jul 1 03:08:34 Odin systemd[2181]: systemd-exit.service: Succeeded.
Jul 1 03:08:34 Odin systemd[2181]: Finished Exit the Session.
Jul 1 03:08:34 Odin systemd[2181]: Reached target Exit the Session.
Jul 1 03:08:34 Odin systemd[1]: user@0.service: Succeeded.
Jul 1 03:08:34 Odin systemd[1]: Stopped User Manager for UID 0.
Jul 1 03:08:34 Odin systemd[1]: Stopping User Runtime Directory /run/user/0...
Jul 1 03:08:34 Odin systemd[1]: run-user-0.mount: Succeeded.
Jul 1 03:08:34 Odin systemd[1]: user-runtime-dir@0.service: Succeeded.
Jul 1 03:08:34 Odin systemd[1]: Stopped User Runtime Directory /run/user/0.
Jul 1 03:08:34 Odin systemd[1]: Removed slice User Slice of UID 0.
Jul 1 03:10:01 Odin CRON[136618]: (root) CMD (test -e /run/systemd/system || SERVICE_MODE=1 /sbin/e2scrub_all -A -r)
Jul 1 03:17:01 Odin CRON[137857]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
 
I had a similar issue a few hours ago:

Jul 29 07:14:00 node2 systemd[1]: Started Proxmox VE replication runner.
Jul 29 07:15:00 node2 systemd[1]: Starting Proxmox VE replication runner...
Jul 29 07:15:00 node2 systemd[1]: Started Proxmox VE replication runner.
Jul 29 07:15:01 node2 CRON[382262]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Jul 29 07:16:00 node2 systemd[1]: Starting Proxmox VE replication runner...
Jul 29 07:16:00 node2 systemd[1]: Started Proxmox VE replication runner.
Jul 29 07:17:00 node2 systemd[1]: Starting Proxmox VE replication runner...
Jul 29 07:17:00 node2 systemd[1]: Started Proxmox VE replication runner.
Jul 29 07:17:01 node2 CRON[382676]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@$

Notice that the last CRON log line has the same minute and second (17:01) as the last line in the log above.
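For anyone comparing logs from several nodes, here is a rough Python sketch that finds the last readable line before a run of NUL bytes (shown as ^@ in the paste above). It assumes the NUL run marks the point where the node stopped writing, which is a common sign of an unclean stop but is not confirmed for this case. The function names and the 8-byte threshold are my own choices for illustration, not anything provided by Proxmox.

```python
import re
from typing import Optional

# Assumption: a run of at least 8 NUL bytes marks the freeze point.
# The threshold is arbitrary and only meant to skip stray single bytes.
NUL_RUN = re.compile(rb"\x00{8,}")


def last_line_before_nul_run(data: bytes) -> Optional[str]:
    """Return the last non-empty log line before the first NUL run.

    Returns None if the data contains no NUL run, or if nothing
    readable precedes it.
    """
    match = NUL_RUN.search(data)
    if match is None:
        return None
    before = data[:match.start()]
    lines = [line for line in before.splitlines() if line.strip()]
    if not lines:
        return None
    return lines[-1].decode("utf-8", errors="replace")


def scan_file(path: str) -> Optional[str]:
    """Read a log file as raw bytes and report the last line before a NUL run."""
    with open(path, "rb") as handle:
        return last_line_before_nul_run(handle.read())
```

Calling scan_file("/var/log/syslog") on each node (the path is an assumption, adjust it to wherever the log lives) gives one line per node, which makes it easy to check whether they all stopped at the same CRON entry.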

In my case there are 3 nodes that have been running for 5 years without problems, and this morning all 3 froze with the same last line in the log.
PVE v5.4; updates were applied a few months ago, so there have been no configuration changes in the last few days.

cron.hourly is empty, and when I manually run "cd / && run-parts --report /etc/cron.hourly", nothing is returned and no error is displayed.

Any ideas?
 
