[SOLVED] PBS stops responding on update

May 16, 2020
Running the latest PBS as a stand-alone server, I've run into an issue where the machine becomes unavailable.
Both console and network stop responding. Sometimes the console is sluggish for a while and then stops entirely.

My impression is that this happens after proxmox-backup-daily-update has run.
To test that, I disabled all of the timers below and re-enabled them one at a time, 24 hours apart.

proxmox-backup-daily-update.timer
apt-daily-upgrade.timer
man-db.timer
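
The disable/re-enable procedure described above could be scripted roughly as follows. This is only a sketch: the `set_timers` helper and the `APPLY` switch are made up for illustration, and by default the function just prints the `systemctl` commands instead of running them.

```shell
#!/bin/sh
# Timers named in the post above.
TIMERS="proxmox-backup-daily-update.timer apt-daily-upgrade.timer man-db.timer"

# Hypothetical helper: prints the systemctl command for each timer.
# Set APPLY=1 (and run as root) to execute the commands instead of printing them.
set_timers() {
    action="$1"   # "disable" or "enable"
    for t in $TIMERS; do
        if [ "${APPLY:-0}" = "1" ]; then
            systemctl "$action" --now "$t"
        else
            echo "systemctl $action --now $t"
        fi
    done
}

set_timers disable
```

Re-enabling one timer per day then amounts to running `systemctl enable --now <timer>` for a single unit and waiting to see whether the hang returns.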

Only proxmox-backup-daily-update.timer remains disabled now, and the system no longer appears to hang.
Since I'm only trying out PBS, I have not yet enabled a license for it.

Could the hang be related, or did something else happen?

JL
 
Hi!
Could you post the syslog from the time of the issue (using journalctl)? Does the machine completely reboot/panic, or does it just lag for a few minutes?
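
As a rough sketch of what such a journalctl query might look like: the time window below is a placeholder, not a value from this thread, and `journal_cmd` is a made-up helper that only prints the command. Reading the previous boot with `-b -1` assumes the journal is stored persistently (for example under /var/log/journal).

```shell
#!/bin/sh
# Hypothetical helper: print a journalctl command for a time window.
# A bare HH:MM is interpreted by journalctl as a time on the current day.
journal_cmd() {
    printf 'journalctl --since "%s" --until "%s" --no-pager\n' "$1" "$2"
}

# Placeholder window; adjust it to bracket the time of the hang.
journal_cmd "02:45" "03:20"

# After a hard reset, the last lines of the previous boot are often the
# most useful part: -b -1 selects the previous boot, -n 200 the last 200 lines.
echo 'journalctl -b -1 -n 200 --no-pager'
```

The printed commands can be run as root on the affected host and their output attached to the thread.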
 

Hey,

Although it seemed related to the update timers, the system was found hung again this morning. It failed at 3:10, more or less the same time as before.

The machine simply stops responding to anything. The keyboard Num Lock LED still responds, but the console shows nothing, network traffic stops, and Ctrl-Alt-Del has no effect.

I've checked and triple-checked: there is nothing relevant in /var/log in syslog, daemon.log, kern.log, or error. Disabling the update-related timers did result in longer uptime.

The only current anomaly is e2scrub in /etc/cron.d, which does not seem to make sense on a ZFS filesystem. I have now disabled the cron job for e2scrub as well as its service and timer. I've also disabled the fstrim service and timer.
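
Whether e2scrub can do anything on a given host can be checked from the root filesystem type. A minimal sketch, assuming `findmnt` from util-linux is available; `is_ext_fs` is a made-up helper name. e2scrub itself only handles ext2/3/4 filesystems on LVM logical volumes.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given filesystem type is ext2/3/4.
is_ext_fs() {
    case "$1" in
        ext2|ext3|ext4) return 0 ;;
        *) return 1 ;;
    esac
}

# Ask findmnt for the type of /; fall back to "unknown" if it is unavailable.
root_fstype="$(findmnt -n -o FSTYPE / 2>/dev/null || echo unknown)"

if is_ext_fs "$root_fstype"; then
    echo "root is $root_fstype: e2scrub could apply (it also needs LVM)"
else
    echo "root is $root_fstype: e2scrub does not apply to the root filesystem"
fi
```

This only looks at the root filesystem; other mounted filesystems would need the same check.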

For context, the system stayed up for multiple days when a firewall rule goof left it unreachable. Otherwise it tends to fail about every 24 hours.

Though this seemingly invalidates the observation below, I'm trying out leaving e2scrub and fstrim disabled.

I now notice a repeating pattern: a backup job runs and finishes at 2:50 AM, and the system "reliably" fails at 3:10 AM. The only anomalous activity at that time seems to be e2scrub-related. There is no activity in any log after 3:10 AM.

I've now added

systemctl enable zfs-trim-weekly@zpool.timer --now
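
To confirm that the timer is actually scheduled and that TRIM runs, something along these lines might help. It is a sketch only: it assumes the instance name after `@` is the pool name (here `zpool`), and `trim_timer_unit` is a made-up helper. The commands are printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: build the timer unit name for a pool.
# The unit template name is taken from the command in the post.
trim_timer_unit() {
    printf 'zfs-trim-weekly@%s.timer\n' "$1"
}

POOL="zpool"   # assumption: the instance name after '@' is the pool name
UNIT="$(trim_timer_unit "$POOL")"

echo "systemctl list-timers --all $UNIT"   # shows the next scheduled run
echo "zpool status -t $POOL"               # -t adds per-device TRIM status
```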

Update: eventually I found that the last syslog entry was at 3:14 AM, and it was for the proxmox-backup /api showing a 200 for a ticket URL.

br,

JL
 