PBS: every 3 days PBS become unstable

aabye

New Member
Nov 19, 2024
6
0
1
Hi guys.

I have problem with my primary PBS in production environment.

Approx. every third day server become unresponsive and crashes. In error log there is nothing useful, here is from yesterday:

Jan 09 23:17:37 backup1 proxmox-backup-proxy[1560]: Upload backup log to datastore 'BACKUP1', root namespace vm/169/2026-01-09T22:17:27Z/client.log.blob
Jan 09 23:25:11 backup1 proxmox-backup-proxy[1560]: Upload backup log to datastore 'BACKUP1', root namespace vm/170/2026-01-09T22:17:37Z/client.log.blob
Jan 09 23:25:25 backup1 proxmox-backup-proxy[1560]: Upload backup log to datastore 'BACKUP1', root namespace vm/171/2026-01-09T22:25:11Z/client.log.blob
Jan 09 23:25:33 backup1 proxmox-backup-proxy[1560]: Upload backup log to datastore 'BACKUP1', root namespace vm/176/2026-01-09T22:25:25Z/client.log.blob
Jan 09 23:46:03 backup1 proxmox-backup-proxy[1560]: rrd journal successfully committed (52 files in 0.069 seconds)
Jan 10 00:00:01 backup1 proxmox-backup-proxy[1560]: rotated access log, telling daemons to re-open log file
Jan 10 00:00:01 backup1 proxmox-backup-proxy[1560]: rotated auth log, telling daemons to re-open log file
Jan 10 00:00:01 backup1 proxmox-backup-proxy[1560]: re-opening access-log file
Jan 10 00:00:01 backup1 proxmox-backup-api[1406]: re-opening access-log file
Jan 10 00:00:01 backup1 proxmox-backup-proxy[1560]: re-opening auth-log file
Jan 10 00:00:01 backup1 proxmox-backup-api[1406]: re-opening auth-log file
Jan 10 00:00:19 backup1 proxmox-backup-proxy[1560]: GET /api2/json/nodes/localhost/tasks?running=1&limit=100: 400 Bad Request: [client [::ffff:]:6012] Unable to acquire lock "/var/log/proxmox-backup/tasks/.acti>
Jan 10 00:00:20 backup1 proxmox-backup-proxy[1560]: error writing task result to the tasklog
Jan 10 00:00:22 backup1 proxmox-backup-proxy[1560]: error writing task result to the tasklog
Jan 10 00:00:25 backup1 proxmox-backup-proxy[1560]: GET /api2/json/nodes/localhost/tasks?limit=0&since=1765405318: 400 Bad Request: [client [::ffff:]:47818] Unable to acquire lock "/var/log/proxmox-backup/tasks>

Jan 10 00:00:25 backup1 systemd[1]: Starting dpkg-db-backup.service - Daily dpkg database backup service...
Jan 10 00:00:26 backup1 systemd[1]: dpkg-db-backup.service: Deactivated successfully.
Jan 10 00:00:26 backup1 systemd[1]: Finished dpkg-db-backup.service - Daily dpkg database backup service.
Jan 10 00:00:35 backup1 proxmox-backup-proxy[1560]: GET //api2/json/backup?backup-id=128&backup-time=1767999604&backup-type=vm&benchmark=false&debug=false&store=BACKUP1: 400 Bad Request: [client [::ffff:]:3569>
Jan 10 00:00:37 backup1 proxmox-backup-proxy[1560]: GET //api2/json/backup?backup-id=147&backup-time=1767999602&backup-type=vm&benchmark=false&debug=false&store=BACKUP2: 400 Bad Request: [client [::ffff:]:5154>
Jan 10 00:01:37 backup1 proxmox-backup-proxy[1560]: GET /api2/json/nodes/localhost/tasks?running=1&limit=100: 400 Bad Request: [client [::ffff:]:19404] Unable to acquire lock "/var/log/proxmox-backup/tasks/.act>
Jan 10 00:01:41 backup1 proxmox-backup-proxy[1560]: GET /api2/json/nodes/localhost/tasks?limit=0&since=1765405318: 400 Bad Request: [client [::ffff:]:5751] Unable to acquire lock "/var/log/proxmox-backup/tasks/>
Jan 10 00:02:43 backup1 proxmox-backup-api[1406]: server error: Transport endpoint is not connected (os error 107)
Jan 10 00:02:43 backup1 proxmox-backup-api[1406]: request_shutdown
Jan 10 00:02:43 backup1 proxmox-backup-api[1406]: daemon shutting down...
Jan 10 00:02:43 backup1 proxmox-backup-api[1406]: daemon shut down.

Jan 10 00:02:43 backup1 proxmox-backup-proxy[1560]: POST /api2/json/access/ticket: 400 Bad Request: [client [::ffff:]:34634] client error (SendRequest)
Jan 10 00:02:43 backup1 proxmox-backup-proxy[1560]: POST /api2/json/access/ticket: 400 Bad Request: [client [::ffff:]:41306] client error (SendRequest)
Jan 10 00:02:43 backup1 proxmox-backup-api[1406]: server shutting down, waiting for active workers to complete
Jan 10 00:02:43 backup1 proxmox-backup-api[1406]: done - exit server
Jan 10 00:02:43 backup1 proxmox-backup-proxy[1560]: POST /api2/json/access/ticket: 400 Bad Request: [client [::ffff:]:43060] client error (SendRequest)
Jan 10 00:02:43 backup1 proxmox-backup-proxy[1560]: POST /api2/json/access/ticket: 400 Bad Request: [client [::ffff:]:37144] client error (SendRequest)
Jan 10 00:02:43 backup1 proxmox-backup-proxy[1560]: POST /api2/json/access/ticket: 400 Bad Request: [client [::ffff:]:38492] client error (SendRequest)



I have everytime manually reboot server (server is physical) and from reboot on everything is working well for next 3 days.

On PVE there are such errors duging problem in time when PBS is unresponsible.

TASK ERROR: could not activate storage 'backup2': backup2: error fetching datastores - 400 Bad Request

I belive there is something with API servis but I cannot find anything in logs.

Please help.

Environment is up to date, PBS 4.1.1 with all patches. We do not have subscription yet but in future there will be.

Thank you for help.

Aabye
 
Are the datastores local or remote? Every 3 days sounds like a prune or verify job which then runs into a timeout/lock.
 
Within the console:

Code:
proxmox-backup-manager task list | egrep "verify|prune|gc|sync"

For system logs:

Code:
journalctl -u proxmox-backup -u proxmox-backup-api -b -1 --no-pager

Kernel logs (I/O errors):

Code:
journalctl -k -b -1 --no-pager

How is your local storage configured? ZFS RAID? Ext4? SSD/NVME or HDD? The job itself should not be a problem. Prune, verify and other jobs are standard tasks. I guess it‘s a problem with the underlying storage system (faulty or slow drive, controller, cable, etc.). Another possibility (if ZFS is used): ARC/RAM problems (ARC size to high for example).
 
Last edited:
HI.

Thank you.

task list is empty:

root@backup1:/var/log/proxmox-backup/tasks# proxmox-backup-manager task list | egrep "verify|prune|gc|sync"
root@backup1:/var/log/proxmox-backup/tasks#
root@backup1:/var/log/proxmox-backup/tasks#
root@backup1:/var/log/proxmox-backup/tasks#
root@backup1:/var/log/proxmox-backup/tasks#

journalctl shows similar as logs before in my previous reply:

...
root@backup1:/var/log/proxmox-backup/tasks# journalctl -u proxmox-backup -u proxmox-backup-api -b -1 --no-pager
...
Jan 09 21:17:42 backup1 proxmox-backup-api[1406]: notified via target `mail-to-root`
Jan 10 00:00:01 backup1 proxmox-backup-api[1406]: re-opening access-log file
Jan 10 00:00:01 backup1 proxmox-backup-api[1406]: re-opening auth-log file
Jan 10 00:02:43 backup1 proxmox-backup-api[1406]: server error: Transport endpoint is not connected (os error 107)
Jan 10 00:02:43 backup1 proxmox-backup-api[1406]: request_shutdown
Jan 10 00:02:43 backup1 proxmox-backup-api[1406]: daemon shutting down...
Jan 10 00:02:43 backup1 proxmox-backup-api[1406]: daemon shut down.
Jan 10 00:02:43 backup1 proxmox-backup-api[1406]: server shutting down, waiting for active workers to complete
Jan 10 00:02:43 backup1 proxmox-backup-api[1406]: done - exit server
Jan 10 00:02:43 backup1 systemd[1]: proxmox-backup.service: Deactivated successfully.

journalctl:
....
Jan 08 00:13:06 backup1 kernel: hpsa 0000:03:00.0: Using 48-bit DMA addresses
Jan 10 00:21:39 backup1 kernel: XFS (sdb1): Unmounting Filesystem 1d8f08ac-1189-49ad-bc86-611d253ecbb8
Jan 10 00:21:39 backup1 systemd-shutdown[1]: Syncing filesystems and block devices.
Jan 10 00:21:40 backup1 systemd-shutdown[1]: Sending SIGTERM to remaining processes...
....


local storage (there are two) is ext4, second xfs, 8TB in size. Behind is RAID1, hardware, no errors seen there. And disks are HDD.

Same configuration was working before on previous version PBS (3.*), after upgrade to 4.* I having problems.
 
Last edited: