Proxmox Scheduler issue since last update

DezzyAU

New Member
Jul 18, 2025
Hi all,

I'm having an issue with my Proxmox scheduler service since the last update. Any chance someone is able to assist?
These are the errors from the syslog when I attempt to start the service manually. This is a single node, so there is no cluster setup. I've tried a few things that other posts have suggested (sketched below the log) and I'm still getting nowhere. If I manually run my backups they work perfectly fine.

Code:
Jul 11 15:51:14 Prox1 pvescheduler[3587167]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jul 11 15:52:14 Prox1 pvescheduler[3587725]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jul 11 15:52:14 Prox1 pvescheduler[3587726]: jobs: cfs-lock 'file-jobs_cfg' error: got lock request timeout
Jul 11 15:53:14 Prox1 pvescheduler[3588282]: jobs: cfs-lock 'file-jobs_cfg' error: got lock request timeout
Jul 11 15:53:14 Prox1 pvescheduler[3588281]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jul 11 15:54:14 Prox1 pvescheduler[3588790]: jobs: cfs-lock 'file-jobs_cfg' error: got lock request timeout
Jul 11 15:54:14 Prox1 pvescheduler[3588789]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jul 11 15:55:14 Prox1 pvescheduler[3589363]: jobs: cfs-lock 'file-jobs_cfg' error: got lock request timeout
Jul 11 15:55:14 Prox1 pvescheduler[3589362]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jul 11 15:56:14 Prox1 pvescheduler[3589936]: jobs: cfs-lock 'file-jobs_cfg' error: got lock request timeout
Jul 11 15:56:14 Prox1 pvescheduler[3589935]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jul 11 15:57:14 Prox1 pvescheduler[3590481]: jobs: cfs-lock 'file-jobs_cfg' error: got lock request timeout
Jul 11 15:57:14 Prox1 pvescheduler[3590480]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
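
For reference, the cfs-lock timeouts above usually point at /etc/pve (pmxcfs) rather than the scheduler itself, so these are the sort of checks other posts suggest (paths assumed from a standard single-node install):

Code:
# rough health checks for pmxcfs, which provides /etc/pve and the cfs locks
systemctl status pve-cluster pvescheduler
journalctl -u pve-cluster -b --no-pager | tail -n 50
ls -l /etc/pve/priv/lock/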
 
This was the last backup job that ran, and ever since then I've received the above errors.
It mentions low space, but I've run df -h and no mount is out of space.

Code:
Jul 07 21:00:00 Prox1 pvescheduler[1908939]: <root@pam> starting task UPID:Prox1:001D20CC:0B551D09:686BA8B0:vzdump::root@pam:
Jul 07 21:00:00 Prox1 pvescheduler[1908940]: INFO: starting new backup job: vzdump 117 114 122 105 112 --notification-mode legacy-sendmail --storage local --mode snapshot --mailto e-mail --prune-backups 'keep-daily=3' --notes-template '{{guestname}} {{vmid}} Nightly' --compress zstd --fleecing 0 --quiet 1 --mailnotification always
Jul 07 21:00:00 Prox1 pvescheduler[1908940]: INFO: Starting Backup of VM 105 (qemu)
Jul 07 21:00:44 Prox1 pvescheduler[1908940]: INFO: Finished Backup of VM 105 (00:00:44)
Jul 07 21:00:44 Prox1 pvescheduler[1908940]: INFO: Starting Backup of VM 112 (qemu)
Jul 07 21:07:00 Prox1 pvescheduler[1908940]: INFO: Finished Backup of VM 112 (00:06:16)
Jul 07 21:07:00 Prox1 pvescheduler[1908940]: INFO: Starting Backup of VM 114 (qemu)
Jul 07 21:19:19 Prox1 pvescheduler[1908940]: unable to delete old temp file: Input/output error
Jul 07 21:19:19 Prox1 pvescheduler[1908940]: ERROR: Backup of VM 114 failed - vma_queue_write: write error - Broken pipe
Jul 07 21:19:19 Prox1 pvescheduler[1908940]: ERROR: Backup of VM 117 failed - unable to create temporary directory '/var/lib/vz/dump/vzdump-qemu-117-2025_07_07-21_19_19.tmp' at /usr/share/perl5/PVE/VZDump.pm line 1054.
Jul 07 21:19:19 Prox1 pvescheduler[1908940]: ERROR: Backup of VM 122 failed - unable to create temporary directory '/var/lib/vz/dump/vzdump-qemu-122-2025_07_07-21_19_19.tmp' at /usr/share/perl5/PVE/VZDump.pm line 1054.
Jul 07 21:19:19 Prox1 pvescheduler[1908940]: INFO: Backup job finished with errors
Jul 07 21:19:19 Prox1 postfix/postdrop[1922679]: warning: mail_queue_enter: create file maildrop/158846.1922679: No space left on device
 
Now PVE Manager won't update/start because of this. Any help please, before I blow my instance away and start again.
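
For anyone double-checking, the job in the log above writes to the storage named local, so something along these lines should show where vzdump is actually writing and how much space is really free there (default dump path assumed):

Code:
pvesm status                      # per-storage free space as Proxmox sees it
df -h /var/lib/vz/dump            # default dump directory of the 'local' storage
ls -lh /var/lib/vz/dump | tail    # leftover partial dumps / .tmp files from failed runs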
 
Your device is full! "No space left on device" is the reason!
 
Your device is full! "No space left on device" is the reason!
Thanks. I've had several people look at this and every device has 200-400 GB free. We've since discovered that pvescheduler is actually trying to start up and re-run a failed/missed backup. But because this never succeeds, the service startup fails and I end up in the same loop again, trying to get the service started.
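
For reference, the scheduled jobs and their options live in /etc/pve/jobs.cfg, so that is where the missed-backup behaviour can be checked; the repeat-missed option name below is from memory, so treat it as an assumption:

Code:
cat /etc/pve/jobs.cfg                      # scheduled vzdump jobs and their options
grep -n 'repeat-missed' /etc/pve/jobs.cfg  # jobs set to re-run when missed (option name assumed)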
 
The scheduler only starts jobs after it has started up completely; otherwise it would always be killed by the systemd service start timeout for any backup that needs more than a few minutes, which most do.

As postfix, a service completely unrelated to the scheduler, also complains on the host system about not being able to create files, I highly doubt that your root filesystem is working fine and/or has space left.

For starters, please post the output of df -h
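
A couple of quick checks that would narrow this down further, assuming default Debian/Proxmox paths for the services involved:

Code:
df -h / /var/spool/postfix /var/lib/vz   # filesystems the failing services actually write to
df -i /                                  # "No space left on device" can also mean the filesystem ran out of inodes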
 
I've now done a fresh install and I've got the same issue. It looks like there is a bug in the most recent update. :( Hopefully they can fix it, because when the scheduler service was updated, pve-manager continuously failed to install and broke my installation.
 
All the error messages we saw pointed at another root cause, and FWIW, we do not have a single other report about the last update breaking the scheduler. While in theory it could of course be some issue specific to your setup, I rather find that unlikely.

Please post the output of df -h

And attach the journal.log.zst file generated by the following command:
journalctl --no-hostname -o short-precise -b | zstd >journal.log.zst

The above will help advance the investigation, whatever the root cause turns out to be.
 
Code:
Filesystem                    Size  Used Avail Use% Mounted on
udev                          567G     0  567G   0% /dev
tmpfs                         114G   21M  114G   1% /run
rpool/ROOT/pve-1              675G  2.3G  673G   1% /
tmpfs                         567G   31M  567G   1% /dev/shm
tmpfs                         5.0M     0  5.0M   0% /run/lock
efivarfs                      512K   50K  458K  10% /sys/firmware/efi/efivars
rpool                         673G  128K  673G   1% /rpool
rpool/var-lib-vz              673G  121M  673G   1% /var/lib/vz
Primary                        17T  384K   17T   1% /Primary
rpool/ROOT                    673G  128K  673G   1% /rpool/ROOT
rpool/data                    673G  128K  673G   1% /rpool/data
Primary/subvol-122-disk-0     100G  3.1G   97G   4% /Primary/subvol-122-disk-0
Primary/subvol-101-disk-0     200G   36G  165G  18% /Primary/subvol-101-disk-0
Primary/subvol-118-disk-0     250G  618M  250G   1% /Primary/subvol-118-disk-0
Primary/subvol-116-disk-0     200G  3.7G  197G   2% /Primary/subvol-116-disk-0
/dev/fuse                     128M   48K  128M   1% /etc/pve
//10.100.0.100/ProxmoxBackup   12T  8.1T  3.6T  70% /mnt/pve/ProxmoxBackup
//10.100.0.100/Prox10Backup   1.3T  853G  472G  65% /mnt/pve/Prox10Backup
tmpfs                         114G     0  114G   0% /run/user/0

Fresh install, same problem. The only thing I imported was the SDNs and my VM/LXC config. Everything else from the previous host (except the IP) was changed.
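
Since these are ZFS datasets, df -h also won't show space tied up in snapshots or reservations, so for completeness these could be checked as well (pool names taken from the output above):

Code:
zpool list                            # raw pool capacity, allocation and fragmentation
zfs list -o space -r rpool Primary    # per-dataset usage incl. snapshots and reservations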
 

I'm going to go back to basics and configure the server from scratch. It's going to take me a while given I've got so much data to back up.
 
For some odd reason, whilst cleaning up the server and backing up and deleting VMs, the PVE scheduler has just started working and is reliably starting/restarting for me. Not sure what's going on here, but I'm going to continue on, as I imported my ZFS pool and have lost root access to manage it. Plus, I didn't enable thin provisioning when I built it, so all my VMs are fat-provisioned.
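
For reference, a rough sketch of what turning thin provisioning on after the fact would look like, assuming the ZFS storage is named Primary as in the output above; it only affects newly created disks, and the dataset in the second line is a made-up example:

Code:
pvesm set Primary --sparse 1                       # new zvols will be created without a space reservation
zfs set refreservation=none Primary/vm-105-disk-0  # hypothetical example: drops the reservation on an existing zvol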
 
For those interested: Proxmox wasn't cleaning out the old backups before performing new ones.

In turn, my backup location was running out of space and causing the scheduler to fail, because I had "Missed Backups" turned on and it kept trying to back up to a location that didn't have enough space.

It would have been nice if the error told us that the backup location had run out of space because the older backups hadn't been purged first.
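
For anyone landing here later, a sketch of how retention can be enforced on the storage itself so old dumps are pruned before space runs out (storage name local taken from the earlier log; the keep values are just an example):

Code:
pvesm set local --prune-backups 'keep-daily=3,keep-last=2'   # retention enforced by the storage itself
pvesm list local                                             # backups Proxmox currently knows about on that storage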
 
That should be visible in the backup task logs (which should have failed as a result?) and in the storage overview as well.