PBS remote sync job hangs if CIFS share is not available at start of scheduled task

dierochade

Apr 20, 2024
I have a second datastore that is mounted into PBS via fstab:

//ip.xxx.xxx.xxx/pbs /mnt/rechenrhino cifs credentials=/root/.smbcredentials/rechenrhino,uid=backup,gid=backup,vers=3.0,file_mode=0770,dir_mode=0770,nounix,noserverino,cache=none,nofail,x-systemd.automount,x-systemd.idle-timeout=600 0

This share is on a Windows desktop machine that I use to duplicate my backups via a scheduled remote sync job.

This works as expected unless the share is unavailable when the job starts. In that case the job hangs indefinitely and never finishes, even if the mount point comes online later.

The job log looks like this:

2025-04-18T12:45:00+02:00: Starting datastore sync job '-:localPBS:rechenrhinoPBS::s-793fadcc-01b6'
2025-04-18T12:45:00+02:00: task triggered by schedule '12:45'
2025-04-18T12:45:00+02:00: sync datastore 'rechenrhinoPBS' from 'localPBS'
2025-04-18T12:45:00+02:00: ----
2025-04-18T12:45:00+02:00: Syncing datastore 'localPBS', root namespace into datastore 'rechenrhinoPBS', root namespace
2025-04-18T12:45:00+02:00: found 12 groups to sync (out of 12 total)
2025-04-18T12:45:20+02:00: sync group ct/101 failed - group lock failed: Host is down (os error 112)
2025-04-18T12:45:20+02:00: create_locked_backup_group failed
2025-04-18T12:45:40+02:00: sync group ct/102 failed - group lock failed: Host is down (os error 112)
2025-04-18T12:45:40+02:00: create_locked_backup_group failed
2025-04-18T12:46:01+02:00: sync group ct/108 failed - group lock failed: Host is down (os error 112)
2025-04-18T12:46:01+02:00: create_locked_backup_group failed
2025-04-18T12:46:21+02:00: sync group ct/109 failed - group lock failed: Host is down (os error 112)
2025-04-18T12:46:21+02:00: create_locked_backup_group failed
2025-04-18T12:46:42+02:00: sync group ct/111 failed - group lock failed: Host is down (os error 112)
2025-04-18T12:46:42+02:00: create_locked_backup_group failed
2025-04-18T12:47:02+02:00: sync group ct/900 failed - group lock failed: Host is down (os error 112)
2025-04-18T12:47:02+02:00: create_locked_backup_group failed
2025-04-18T12:47:23+02:00: sync group ct/1005 failed - group lock failed: Host is down (os error 112)
2025-04-18T12:47:23+02:00: create_locked_backup_group failed
2025-04-18T12:47:43+02:00: sync group vm/103 failed - group lock failed: Host is down (os error 112)
2025-04-18T12:47:43+02:00: create_locked_backup_group failed
2025-04-18T12:48:04+02:00: sync group vm/104 failed - group lock failed: Host is down (os error 112)
2025-04-18T12:48:04+02:00: create_locked_backup_group failed
2025-04-18T12:48:24+02:00: sync group vm/105 failed - group lock failed: Host is down (os error 112)
2025-04-18T12:48:24+02:00: create_locked_backup_group failed
2025-04-18T12:48:45+02:00: sync group vm/110 failed - group lock failed: Host is down (os error 112)
2025-04-18T12:48:45+02:00: create_locked_backup_group failed
2025-04-18T12:49:05+02:00: sync group vm/116 failed - group lock failed: Host is down (os error 112)
2025-04-18T12:49:05+02:00: create_locked_backup_group failed
2025-04-18T12:49:05+02:00: Finished syncing root namespace, current progress: 11 groups, 0 snapshots
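
One detail that stands out in the log above is that each failed group takes roughly 20 seconds, which might point to a timeout being hit once per group (that is a guess, not something the log states). A minimal Python sketch to measure those gaps, using the first few lines of the log as sample input; the function name `failure_gaps` is made up for illustration:

```python
from datetime import datetime

# First lines of the job log above, used as sample input.
SAMPLE_LOG = """\
2025-04-18T12:45:00+02:00: found 12 groups to sync (out of 12 total)
2025-04-18T12:45:20+02:00: sync group ct/101 failed - group lock failed: Host is down (os error 112)
2025-04-18T12:45:20+02:00: create_locked_backup_group failed
2025-04-18T12:45:40+02:00: sync group ct/102 failed - group lock failed: Host is down (os error 112)
2025-04-18T12:45:40+02:00: create_locked_backup_group failed
2025-04-18T12:46:01+02:00: sync group ct/108 failed - group lock failed: Host is down (os error 112)
"""


def failure_gaps(log_text: str) -> list[float]:
    """Return the seconds between consecutive 'sync group ... failed' lines."""
    times = []
    for line in log_text.splitlines():
        # The timestamp ends at the first ": " (colon followed by a space).
        timestamp, sep, message = line.partition(": ")
        if sep and message.startswith("sync group") and "failed" in message:
            times.append(datetime.fromisoformat(timestamp))
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


print(failure_gaps(SAMPLE_LOG))
```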

As you can see, the job runs daily, and the jobs from the past several days are all still open (I was not at home, so the machine was down the whole time):
FireShot Capture 066 - pbs - Proxmox Backup Server - [pbs.zeeb24.de].png

I can't even stop the job manually from the GUI: pressing the stop button does nothing. In this state, a manually triggered sync job does not proceed either. After a reboot the jobs are gone and everything is back to normal.
 
I do understand that the sync fails. But an inaccessible mount point is nothing unusual, so it is a situation the software can and should handle: just throw a proper error and end the task. It's a bug imo.
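
Until that is handled in PBS itself, one possible stopgap is to probe the mount point with a timeout before anything touches the datastore. The following is only a sketch under assumptions: `path_reachable` is a hypothetical helper, not part of PBS, and how its result would be wired into the sync schedule is left open. The probe runs in a child process because a `stat` on a dead CIFS mount can block for a long time.

```python
import subprocess
import sys

# Mount point taken from the fstab entry above.
MOUNT_POINT = "/mnt/rechenrhino"


def path_reachable(path: str, timeout_s: float = 10.0) -> bool:
    """Return True if `path` can be stat'ed within `timeout_s` seconds.

    The stat call runs in a child process so that a blocked filesystem
    call cannot hang the caller.
    """
    probe = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        # Exit code 0 means os.stat succeeded; anything else is a failure.
        return probe.wait(timeout=timeout_s) == 0
    except subprocess.TimeoutExpired:
        # Best effort only: a process stuck in uninterruptible I/O may
        # linger, so we deliberately do not wait for it after the kill.
        probe.kill()
        return False

# Example use: path_reachable(MOUNT_POINT) returns False when the share
# is down, and a wrapper could skip or abort the sync in that case.
```

This does not fix the hanging task; it only avoids starting work against a share that is known to be down.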