PVE can't backup to PBS with datastore on SMB share

so this sequence here should represent one of the failed backups:
Code:
mkdir("/mnt/backup/vm", 0777)           = 0
mkdir("/mnt/backup/vm/101", 0777)       = 0
openat(AT_FDCWD, "/mnt/backup/vm/101", O_RDONLY) = 26
openat(AT_FDCWD, "/mnt/backup/vm/101/owner", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 27
openat(AT_FDCWD, "/mnt/backup/vm/101/owner", O_RDONLY|O_CLOEXEC) = 27
openat(AT_FDCWD, "/mnt/backup/vm/101", O_RDONLY) = 27
mkdir("/mnt/backup/vm/101/2021-04-20T12:55:07Z", 0777) = 0
openat(AT_FDCWD, "/mnt/backup/vm/101/2021-04-20T12:55:07Z", O_RDONLY) = 27
openat(AT_FDCWD, "/proc/836/stat", O_RDONLY|O_CLOEXEC) = 28
statx(28, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_ALL, {stx_mask=STATX_BASIC_STATS, stx_attributes=0, stx_mode=S_IFREG|0444, stx_size=0, ...}) = 0
openat(AT_FDCWD, "/etc/passwd", O_RDONLY|O_CLOEXEC) = 28
openat(AT_FDCWD, "/", O_RDONLY|O_DIRECTORY) = 28
mkdirat(28, "var", 0755)                = -1 EEXIST (File exists)
openat(28, "var", O_RDONLY|O_DIRECTORY) = 29
mkdirat(29, "log", 0755)                = -1 EEXIST (File exists)
openat(29, "log", O_RDONLY|O_DIRECTORY) = 28
mkdirat(28, "proxmox-backup", 0755)     = -1 EEXIST (File exists)
openat(28, "proxmox-backup", O_RDONLY|O_DIRECTORY) = 29
mkdirat(29, "tasks", 0755)              = -1 EEXIST (File exists)
openat(29, "tasks", O_RDONLY|O_DIRECTORY) = 28
mkdirat(28, "88", 0755)                 = -1 EEXIST (File exists)
openat(28, "88", O_RDONLY|O_DIRECTORY)  = 29
openat(AT_FDCWD, "/var/log/proxmox-backup/tasks/88/UPID:pbs:00000344:00000188:00000008:607ECF2C:backup:backup\\x3avm-101:christoph@pbs:", O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 28
chown("/var/log/proxmox-backup/tasks/88/UPID:pbs:00000344:00000188:00000008:607ECF2C:backup:backup\\x3avm-101:christoph@pbs:", 34, 34) = 0
openat(AT_FDCWD, "/etc/passwd", O_RDONLY|O_CLOEXEC) = 29
openat(AT_FDCWD, "/etc/passwd", O_RDONLY|O_CLOEXEC) = 29
openat(AT_FDCWD, "/var/log/proxmox-backup/tasks/.active.lock", O_WRONLY|O_CREAT|O_APPEND|O_CLOEXEC, 0666) = 29
chown("/var/log/proxmox-backup/tasks/.active.lock", 34, 34) = 0
openat(AT_FDCWD, "/var/log/proxmox-backup/tasks/index", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/var/log/proxmox-backup/tasks/active", O_RDONLY|O_CLOEXEC) = 30
openat(AT_FDCWD, "/var/log/proxmox-backup/tasks/active.tmp_b0GbBw", O_RDWR|O_CREAT|O_EXCL, 0600) = 30
rename("/var/log/proxmox-backup/tasks/active.tmp_b0GbBw", "/var/log/proxmox-backup/tasks/active") = 0
openat(AT_FDCWD, "/mnt/backup/vm/101/2021-04-20T12:55:07Z/drive-scsi0.img.tmp_fidx", O_RDWR|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 29
access("/dev/random", R_OK)             = 0
statx(AT_FDCWD, "/mnt/backup/.chunks/bb9f/bb9f8df61474d25e71fa00722318cd387396ca1736605e1248821cc0de3d3af8", AT_STATX_SYNC_AS_STAT, STATX_ALL, 0x7f772b9f98a0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/mnt/backup/.chunks/bb9f/bb9f8df61474d25e71fa00722318cd387396ca1736605e1248821cc0de3d3af8.tmp", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = -1 ENOENT (No such file or directory)

you can see how it
  1. checks the owner successfully (mkdir/openat)
  2. starts the task status/log files and snapshot directory successfully (mkdirat/openat/chown/rename)
  3. opens the tmp file for the index successfully (openat .... .tmp_fidx)
  4. tries to see if the first chunk exists (statx - result is ENOENT, which is okay in this case)
  5. tries to open the tmp file for writing that chunk (openat - result is ENOENT, which is an error, see below)
for the last openat, we get an unexpected ENOENT, for which the docs say:
Code:
man openat
...
ENOENT O_CREAT is not set and the named file does not exist.       
ENOENT A directory component in pathname does not exist or is a dangling symbolic link.
ENOENT pathname refers to a nonexistent directory, O_TMPFILE and one of O_WRONLY or O_RDWR were specified in flags, but this kernel version does not provide the O_TMPFILE functionality.
...

the first one does not apply, since we set O_CREAT. the last one does not apply since we haven't set O_TMPFILE. that leaves the middle one - either the .chunks or the bb9f sub directory must be broken somehow - /mnt/backup itself should be okay, we access that earlier in the strace without any problems.

can you try "touching" a path under /mnt/backup/.chunks/bb9f as user and group backup?
 
can you try "touching" a path under /mnt/backup/.chunks/bb9f as user and group backup?
Touching a file seems to work without any problems.
I gotta say it's slow, but I guess that's just because it's an SMB share and right now there's an rsync job running for some data on the same hard disk.

Bash:
cd /mnt/backup/.chunks/bb9f
sudo -u backup touch testing

ls -a
drwxr-xr-x 2 backup backup 0 Apr 21 11:09 .
drwxr-xr-x 2 backup backup 0 Apr 18 00:28 ..
-rwxr-xr-x 1 backup backup 0 Apr 21 11:09 testing

Also, thank you very much for the detailed explanation :)
How did you get the man-page for openat?
Neither one of my Proxmox VE nor Proxmox BS have this manpage.
I found it via Google, but shouldn't it be working with man too?
 
I found this.
This is my daily backup-schedule kicking in.

Code:
sudo journalctl --since today -u proxmox-backup-proxy

Apr 21 00:00:03 pbs proxmox-backup-proxy[836]: starting new backup on datastore 'backup': "vm/101/2021-04-20T22:00:02Z"
Apr 21 00:00:03 pbs proxmox-backup-proxy[836]: GET /previous: 400 Bad Request: no valid previous backup
Apr 21 00:00:04 pbs proxmox-backup-proxy[836]: created new fixed index 1 ("vm/101/2021-04-20T22:00:02Z/drive-scsi0.img.fidx")
Apr 21 00:00:05 pbs proxmox-backup-proxy[836]: POST /fixed_chunk: 400 Bad Request: No such file or directory (os error 2)
Apr 21 00:00:05 pbs proxmox-backup-proxy[836]: backup ended and finish failed: backup ended but finished flag is not set.
Apr 21 00:00:05 pbs proxmox-backup-proxy[836]: removing unfinished backup
Apr 21 00:00:05 pbs proxmox-backup-proxy[836]: removing backup snapshot "/mnt/backup/vm/101/2021-04-20T22:00:02Z"
Apr 21 00:00:06 pbs proxmox-backup-proxy[836]: TASK ERROR: removing backup snapshot "/mnt/backup/vm/101/2021-04-20T22:00:02Z" failed - Directory not empty (os error 39)
Apr 21 00:00:06 pbs proxmox-backup-proxy[836]: starting new backup on datastore 'backup': "vm/102/2021-04-20T22:00:05Z"
Apr 21 00:00:06 pbs proxmox-backup-proxy[836]: GET /previous: 400 Bad Request: no valid previous backup
Apr 21 00:00:06 pbs proxmox-backup-proxy[836]: created new fixed index 1 ("vm/102/2021-04-20T22:00:05Z/drive-scsi0.img.fidx")
Apr 21 00:00:06 pbs proxmox-backup-proxy[836]: POST /fixed_chunk: 400 Bad Request: No such file or directory (os error 2)
Apr 21 00:00:06 pbs proxmox-backup-proxy[836]: backup ended and finish failed: backup ended but finished flag is not set.
Apr 21 00:00:06 pbs proxmox-backup-proxy[836]: removing unfinished backup
Apr 21 00:00:06 pbs proxmox-backup-proxy[836]: removing backup snapshot "/mnt/backup/vm/102/2021-04-20T22:00:05Z"
Apr 21 00:00:06 pbs proxmox-backup-proxy[836]: TASK ERROR: removing backup snapshot "/mnt/backup/vm/102/2021-04-20T22:00:05Z" failed - Directory not empty (os error 39)

So it complains about 2 directories not being empty.
According to ls -la both directories are empty, though.

I deleted the directory /mnt/backup/vm, to "start fresh".
Then i started the backup again (pressing on the RUN NOW button in Proxmox VE) and watched the journal.
To me it seems like the same errors happen again.
Code:
sudo journalctl -ef --since today

Apr 21 12:42:12 pbs proxmox-backup-proxy[836]: starting new backup on datastore 'backup': "vm/101/2021-04-21T10:42:11Z"
Apr 21 12:42:12 pbs proxmox-backup-proxy[836]: GET /previous: 400 Bad Request: no valid previous backup
Apr 21 12:42:13 pbs proxmox-backup-proxy[836]: created new fixed index 1 ("vm/101/2021-04-21T10:42:11Z/drive-scsi0.img.fidx")
Apr 21 12:42:13 pbs proxmox-backup-proxy[836]: POST /fixed_chunk: 400 Bad Request: No such file or directory (os error 2)
Apr 21 12:42:13 pbs proxmox-backup-proxy[836]: backup ended and finish failed: backup ended but finished flag is not set.
Apr 21 12:42:13 pbs proxmox-backup-proxy[836]: removing unfinished backup
Apr 21 12:42:13 pbs proxmox-backup-proxy[836]: removing backup snapshot "/mnt/backup/vm/101/2021-04-21T10:42:11Z"
Apr 21 12:42:13 pbs proxmox-backup-proxy[836]: TASK ERROR: removing backup snapshot "/mnt/backup/vm/101/2021-04-21T10:42:11Z" failed - Directory not empty (os error 39)
Apr 21 12:42:13 pbs proxmox-backup-proxy[836]: starting new backup on datastore 'backup': "vm/102/2021-04-21T10:42:13Z"
Apr 21 12:42:13 pbs proxmox-backup-proxy[836]: GET /previous: 400 Bad Request: no valid previous backup
Apr 21 12:42:13 pbs proxmox-backup-proxy[836]: created new fixed index 1 ("vm/102/2021-04-21T10:42:13Z/drive-scsi0.img.fidx")
Apr 21 12:42:13 pbs proxmox-backup-proxy[836]: POST /fixed_chunk: 400 Bad Request: No such file or directory (os error 2)
Apr 21 12:42:13 pbs proxmox-backup-proxy[836]: backup ended and finish failed: backup ended but finished flag is not set.
Apr 21 12:42:13 pbs proxmox-backup-proxy[836]: removing unfinished backup
Apr 21 12:42:13 pbs proxmox-backup-proxy[836]: removing backup snapshot "/mnt/backup/vm/102/2021-04-21T10:42:13Z"
Apr 21 12:42:13 pbs proxmox-backup-proxy[836]: TASK ERROR: removing backup snapshot "/mnt/backup/vm/102/2021-04-21T10:42:13Z" failed - Directory not empty (os error 39)

My automatic verify job also failed (before I deleted the vm folder).
It complains about not finding the index, which is not there because the directories are always empty.
Code:
2021-04-21T12:00:00+02:00: Starting datastore verify job 'backup:v-70c2cdd6-e15b'
2021-04-21T12:00:00+02:00: task triggered by schedule '12:00'
2021-04-21T12:00:00+02:00: verify datastore backup
2021-04-21T12:00:01+02:00: found 2 groups
2021-04-21T12:00:01+02:00: verify group backup:vm/101 (1 snapshots)
2021-04-21T12:00:01+02:00: verify backup:vm/101/2021-04-20T22:00:02Z - manifest load error: unable to load blob '"/mnt/backup/vm/101/2021-04-20T22:00:02Z/index.json.blob"' - No such file or directory (os error 2)
2021-04-21T12:00:01+02:00: percentage done: 50.00% (0 of 2 groups, 1 of 1 group snapshots)
2021-04-21T12:00:01+02:00: verify group backup:vm/102 (1 snapshots)
2021-04-21T12:00:01+02:00: verify backup:vm/102/2021-04-20T22:00:05Z - manifest load error: unable to load blob '"/mnt/backup/vm/102/2021-04-20T22:00:05Z/index.json.blob"' - No such file or directory (os error 2)
2021-04-21T12:00:01+02:00: percentage done: 100.00% (1 of 2 groups, 1 of 1 group snapshots)
2021-04-21T12:00:01+02:00: Failed to verify the following snapshots/groups:
2021-04-21T12:00:01+02:00:     vm/101/2021-04-20T22:00:02Z
2021-04-21T12:00:01+02:00:     vm/102/2021-04-20T22:00:05Z
2021-04-21T12:00:01+02:00: TASK ERROR: verification failed - please check the log for details
 
the sequence is
- directory not empty, thus not removed
- directory looks like a snapshot, but does not contain a manifest, thus cannot be verified (it should have been removed directly when the backup failed, but that was not possible)

not sure whether the "directory not empty" has the same cause as the "can't create chunk file" or is a separate issue (we had some in the past where locking the snapshot dir was problematic).
 

About

The Proxmox community has been around for many years and offers help and support for Proxmox VE, Proxmox Backup Server, and Proxmox Mail Gateway.
We think our community is one of the best thanks to people like you!

Get your subscription!

The Proxmox team works very hard to make sure you are running the best software and getting stable updates and security enhancements, as well as quick enterprise support. Tens of thousands of happy customers have a Proxmox subscription. Get yours easily in our online shop.

Buy now!