Unable to delete backups or prune when storage is on CIFS mount

chotaire

When a PBS datastore is created on a CIFS mount, it is impossible to delete or prune backups. PBS is installed on a separate host from PVE. The versions in use are:

Code:
PBS Version:

proxmox-backup: 1.0-4 (running kernel: 5.4.78-1-pve)
proxmox-backup-server: 1.0.5-1 (running version: 1.0.5)
pve-kernel-5.4: 6.3-2
pve-kernel-helper: 6.3-2
pve-kernel-5.4.78-1-pve: 5.4.78-1
ifupdown2: not correctly installed
libjs-extjs: 6.0.1-10
proxmox-backup-docs: 1.0.4-1
proxmox-backup-client: 1.0.5-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-xtermjs: 4.7.0-3
smartmontools: 7.1-pve2
zfsutils-linux: 0.8.5-pve1

PVE Version:

proxmox-ve: 6.3-1 (running kernel: 5.4.78-1-pve)
pve-manager: 6.3-2 (running version: 6.3-2/22f57405)
pve-kernel-5.4: 6.3-2
pve-kernel-helper: 6.3-2
pve-kernel-5.4.78-1-pve: 5.4.78-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: not correctly installed
ifupdown2: 3.0.0-1+pve3
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.5
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-1
libpve-common-perl: 6.3-1
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.3-2
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.5-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-1
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-7
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-1
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2

I created a CIFS systemd automount:

Code:
# cat /etc/systemd/system/mnt-storage.mount

[Unit]
Description=CIFS mount from Hetzner Storagebox

[Mount]
What=//1.2.3.4/backup/pbs/
Where=/mnt/storage
Options=username=u123456,password=123456,rw,uid=34,noforceuid,gid=34,noforcegid
Type=cifs

[Install]
WantedBy=multi-user.target

Code:
# cat mnt-storage.automount

[Unit]
Description=Automount /mnt/storage
After=network-online.target
Wants=network-online.target

[Automount]
Where=/mnt/storage
TimeoutIdleSec=10min

[Install]
WantedBy=multi-user.target
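
systemd expects a mount or automount unit file to be named after the path in its Where= line (here /mnt/storage, hence mnt-storage.mount and mnt-storage.automount). The following is only a minimal sketch of that check for plain paths like the one above; the function name unit_matches_where is made up for illustration, and `systemd-escape -p --suffix=mount /mnt/storage` remains the authoritative way to derive the name when a path contains special characters.

```shell
#!/bin/sh
# Sketch: check that a systemd .mount/.automount file name matches its Where= path.
# Assumption: the path contains only plain characters (no spaces, dashes, or
# other characters that systemd-escape would encode).
unit_matches_where() {
    unit_file=$1
    # First Where= entry in the unit file, e.g. /mnt/storage
    where=$(sed -n 's/^Where=//p' "$unit_file" | head -n 1)
    # File name without directory and without the .mount/.automount suffix
    name=$(basename "$unit_file")
    name=${name%.*}
    # Strip leading/trailing slashes, then turn the remaining slashes into dashes
    expected=$(printf '%s\n' "$where" | sed -e 's|^/*||' -e 's|/*$||' -e 's|/|-|g')
    [ "$name" = "$expected" ]
}

# Example (not executed here):
#   unit_matches_where /etc/systemd/system/mnt-storage.mount && echo "name matches"
```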

Initializing the datastore takes about 10 minutes. The result:

Code:
# proxmox-backup-manager datastore show storage
┌────────────────┬──────────────┐
│ Name           │ Value        │
╞════════════════╪══════════════╡
│ name           │ storage      │
├────────────────┼──────────────┤
│ path           │ /mnt/storage │
├────────────────┼──────────────┤
│ comment        │              │
├────────────────┼──────────────┤
│ gc-schedule    │ 2:30         │
├────────────────┼──────────────┤
│ keep-daily     │ 7            │
├────────────────┼──────────────┤
│ keep-monthly   │ 6            │
├────────────────┼──────────────┤
│ keep-weekly    │ 4            │
├────────────────┼──────────────┤
│ prune-schedule │ 1:30         │
└────────────────┴──────────────┘

I add PBS as storage to PVE and run a test backup:

Code:
INFO: starting new backup job: vzdump 114 --node test --storage pbs --mode snapshot --remove 0
INFO: Starting Backup of VM 114 (qemu)
INFO: Backup started at 2020-12-03 17:57:39
INFO: status = stopped
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: VM Name: meet
INFO: include disk 'scsi0' 'thin:vm-114-disk-0' 5G
INFO: creating Proxmox Backup Server archive 'vm/114/2020-12-03T16:57:39Z'
INFO: starting kvm to execute backup task
INFO: enabling encryption
INFO: started backup task 'bc9c9871-460b-462b-9181-c1de9f63acbf'
INFO: scsi0: dirty-bitmap status: created new
INFO:  20% (1.0 GiB of 5.0 GiB) in  3s, read: 341.3 MiB/s, write: 176.0 MiB/s
INFO:  21% (1.1 GiB of 5.0 GiB) in  6s, read: 26.7 MiB/s, write: 25.3 MiB/s
<...>
INFO:  97% (4.9 GiB of 5.0 GiB) in  2m  7s, read: 26.7 MiB/s, write: 25.3 MiB/s
INFO:  99% (5.0 GiB of 5.0 GiB) in  2m 10s, read: 32.0 MiB/s, write: 32.0 MiB/s
INFO: 100% (5.0 GiB of 5.0 GiB) in  2m 13s, read: 14.7 MiB/s, write: 14.7 MiB/s
INFO: backup is sparse: 1.14 GiB (22%) total zero data
INFO: backup was done incrementally, reused 1.14 GiB (22%)
INFO: transferred 5.00 GiB in 143 seconds (35.8 MiB/s)
INFO: stopping kvm after backup task
INFO: Finished Backup of VM 114 (00:02:25)
INFO: Backup finished at 2020-12-03 18:00:04
INFO: Backup job finished successfully
TASK OK

Now, when trying to remove a backup from either PVE or PBS, the following critical error occurs:

Code:
proxmox-backup-client failed: Error: removing backup snapshot "/mnt/storage/vm/114/2020-12-03T16:57:39Z" failed - Directory not empty (os error 39) at /usr/share/perl5/PVE/API2/Storage/Content.pm line 458. (500)

The following is logged on PBS:

Code:
Dec  3 18:26:47 pbs proxmox-backup-proxy[746]: DELETE /api2/json/admin/datastore/storage/snapshots?backup-id=114&backup-time=1607014659&backup-type=vm: 400 Bad Request: [client [::ffff:1.2.3.5]:46804] removing backup snapshot "/mnt/storage/vm/114/2020-12-03T16:57:39Z" failed - Directory not empty (os error 39)
Dec  3 18:26:48 pbs proxmox-backup-proxy[746]: error during snapshot file listing: 'unable to load blob '"/mnt/storage/vm/114/2020-12-03T16:57:39Z/index.json.blob"' - No such file or directory (os error 2)'
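
The backup-time value in the DELETE request above is a Unix timestamp, and the snapshot directory name is the same instant written in UTC. A small sketch to convert one into the other when matching log entries to directories, assuming GNU date is available (the helper name snapshot_name is made up for illustration):

```shell
#!/bin/sh
# Sketch: convert a backup-time (Unix epoch seconds) into the UTC timestamp
# format used for the snapshot directory name. Requires GNU date (-d @EPOCH).
snapshot_name() {
    date -u -d "@$1" +%Y-%m-%dT%H:%M:%SZ
}

snapshot_name 1607014659   # prints 2020-12-03T16:57:39Z
```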

Inspecting the directory /mnt/storage/vm/114/2020-12-03T16:57:39Z shows that the directory is now actually empty:

Code:
# cd /mnt/storage/vm/114/2020-12-03T16:57:39Z/
# ls -al
total 0
drwxr-xr-x 2 backup backup 0 Dec  3 18:26 .
drwxr-xr-x 2 backup backup 0 Dec  3 17:57 ..

I can rmdir the directory manually, which confirms that no locks prevent it from being deleted:

Code:
# rmdir /mnt/storage/vm/114/2020-12-03T16:57:39Z/
#
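
To narrow this down without touching real backups, the sequence can be probed in a scratch directory on the same mount. This is only a sketch under an assumption that is not confirmed anywhere in this thread: that the rmdir fails because a file inside the snapshot directory is still held open at the moment it is unlinked. The function name probe_rmdir and the file name used inside the probe directory are purely illustrative.

```shell
#!/bin/sh
# Sketch: unlink a file while a descriptor to it is still open, then try to
# rmdir its parent directory. Prints "ok" if rmdir succeeds and "failed" if not.
# Assumption (unconfirmed): an open handle during removal is what triggers
# "Directory not empty" on the CIFS mount.
probe_rmdir() {
    base=$1
    dir=$(mktemp -d "$base/rmdir-probe.XXXXXX") || return 2
    : > "$dir/index.json.blob"
    exec 9< "$dir/index.json.blob"    # keep the file open on descriptor 9
    rm -f "$dir/index.json.blob"      # unlink while it is still open
    if rmdir "$dir" 2>/dev/null; then
        result=ok
    else
        result=failed
    fi
    exec 9<&-                         # close the descriptor again
    # Clean up the probe directory if the first rmdir did not succeed
    [ "$result" = ok ] || rmdir "$dir" 2>/dev/null
    echo "$result"
    [ "$result" = ok ]
}

# Example (not executed here): probe_rmdir /mnt/storage
```

On a local file system this should print "ok". If it prints "failed" on the CIFS mount, that would at least be consistent with the error above.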

While this removes the backup from both the PBS and PVE GUIs, it leaves the entire backup group in a fatally inconsistent state, because any further backups are still created incrementally:

Code:
INFO: starting new backup job: vzdump 114 --storage pbs --node test --mode snapshot --remove 0
INFO: Starting Backup of VM 114 (qemu)
INFO: Backup started at 2020-12-03 18:36:26
INFO: status = stopped
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: VM Name: meet
INFO: include disk 'scsi0' 'thin:vm-114-disk-0' 5G
INFO: creating Proxmox Backup Server archive 'vm/114/2020-12-03T17:36:26Z'
INFO: starting kvm to execute backup task
INFO: enabling encryption
INFO: started backup task '80c3acc3-59b2-48b0-ae71-c7907936211d'
INFO: scsi0: dirty-bitmap status: created new
INFO:  33% (1.7 GiB of 5.0 GiB) in  3s, read: 566.7 MiB/s, write: 373.3 MiB/s
INFO:  53% (2.7 GiB of 5.0 GiB) in  6s, read: 350.7 MiB/s, write: 324.0 MiB/s
INFO:  80% (4.0 GiB of 5.0 GiB) in  9s, read: 457.3 MiB/s, write: 340.0 MiB/s
INFO: 100% (5.0 GiB of 5.0 GiB) in 12s, read: 332.0 MiB/s, write: 278.7 MiB/s
INFO: backup is sparse: 1.14 GiB (22%) total zero data
INFO: backup was done incrementally, reused 1.14 GiB (22%)
INFO: transferred 5.00 GiB in 12 seconds (426.7 MiB/s)
INFO: stopping kvm after backup task
INFO: Finished Backup of VM 114 (00:00:14)
INFO: Backup finished at 2020-12-03 18:36:40
INFO: Backup job finished successfully
TASK OK

After this, running a garbage collection fails with another error. I had to reboot PBS to recover from this lock; remounting the CIFS storage did not help:

Code:
2020-12-03T18:45:08+01:00: starting garbage collection on store storage
2020-12-03T18:45:08+01:00: TASK ERROR: unable to get exclusive lock - EACCES: Permission denied

Running a successful garbage collection after the PBS reboot still does NOT repair the broken backups of any VM where a prune/delete was attempted: further backups are created incrementally even though no full backup exists. It looks like GC is not really deleting anything:

Code:
2020-12-03T18:52:58+01:00: starting garbage collection on store storage
2020-12-03T18:52:58+01:00: Start GC phase1 (mark used chunks)
2020-12-03T18:52:58+01:00: Start GC phase2 (sweep unused chunks)
2020-12-03T18:53:01+01:00: percentage done: phase2 1% (processed 10 chunks)
2020-12-03T18:53:03+01:00: percentage done: phase2 2% (processed 21 chunks)
<...>
2020-12-03T18:56:43+01:00: percentage done: phase2 98% (processed 966 chunks)
2020-12-03T18:56:45+01:00: percentage done: phase2 99% (processed 976 chunks)
2020-12-03T18:56:47+01:00: Removed garbage: 0 B
2020-12-03T18:56:47+01:00: Removed chunks: 0
2020-12-03T18:56:47+01:00: Pending removals: 969.33 MiB (in 988 chunks)
2020-12-03T18:56:47+01:00: Original data usage: 0 B
2020-12-03T18:56:47+01:00: On-Disk chunks: 0
2020-12-03T18:56:47+01:00: Deduplication factor: 1.00
2020-12-03T18:56:47+01:00: TASK OK

Pending removals: 969.33 MiB. So why is it not being removed? How do I actually recover from this situation?
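
One thing worth checking here: as far as I understand the PBS documentation, garbage collection decides what to sweep based on chunk access times and keeps recently touched chunks as "pending" for roughly a day, so the datastore's file system has to honor explicit atime updates. The following sketch probes that on a given directory. It assumes GNU touch and stat, the function name atime_update_works is made up, and it does not claim that atime is the cause of the behavior in this thread.

```shell
#!/bin/sh
# Sketch: check whether explicit atime updates are honored in a directory.
# Assumption: GC relies on such updates to tell used chunks from unused ones.
atime_update_works() {
    dir=$1
    f=$(mktemp "$dir/atime-probe.XXXXXX") || return 2
    # Backdate the access time, then read it back as epoch seconds
    touch -a -d '2000-01-01 00:00:00 UTC' "$f"
    old=$(stat -c %X "$f")
    # Set the access time to "now" and read it again
    touch -a "$f"
    new=$(stat -c %X "$f")
    rm -f "$f"
    # Succeeds only if the second update moved the access time forward
    [ "$new" -gt "$old" ]
}

# Example (not executed here):
#   atime_update_works /mnt/storage && echo "atime updates work"
```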

CIFS is the only option for this storage. Not being able to delete or prune on CIFS mounts is a critical bug that makes this product unusable in our environment.

Please advise.
 
Same behavior for me. Is there any way to patch this in an installation made from the ISO? I am not able to find the file I must change.
Thank you in advance!
 
The proxmox-backup-server_1.0.6-1 update has been released, which might include the aforementioned changes. On multiple installations, this update fails to install: apt just hangs forever.

Code:
The following NEW packages will be installed:
  mt-st mtx
The following packages will be upgraded:
  proxmox-backup-client proxmox-backup-docs proxmox-backup-server
3 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 13.3 MB of archives.
After this operation, 1,755 kB of additional disk space will be used.
Get:1 http://download.proxmox.com/debian/pbs buster/pbs-no-subscription amd64 proxmox-backup-client amd64 1.0.6-1 [2,511 kB]
Get:2 http://deb.debian.org/debian buster/main amd64 mt-st amd64 1.3-2 [36.2 kB]
Get:3 http://deb.debian.org/debian buster/main amd64 mtx amd64 1.3.12-12 [104 kB]
Get:4 http://download.proxmox.com/debian/pbs buster/pbs-no-subscription amd64 proxmox-backup-docs all 1.0.6-1 [2,165 kB]
Get:5 http://download.proxmox.com/debian/pbs buster/pbs-no-subscription amd64 proxmox-backup-server amd64 1.0.6-1 [8,482 kB]
debconf: unable to initialize frontend: Dialog
debconf: (TERM is not set, so the dialog frontend is not usable.)
debconf: falling back to frontend: Readline
debconf: unable to initialize frontend: Readline
debconf: (This frontend requires a controlling tty.)
debconf: falling back to frontend: Teletype
dpkg-preconfigure: unable to re-open stdin:
Fetched 13.3 MB in 0s (42.7 MB/s)
Selecting previously unselected package mt-st.
(Reading database ... 46933 files and directories currently installed.)
Preparing to unpack .../archives/mt-st_1.3-2_amd64.deb ...
Unpacking mt-st (1.3-2) ...
Selecting previously unselected package mtx.
Preparing to unpack .../mtx_1.3.12-12_amd64.deb ...
Unpacking mtx (1.3.12-12) ...
Preparing to unpack .../proxmox-backup-client_1.0.6-1_amd64.deb ...
Unpacking proxmox-backup-client (1.0.6-1) over (1.0.5-1) ...
Preparing to unpack .../proxmox-backup-docs_1.0.6-1_all.deb ...
Unpacking proxmox-backup-docs (1.0.6-1) over (1.0.4-1) ...
Preparing to unpack .../proxmox-backup-server_1.0.6-1_amd64.deb ...
Unpacking proxmox-backup-server (1.0.6-1) over (1.0.5-1) ...
Setting up proxmox-backup-docs (1.0.6-1) ...
Setting up mtx (1.3.12-12) ...
Setting up proxmox-backup-client (1.0.6-1) ...
Setting up mt-st (1.3-2) ...
update-alternatives: using /bin/mt-st to provide /bin/mt (mt) in auto mode
Setting up proxmox-backup-server (1.0.6-1) ...

Code:
21421 root       20   0 71520 59680 41356 S  0.0  3.0  0:01.46 │           └─ apt-get dist-upgrade -y
21568 root       20   0 12248  4464  2324 S  0.0  0.2  0:00.01 │              └─ /usr/bin/dpkg --status-fd 55 --configure --pending
21582 root       20   0  2384  1488  1388 S  0.0  0.1  0:00.00 │                 └─ /bin/sh /var/lib/dpkg/info/proxmox-backup-server.postinst configure 1.0.5-1
21660 root       20   0 10836  3516  3240 S  0.0  0.2  0:00.00 │                    └─ /bin/systemctl try-reload-or-restart proxmox-backup.service proxmox-backup-proxy.service

Code:
Dec 14 14:47:52 backup systemd[1]: proxmox-backup.service: Reload operation timed out. Killing reload process.
Dec 14 14:49:22 backup systemd[1]: proxmox-backup.service: Reload operation timed out. Killing reload process.
Dec 14 14:50:52 backup systemd[1]: proxmox-backup.service: Reload operation timed out. Killing reload process.
Dec 14 14:52:23 backup systemd[1]: proxmox-backup.service: Reload operation timed out. Killing reload process.
Dec 14 14:53:53 backup systemd[1]: proxmox-backup.service: Reload operation timed out. Killing reload process.
Dec 14 14:55:23 backup systemd[1]: proxmox-backup.service: Reload operation timed out. Killing reload process.
Dec 14 14:56:53 backup systemd[1]: proxmox-backup.service: Reload operation timed out. Killing reload process.
Dec 14 14:58:24 backup systemd[1]: proxmox-backup.service: Reload operation timed out. Killing reload process.
Dec 14 14:59:54 backup systemd[1]: proxmox-backup.service: Reload operation timed out. Killing reload process.
Dec 14 15:01:24 backup systemd[1]: proxmox-backup.service: Reload operation timed out. Killing reload process.
Dec 14 15:02:54 backup systemd[1]: proxmox-backup.service: Reload operation timed out. Killing reload process.
Dec 14 15:03:04 backup systemd[1]: proxmox-backup-proxy.service: Reload operation timed out. Killing reload process.
Dec 14 15:04:35 backup systemd[1]: proxmox-backup-proxy.service: Reload operation timed out. Killing reload process.

I had to manually kill the dpkg process and run dpkg --configure -a for the update to "succeed", followed by a reboot for proxmox-backup-proxy to recover. I'll test later if the update includes the fix.
 
Hi!

For me the update went smoothly.
Then I backed up a CT and removed the backup in the PBS datastore overview via the red trash bin icon, and received the same error :(

So I guess the patch wasn't integrated yet.


Also, do you know how to 'reset' a backup job for a VM/CT? When we delete a snapshot, the next backup just assumes that the last snapshot is still there, and the whole backup is damaged. Do we need to delete more files on disk via the CLI, or is it fine to just delete the folder?
 

That's a good question, which I hope will be answered by someone from Proxmox.

Since garbage collection is also broken and fails with another locking error, I was unable to recover the backup group from this situation and eventually ended up deleting and recreating the entire datastore. Hopefully there is a better way, and it would also be good to see this documented.

Code:
2020-12-03T18:45:08+01:00: starting garbage collection on store storage
2020-12-03T18:45:08+01:00: TASK ERROR: unable to get exclusive lock - EACCES: Permission denied
 
Found another/still existing issue (1.0.6-1). When a backup fails, unfinished backups cannot be removed automatically:

Code:
2020-12-20T20:01:22+01:00: backup ended and finish failed: backup ended but finished flag is not set.
2020-12-20T20:01:22+01:00: removing unfinished backup
2020-12-20T20:01:24+01:00: TASK ERROR: removing backup snapshot "/mnt/nas-11/proxmox-10/vm/102/2020-12-20T19:01:12Z" failed - Directory not empty (os error 39)

However, removing the backup (empty folder!) manually works.
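
For finding such leftovers, a read-only sketch along these lines may help. It assumes the datastore layout visible in the paths in this thread (type/id/timestamp below the datastore root) and that the backup types are vm, ct and host; the function name list_empty_snapshots is made up, and nothing is deleted.

```shell
#!/bin/sh
# Sketch: list empty snapshot directories below a datastore root.
# Assumed layout: <store>/<type>/<id>/<timestamp>, e.g. /mnt/storage/vm/114/<timestamp>
# This only prints candidates; it never removes anything.
list_empty_snapshots() {
    store=$1
    for type in vm ct host; do
        [ -d "$store/$type" ] || continue
        # Depth 2 below <store>/<type> is the snapshot (timestamp) directory
        find "$store/$type" -mindepth 2 -maxdepth 2 -type d -empty
    done
}

# Example (not executed here): list_empty_snapshots /mnt/nas-11/proxmox-10
```

Reviewing the printed list before running rmdir on any entry seems safer than removing directories in bulk.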
 
The patch I referenced should be included in 1.0.6-1.
 
I had the "Directory not empty (os error 39)" error on my NFS datastore.
The error was shown on backup pruning (trying to delete backups older than the retention).

Upgrading to 1.0.6-1 seems to have fixed it.

I upgraded yesterday then forced a "Datastore prune" and it went OK.
No more error in the datastore content and the "too old" backups were deleted.
Datastore pruning and garbage collection that happened last night were OK too.
 