Container backup fails

RobFantini

Hello

We have around 15 containers. Just one has backup failures: twice in the past 4 days.

Here is more info:
Code:
dmesg
[Sat Oct 14 08:44:46 2023] rbd: rbd1: capacity 15032385536 features 0x1d
[Sat Oct 14 08:44:46 2023] /dev/rbd-pve/220b9a53-4556-48e3-a73c-28deff665e45/nvme-4tb/vm-604-disk-0@vzdump: Can't open blockdev

ignore config line: performance max-workers=1
INFO: starting new backup job: vzdump 604 --node pve15 --notes-template '{{guestname}}' --storage y-nfs-share --remove 0 --mode snapshot --compress zstd
INFO: Starting Backup of VM 604 (lxc)
INFO: Backup started at 2023-10-14 08:44:45
INFO: status = running
INFO: CT Name: bc-sys4
INFO: including mount point rootfs ('/') in backup
INFO: excluding volume mount point mp0 ('/var/lib/bluecherry/recordings') from backup (disabled)
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: create storage snapshot 'vzdump'
Creating snap: 10% complete...
Creating snap: 100% complete...done.
/dev/rbd1
mount: /mnt/vzsnap0: special device /dev/rbd-pve/220b9a53-4556-48e3-a73c-28deff665e45/nvme-4tb/vm-604-disk-0@vzdump does not exist.
       dmesg(1) may have more information after failed mount system call.
umount: /mnt/vzsnap0/: not mounted.
command 'umount -l -d /mnt/vzsnap0/' failed: exit code 32
ERROR: Backup of VM 604 failed - command 'mount -o ro,noload /dev/rbd-pve/220b9a53-4556-48e3-a73c-28deff665e45/nvme-4tb/vm-604-disk-0@vzdump /mnt/vzsnap0//' failed: exit code 32
INFO: Failed at 2023-10-14 08:44:46
INFO: Backup job finished with errors
TASK ERROR: job errors

We are running the latest versions of PVE from
deb https://enterprise.proxmox.com/debian/pve bookworm pve-enterprise

Also, the storage config:

Code:
dir: y-nfs-share
    path /media/pbs-nfs
    content iso,vztmpl,backup
    prune-backups keep-last=1
    shared 1
 
Also, the error in the first post occurred while using NFS storage.

The same result occurs when using local storage.

604: 2023-10-15 02:10:49 INFO: Starting Backup of VM 604 (lxc)
604: 2023-10-15 02:10:49 INFO: status = running
604: 2023-10-15 02:10:49 INFO: CT Name: bc-sys4
604: 2023-10-15 02:10:49 INFO: including mount point rootfs ('/') in backup
604: 2023-10-15 02:10:49 INFO: excluding volume mount point mp0 ('/var/lib/bluecherry/recordings') from backup (disabled)
604: 2023-10-15 02:10:49 INFO: backup mode: snapshot
604: 2023-10-15 02:10:49 INFO: ionice priority: 7
604: 2023-10-15 02:10:49 INFO: create storage snapshot 'vzdump'
604: 2023-10-15 02:10:50 ERROR: Backup of VM 604 failed - command 'mount -o ro,noload /dev/rbd-pve/220b9a53-4556-48e3-a73c-28deff665e45/nvme-4tb/vm-604-disk-0@vzdump /mnt/vzsnap0//' failed: exit code 32
 
The issue seems to be with the PVE host.

I cannot restore a PCT backup; a KVM restore works okay.
Here is part of the output:

Code:
recovering backed-up configuration from 'pbs-daily:backup/ct/604/2023-08-31T20:37:10Z'
/dev/rbd0
The file /dev/rbd-pve/220b9a53-4556-48e3-a73c-28deff665e45/nvme-4tb/vm-6044-disk-0 does not exist and no size was specified.
Removing image: 1% complete...
Removing image: 2% complete...
..
TASK ERROR: unable to restore CT 6044 - command 'mkfs.ext4 -O mmp -E 'root_owner=0:0' /dev/rbd-pve/220b9a53-4556-48e3-a73c-28deff665e45/nvme-4tb/vm-6044-disk-0' failed: exit code 1


---
And another:
Code:
recovering backed-up configuration from 'pbs-daily:backup/ct/603/2023-10-16T20:16:59Z'
/dev/rbd0
The file /dev/rbd-pve/220b9a53-4556-48e3-a73c-28deff665e45/nvme-4tb/vm-117-disk-0 does not exist and no size was specified.
Removing image: 1% complete...
Removing image: 100% complete...done.
TASK ERROR: unable to restore CT 117 - command 'mkfs.ext4 -O mmp -E 'root_owner=0:0' /dev/rbd-pve/220b9a53-4556-48e3-a73c-28deff665e45/nvme-4tb/vm-117-disk-0' failed: exit code 1

KVM restore OK:
Code:
new volume ID is 'nvme-4tb:vm-117-disk-0'
restore proxmox backup image: /usr/bin/pbs-restore --repository pbs-user@pbs@10.11.12.80:daily vm/223/2023-09-16T20:11:44Z drive-scsi0.img.fidx 'rbd:nvme-4tb/vm-117-disk-0:conf=/etc/pve/ceph.conf' --verbose --format raw --skip-zero
connecting to repository 'pbs-user@pbs@10.11.12.80:daily'
open block backend for target 'rbd:nvme-4tb/vm-117-disk-0:conf=/etc/pve/ceph.conf'
starting to restore snapshot 'vm/223/2023-09-16T20:11:44Z'
download and verify backup index
progress 1% (read 218103808 bytes, zeroes = 0% (0 bytes), duration 0 sec)
progress 2% (read 432013312 bytes, zeroes = 0% (0 bytes), duration 1 sec)
progress 3% (read 645922816 bytes, zeroes = 8% (54525952 bytes), duration 2 sec)
progress 4% (read 859832320 bytes, zeroes = 6% (54525952 bytes), duration 3 sec)
progress 5% (read 1073741824 bytes, zeroes = 5% (54525952 bytes), duration 3 sec)
..
progress 100% (read 21474836480 bytes, zeroes = 37% (8078229504 bytes), duration 50 sec)
restore image complete (bytes=21474836480, duration=50.20s, speed=407.93MB/s)
rescan volumes...
TASK OK

Restoring the PCT works on another node:

Code:
recovering backed-up configuration from 'pbs-daily:backup/ct/604/2023-10-16T20:38:58Z'
/dev/rbd5
Creating filesystem with 3670016 4k blocks and 917504 inodes
Filesystem UUID: 9057994c-d097-4ec0-a47d-204b1c82f3e0
Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208
/dev/rbd6
Creating filesystem with 891289600 4k blocks and 222822400 inodes
Filesystem UUID: b258f03e-a7ce-416e-a6f0-1812774ffb99
Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
    4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
    102400000, 214990848, 512000000, 550731776, 644972544
restoring 'pbs-daily:backup/ct/604/2023-10-16T20:38:58Z' now..
Detected container architecture: amd64
merging backed-up and given configuration..
TASK OK
 
I see the PCT restore that worked above was from a different backup, so I tested restoring the same backup that had failed, this time on another node:
Code:
recovering backed-up configuration from 'pbs-daily:backup/ct/604/2023-08-31T20:37:10Z'
/dev/rbd7
Creating filesystem with 3670016 4k blocks and 917504 inodes
Filesystem UUID: 43d02f5f-bf16-4d75-8f22-8be6f349afe9
Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208
/dev/rbd8
Creating filesystem with 891289600 4k blocks and 222822400 inodes
Filesystem UUID: eb5d21e1-5c6d-47a3-b498-ab488630e6be
Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
    4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
    102400000, 214990848, 512000000, 550731776, 644972544
restoring 'pbs-daily:backup/ct/604/2023-08-31T20:37:10Z' now..
Detected container architecture: amd64
merging backed-up and given configuration..
TASK OK
 
After rebooting the node, I could restore the PCT:
Code:
recovering backed-up configuration from 'pbs-daily:backup/ct/604/2023-08-31T20:37:10Z'
/dev/rbd0
Creating filesystem with 3670016 4k blocks and 917504 inodes
Filesystem UUID: 4ff60784-0624-452e-abdf-b21ba0f165a5
Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208
/dev/rbd1
Creating filesystem with 891289600 4k blocks and 222822400 inodes
Filesystem UUID: 04adc21e-6013-4871-aa3e-9cebb8714c67
Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
    4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
    102400000, 214990848, 512000000, 550731776, 644972544
restoring 'pbs-daily:backup/ct/604/2023-08-31T20:37:10Z' now..
Detected container architecture: amd64
merging backed-up and given configuration..
TASK OK

Also, prior to the reboot I did not see anything unusual in the dmesg output.
 
Hi,
it sounds like there was an issue with the udev rule that creates the paths below /dev/rbd-pve (i.e. /usr/lib/udev/rules.d/50-rbd-pve.rules). In the VM restore task, you can see that the image was opened via librbd rather than via the /dev/ path, so that wasn't affected. Should the issue happen again, please provide the output of:
Code:
pveversion -v
stat /dev/rbd-pve
ls -l /dev/rbd-pve
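If it does happen again, it might also be worth watching the udev events while mapping a test image, to see whether the 50-rbd-pve.rules rule fires at all. This is just a rough sketch: the pool name and fsid below are the ones from this thread, and "test-image" is a placeholder that has to be created first.
Code:
# create a small throwaway image
rbd -p nvme-4tb create --size 1G test-image

# terminal 1: watch udev events for block devices (RBD map/unmap shows up here)
udevadm monitor --udev --property --subsystem-match=block

# terminal 2: map the image and check whether the /dev/rbd-pve symlink appears
rbd -p nvme-4tb map test-image
ls -l /dev/rbd-pve/220b9a53-4556-48e3-a73c-28deff665e45/nvme-4tb/
rbd -p nvme-4tb unmap test-image

# clean up
rbd -p nvme-4tb rm test-image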
 
OK, so these started again. I thought the updates and reboots after I last posted had solved the issue.
This time the backup to PBS worked, but the local vzdump failed.

Code:
# stat /dev/rbd-pve
  File: /dev/rbd-pve
  Size: 60              Blocks: 0          IO Block: 4096   directory
Device: 0,5     Inode: 2238        Links: 3
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2023-12-05 18:46:23.049870485 -0500
Modify: 2023-12-05 18:46:23.049870485 -0500
Change: 2023-12-05 18:46:23.049870485 -0500
 Birth: 2023-12-05 18:46:23.049870485 -0500
Code:
# ls -l /dev/rbd-pve
total 0
drwxr-xr-x 3 root root 60 Dec  5 18:46 220b9a53-4556-48e3-a73c-28deff665e45/
Code:
# pveversion -v
proxmox-ve: 8.1.0 (running kernel: 6.5.11-6-pve)
pve-manager: 8.1.3 (running version: 8.1.3/b46aac3b42da5d15)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.5: 6.5.11-6
proxmox-kernel-6.5.11-6-pve-signed: 6.5.11-6
proxmox-kernel-6.5.11-4-pve-signed: 6.5.11-4
proxmox-kernel-6.2.16-19-pve: 6.2.16-19
proxmox-kernel-6.2: 6.2.16-19
ceph: 18.2.0-pve2
ceph-fuse: 18.2.0-pve2
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx7
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.7
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.1.0
libpve-guest-common-perl: 5.0.6
libpve-http-server-perl: 5.0.5
libpve-network-perl: 0.9.5
libpve-rs-perl: 0.8.7
libpve-storage-perl: 8.0.5
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.1.2-1
proxmox-backup-file-restore: 3.1.2-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.2
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.3
proxmox-widget-toolkit: 4.1.3
pve-cluster: 8.0.5
pve-container: 5.0.8
pve-docs: 8.1.3
pve-edk2-firmware: 4.2023.08-2
pve-firewall: 5.0.3
pve-firmware: 3.9-1
pve-ha-manager: 4.0.3
pve-i18n: 3.1.4
pve-qemu-kvm: 8.1.2-4
pve-xtermjs: 5.3.0-2
pve-zsync: 2.3.0
qemu-server: 8.0.10
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.0-pve4

Code:
105: 2023-12-10 02:00:52 INFO: Starting Backup of VM 105 (lxc)
105: 2023-12-10 02:00:52 INFO: status = running
105: 2023-12-10 02:00:52 INFO: CT Name: ona
105: 2023-12-10 02:00:52 INFO: including mount point rootfs ('/') in backup
105: 2023-12-10 02:00:52 INFO: backup mode: snapshot
105: 2023-12-10 02:00:52 INFO: ionice priority: 7
105: 2023-12-10 02:00:52 INFO: create storage snapshot 'vzdump'
105: 2023-12-10 02:00:53 ERROR: Backup of VM 105 failed - command 'mount -o ro,noload /dev/rbd-pve/220b9a53-4556-48e3-a73c-28deff665e45/nvme-4tb/vm-105-disk-0@vzdump /mnt/vzsnap0//' failed: exit code 32

* Notes - I can now back up the container that failed, so the first two reports may not help.
At the time of the last failure, retrying the backup would not work.
Only one PCT backup failed out of approximately 20.
The failure happened yesterday and today to two different PCTs on the same node.
 
What does the load on the server and on the storage look like? The only thing I can imagine is that there could be a short time window between mapping the RBD snapshot and the device link appearing, but this is just speculation and I wasn't able to reproduce a problematic situation on my end.

What you could try is creating a test image/snapshot and running
Code:
rbd -p <pool> map <test image>@<test snapshot> && stat /dev/rbd-pve/<your fsid>/<pool>/<test image>@<test snapshot> && rbd -p <pool> unmap <test image>@<test snapshot>
multiple times to see if it ever fails.
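For example, a small loop along these lines could run that test repeatedly. It is only a sketch: the pool and fsid are the ones from this thread, and the image/snapshot names are placeholders for the test image/snapshot you created.
Code:
#!/bin/bash
# Repeatedly map an RBD snapshot, check that the udev-created device link
# exists, then unmap again. Stops on the first failure.
# POOL and FSID are taken from this thread; IMG/SNAP are placeholders.
POOL=nvme-4tb
FSID=220b9a53-4556-48e3-a73c-28deff665e45
IMG=test-image
SNAP=test-snap

for i in $(seq 1 100); do
    rbd -p "$POOL" map "$IMG@$SNAP" || { echo "run $i: map failed"; exit 1; }
    if ! stat "/dev/rbd-pve/$FSID/$POOL/$IMG@$SNAP" > /dev/null; then
        echo "run $i: device link missing"
        rbd -p "$POOL" unmap "$IMG@$SNAP"
        exit 1
    fi
    rbd -p "$POOL" unmap "$IMG@$SNAP" || { echo "run $i: unmap failed"; exit 1; }
done
echo "all runs OK"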
 
