Backup job failed with err -11 on 2 of 6 VMs

Tealk · Jul 14, 2021

Hello,

Yesterday I upgraded the backup server and VE to the latest version. Now I get the following error message during the daily backup of 2 VMs:

Code:

INFO: Starting Backup of VM 104 (qemu)
INFO: Backup started at 2021-07-14 00:34:38
INFO: status = running
INFO: VM Name: Mastodon
INFO: include disk 'scsi0' 'data:vm-104-disk-0' 80G
INFO: include disk 'scsi1' 'HDD:vm-104-disk-0' 300G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/104/2021-07-13T22:34:38Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task 'c24491a3-6755-47ec-b1c1-18fafafe64a6'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: created new
INFO: scsi1: dirty-bitmap status: created new
INFO:   0% (112.0 MiB of 380.0 GiB) in 1s, read: 112.0 MiB/s, write: 12.0 MiB/s
ERROR: job failed with err -11 - Resource temporarily unavailable
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 104 failed - job failed with err -11 - Resource temporarily unavailable
INFO: Failed at 2021-07-14 00:34:42
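
For anyone decoding the message: "Resource temporarily unavailable" is the standard Linux description of errno 11 (EAGAIN), so the -11 in the log reads like a negated errno. Below is a minimal Python sketch under that assumption. The helper names extract_errno and describe are made up for illustration and are not part of any Proxmox tool.

```python
import errno
import os
import re

# Matches the vzdump error line shown in the log above.
ERR_LINE = re.compile(r"job failed with err (-?\d+)")


def extract_errno(log_line):
    """Return the positive error number from a 'job failed with err -N' line, or None."""
    match = ERR_LINE.search(log_line)
    if match is None:
        return None
    return abs(int(match.group(1)))


def describe(code):
    """Return the symbolic name and message for an errno on the current platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


line = "ERROR: job failed with err -11 - Resource temporarily unavailable"
code = extract_errno(line)
print(code, describe(code))
```

The symbolic name and message come from the platform running the script, so run it on a Linux host to compare against the log.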

fiona · Jul 14, 2021

Hi,
is there anything obviously different with those two VMs? What does the task log on the backup server say? Please also share the output of qm status 104 --verbose and qm config 104.

Tealk · Jul 14, 2021

Thanks for the quick reply.
The only difference I can think of is that the 2 VMs have a second hard drive, an HDD.

Log of the backup server:

Code:

2021-07-14T00:34:39+02:00: starting new backup on datastore 'backup': "vm/104/2021-07-13T22:34:38Z"
2021-07-14T00:34:39+02:00: download 'index.json.blob' from previous backup.
2021-07-14T00:34:39+02:00: register chunks in 'drive-scsi0.img.fidx' from previous backup.
2021-07-14T00:34:39+02:00: download 'drive-scsi0.img.fidx' from previous backup.
2021-07-14T00:34:39+02:00: created new fixed index 1 ("vm/104/2021-07-13T22:34:38Z/drive-scsi0.img.fidx")
2021-07-14T00:34:39+02:00: register chunks in 'drive-scsi1.img.fidx' from previous backup.
2021-07-14T00:34:40+02:00: download 'drive-scsi1.img.fidx' from previous backup.
2021-07-14T00:34:40+02:00: created new fixed index 2 ("vm/104/2021-07-13T22:34:38Z/drive-scsi1.img.fidx")
2021-07-14T00:34:40+02:00: add blob "/home/backup/vm/104/2021-07-13T22:34:38Z/qemu-server.conf.blob" (397 bytes, comp: 397)
2021-07-14T00:34:41+02:00: backup ended and finish failed: backup ended but finished flag is not set.
2021-07-14T00:34:41+02:00: removing unfinished backup
2021-07-14T00:34:41+02:00: TASK ERROR: backup ended but finished flag is not set.

Code:

qm status 104 --verbose
balloon: 8589934592
ballooninfo:
        actual: 8589934592
        free_mem: 198492160
        last_update: 1626246160
        major_page_faults: 3585
        max_mem: 8589934592
        mem_swapped: 2789376
        mem_swapped_out: 83492864
        minor_page_faults: 6142419
        total_mem: 8365924352
blockstat:
        ide2:
                account_failed: 0
                account_invalid: 0
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 0
                flush_total_time_ns: 0
                idle_time_ns: 35003760317954
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 152
                rd_merged: 0
                rd_operations: 4
                rd_total_time_ns: 90541
                timed_stats:
                unmap_bytes: 0
                unmap_merged: 0
                unmap_operations: 0
                unmap_total_time_ns: 0
                wr_bytes: 0
                wr_highest_offset: 0
                wr_merged: 0
                wr_operations: 0
                wr_total_time_ns: 0
        scsi0:
                account_failed: 1
                account_invalid: 1
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 68495
                flush_total_time_ns: 38268790603
                idle_time_ns: 31899838
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 1500784128
                rd_merged: 0
                rd_operations: 124092
                rd_total_time_ns: 32151319986
                timed_stats:
                unmap_bytes: 0
                unmap_merged: 0
                unmap_operations: 0
                unmap_total_time_ns: 0
                wr_bytes: 1639976960
                wr_highest_offset: 85290221568
                wr_merged: 0
                wr_operations: 104454
                wr_total_time_ns: 586745053341
        scsi1:
                account_failed: 1
                account_invalid: 1
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 283
                flush_total_time_ns: 3696962539
                idle_time_ns: 68392931333
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 73031680
                rd_merged: 0
                rd_operations: 9976
                rd_total_time_ns: 61496760759
                timed_stats:
                unmap_bytes: 0
                unmap_merged: 0
                unmap_operations: 0
                unmap_total_time_ns: 0
                wr_bytes: 26660864
                wr_highest_offset: 110225338368
                wr_merged: 0
                wr_operations: 1418
                wr_total_time_ns: 1211161675
cpus: 4
disk: 0
diskread: 1573815960
diskwrite: 1666637824
freemem: 198492160
maxdisk: 85899345920
maxmem: 8589934592
mem: 8167432192
name: Mastodon
netin: 115338897
netout: 452714091
nics:
        tap104i0:
                netin: 115338897
                netout: 452714091
pid: 2320
proxmox-support:
        pbs-dirty-bitmap: 1
        pbs-dirty-bitmap-migration: 1
        pbs-dirty-bitmap-savevm: 1
        pbs-library-version: 1.2.0 (6e555bc73a7dcfb4d0b47355b958afd101ad27b5)
        pbs-masterkey: 1
        query-bitmap-info: 1
qmpstatus: running
running-machine: pc-i440fx-6.0+pve0
running-qemu: 6.0.0
status: running
uptime: 35015
vmid: 104

Code:

qm config 104
agent: 1
bootdisk: scsi0
cores: 4
description: Mastodon Instanz rollenspiel.social%0ASupport support.rollenspiel.monster%0AElasticSearch%0AHalcyon halcyon.rollenspiel.social
ide2: none,media=cdrom
memory: 8192
name: Mastodon
net0: virtio=16:AD:10:97:4E:94,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: data:vm-104-disk-0,size=80G
scsi1: HDD:vm-104-disk-0,size=300G
scsihw: virtio-scsi-pci
smbios1: uuid=578327ef-b5af-4724-b23b-98d3e6095464
sockets: 1
vmgenid: cc996122-4b6a-471c-9361-0ef607f4c9ce

fiona · Jul 14, 2021

Could you also share the storage configuration in /etc/pve/storage.cfg for data and HDD? Are the other VMs also using those storages?

Tealk · Jul 14, 2021

Code:

lvmthin: data
    thinpool data
    vgname vg0
    content images,rootdir

lvm: HDD
    vgname HDD
    content rootdir,images
    shared 0
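
To see at a glance which storages are plain LVM and which are LVM-thin, the entries can be grouped by type. This is a minimal Python sketch, assuming the simple layout shown above (a "type: name" header followed by "key value" lines). The function name parse_storage_cfg is hypothetical, and the parser is not a complete implementation of the storage.cfg format.

```python
import re

# A section header looks like "lvmthin: data"; property lines have no colon-header form.
HEADER = re.compile(r"^([\w-]+):\s*(\S+)\s*$")


def parse_storage_cfg(text):
    """Parse storage.cfg-style text into {storage_name: {"type": ..., key: value}}."""
    storages = {}
    current = None
    for raw in text.splitlines():
        if not raw.strip():
            continue
        header = HEADER.match(raw)
        if header:
            current = {"type": header.group(1)}
            storages[header.group(2)] = current
        elif current is not None:
            # Property line: the first word is the key, the rest is the value.
            key, _, value = raw.strip().partition(" ")
            current[key] = value.strip()
    return storages


SAMPLE = """\
lvmthin: data
    thinpool data
    vgname vg0
    content images,rootdir

lvm: HDD
    vgname HDD
    content rootdir,images
    shared 0
"""

storages = parse_storage_cfg(SAMPLE)
plain_lvm = [name for name, cfg in storages.items() if cfg["type"] == "lvm"]
print(plain_lvm)
```

The sample text is the configuration posted above; on a real host the input would be the contents of /etc/pve/storage.cfg.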

ulf · Jul 14, 2021

I can confirm that I am experiencing the same problem as member Tealk.

fiona · Jul 15, 2021

Hi,

ulf said:
I can confirm that I am experiencing the same problem as member Tealk.

I wasn't able to reproduce the issue yet. Could you also share the VM and storage configuration and backup logs from PVE and PBS?

Tealk · Jul 15, 2021

I noticed something else: I have a template VM whose hard drive is also on the HDD storage, and the backup fails there too.
Could it be that the HDD somehow responds too slowly, or is there something you have to set up for HDDs?

labsit · Jul 15, 2021

I had the same problem...

The only backup that worked was to an HDD in directory mode.
Backing up to Proxmox Backup Server threw the same error.
I solved this by deleting the VM and recreating it from the backup. After that I was able to back up to PBS.

Scenario:
2 nodes
node 1 on VE 6.4, node 2 on VE 7

Coincidentally, the two VMs that threw the error had been live-migrated from VE 6.4 to VE 7.

I hope this helps. I am still watching to make sure the problem does not reappear.

ulf · Jul 15, 2021

Fabian_E said:
Hi,

I wasn't able to reproduce the issue yet. Could you also share the VM and storage configuration and backup logs from PVE and PBS?

root@uler:~# qm status 300 --verbose
balloon: 8589934592
ballooninfo:
        actual: 8589934592
        free_mem: 639750144
        last_update: 1626359572
        major_page_faults: 4772
        max_mem: 8589934592
        mem_swapped_in: 8716288
        mem_swapped_out: 46297088
        minor_page_faults: 62295771
        total_mem: 8348606464
blockstat:
        ide2:
                account_failed: 0
                account_invalid: 0
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 0
                flush_total_time_ns: 0
                idle_time_ns: 69277138242863
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 152
                rd_merged: 0
                rd_operations: 4
                rd_total_time_ns: 59752
                timed_stats:
                unmap_bytes: 0
                unmap_merged: 0
                unmap_operations: 0
                unmap_total_time_ns: 0
                wr_bytes: 0
                wr_highest_offset: 0
                wr_merged: 0
                wr_operations: 0
                wr_total_time_ns: 0
        scsi0:
                account_failed: 1
                account_invalid: 1
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 380391
                flush_total_time_ns: 758130428072
                idle_time_ns: 4869075389
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 2075280896
                rd_merged: 0
                rd_operations: 48543
                rd_total_time_ns: 23693577488
                timed_stats:
                unmap_bytes: 0
                unmap_merged: 0
                unmap_operations: 0
                unmap_total_time_ns: 0
                wr_bytes: 30786204672
                wr_highest_offset: 66574340096
                wr_merged: 0
                wr_operations: 681677
                wr_total_time_ns: 405193433546
        scsi1:
                account_failed: 1
                account_invalid: 1
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 0
                flush_total_time_ns: 0
                idle_time_ns: 69276926003869
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 4423680
                rd_merged: 0
                rd_operations: 197
                rd_total_time_ns: 15032373
                timed_stats:
                unmap_bytes: 0
                unmap_merged: 0
                unmap_operations: 0
                unmap_total_time_ns: 0
                wr_bytes: 0
                wr_highest_offset: 0
                wr_merged: 0
                wr_operations: 0
                wr_total_time_ns: 0
        scsi2:
                account_failed: 1
                account_invalid: 1
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 534
                flush_total_time_ns: 34865582089
                idle_time_ns: 109913799039
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 16947425280
                rd_merged: 0
                rd_operations: 67923
                rd_total_time_ns: 363354230406
                timed_stats:
                unmap_bytes: 0
                unmap_merged: 0
                unmap_operations: 0
                unmap_total_time_ns: 0
                wr_bytes: 857153536
                wr_highest_offset: 2248799592448
                wr_merged: 0
                wr_operations: 12793
                wr_total_time_ns: 21682084009
cpus: 3
disk: 0
diskread: 19027130008
diskwrite: 31643358208
freemem: 639750144
maxdisk: 68719476736
maxmem: 8589934592
mem: 7708856320
name: UB20NextcULER
netin: 3268590434
netout: 88922364
nics:
        tap300i0:
                netin: 3268590434
                netout: 88922364
pid: 4959
proxmox-support:
        pbs-dirty-bitmap: 1
        pbs-dirty-bitmap-migration: 1
        pbs-dirty-bitmap-savevm: 1
        pbs-library-version: 1.2.0 (6e555bc73a7dcfb4d0b47355b958afd101ad27b5)
        pbs-masterkey: 1
        query-bitmap-info: 1
qmpstatus: running
running-machine: pc-i440fx-6.0+pve0
running-qemu: 6.0.0
status: running
uptime: 69285
vmid: 300
root@uler:~#

__________________________________________________________________-
dir: local
    path /var/lib/vz
    content iso,backup,vztmpl

lvmthin: local-lvm
    thinpool data
    vgname pve
    content images,rootdir

dir: ssd2tb
    path /mnt/ssd2tb
    content images,iso
    prune-backups keep-all=1
    shared 0

dir: nextcldata
    path /mnt/nextcldata
    content images
    prune-backups keep-all=1
    shared 0

ulf · Jul 15, 2021

INFO: starting new backup job: vzdump 300 --mode snapshot --node uler --remove 0 --storage ulerLokal
INFO: Starting Backup of VM 300 (qemu)
INFO: Backup started at 2021-07-15 20:32:06
INFO: status = running
INFO: VM Name: UB20NextcULER
INFO: include disk 'scsi0' 'local-lvm:vm-300-disk-0' 64G
INFO: include disk 'scsi1' 'local-lvm:vm-300-disk-1' 1G
INFO: include disk 'scsi2' '/dev/disk/by-id/ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E2KCR432-part1' 3815446M
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: snapshots found (not included into backup)
INFO: creating Proxmox Backup Server archive 'vm/300/2021-07-15T18:32:06Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '523092c7-8902-430f-a402-17d224f8904a'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: existing bitmap was invalid and has been cleared
INFO: scsi1: dirty-bitmap status: existing bitmap was invalid and has been cleared
INFO: scsi2: dirty-bitmap status: existing bitmap was invalid and has been cleared
INFO: 0% (112.0 MiB of 3.7 TiB) in 1s, read: 112.0 MiB/s, write: 100.0 MiB/s
ERROR: job failed with err -11 - Resource temporarily unavailable
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 300 failed - job failed with err -11 - Resource temporarily unavailable
INFO: Failed at 2021-07-15 20:32:08
INFO: Backup job finished with errors
TASK ERROR: job errors

Tealk · Jul 15, 2021

Same problem with the CTs.
The CTs were set up under 7.0.

Code:

INFO: starting new backup job: vzdump 202 --remove 0 --mode snapshot --node AnzahProxmox --storage ProxBack
INFO: Starting Backup of VM 202 (lxc)
INFO: Backup started at 2021-07-15 22:22:51
INFO: status = running
INFO: CT Name: WEB
INFO: including mount point rootfs ('/') in backup
INFO: including mount point mp0 ('/mnt/HDD') in backup
INFO: mode failure - some volumes do not support snapshots
INFO: trying 'suspend' mode instead
INFO: backup mode: suspend
INFO: ionice priority: 7
INFO: CT Name: WEB
INFO: including mount point rootfs ('/') in backup
INFO: including mount point mp0 ('/mnt/HDD') in backup
INFO: starting first sync /proc/1380358/root/ to /var/tmp/vzdumptmp1852893_202
ERROR: Backup of VM 202 failed - command 'rsync --stats -h -X -A --numeric-ids -aH --delete --no-whole-file --sparse --one-file-system --relative '--exclude=/tmp/?*' '--exclude=/var/tmp/?*' '--exclude=/var/run/?*.pid' /proc/1380358/root//./ /proc/1380358/root//./mnt/HDD /var/tmp/vzdumptmp1852893_202' failed: exit code 11
INFO: Failed at 2021-07-15 22:33:16
INFO: Backup job finished with errors
TASK ERROR: job errors

Tealk · Jul 16, 2021

I just tested what happens when I exclude the HDD from the backup: the backup then works.

17:35 is the backup without the HDD
17:36 is the backup with the HDD

Tested with

Code:

proxmox-backup-client: 2.0.4-1
proxmox-backup-file-restore: 2.0.4-1
proxmox-widget-toolkit: 3.3-4

Tealk · Jul 17, 2021

Is there anything else I can do to help find the error, @Fabian_E?
I feel very uncomfortable without backups.

fiona · Jul 19, 2021

For the VMs, could you try using aio=threads for the drive(s) residing on LVM, reboot (or shutdown) the VM and see if it works then? Needs to be manually added in the VM config, e.g.

Code:

scsi1: lvm:vm-107-disk-1,size=4G,aio=threads

Tealk · Jul 19, 2021

Fabian_E said:
For the VMs, could you try using aio=threads for the drive(s) residing on LVM, reboot (or shutdown) the VM and see if it works then? Needs to be manually added in the VM config, e.g.

Code:

scsi1: lvm:vm-107-disk-1,size=4G,aio=threads

Where exactly do I enter this? Where is this config file?

fiona · Jul 19, 2021

The configuration file is in /etc/pve/qemu-server/<ID>.conf or /etc/pve/nodes/<nodename>/qemu-server/<ID>.conf if you're not on the same node.
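
As a rough illustration of the edit being suggested, here is a minimal Python sketch that takes one drive line from the VM config and returns it with aio=threads set. The helper name with_aio_threads is hypothetical. It only builds the new line as a string and does not touch any file under /etc/pve, so the result still has to be put into the config by hand.

```python
def with_aio_threads(config_line):
    """Return a drive line with aio=threads set, replacing any existing aio= option."""
    key, sep, value = config_line.partition(": ")
    if not sep:
        raise ValueError(f"not a 'key: value' config line: {config_line!r}")
    # Drop any aio= setting already present, then append the new one.
    options = [opt for opt in value.strip().split(",") if not opt.startswith("aio=")]
    options.append("aio=threads")
    return f"{key}: {','.join(options)}"


print(with_aio_threads("scsi1: HDD:vm-107-disk-0,size=100G"))
```

Only the drive lines for disks on the affected storage need the option; other lines in the config stay as they are.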

Tealk · Jul 19, 2021

Code:

scsi0: data:vm-107-disk-0,size=50G
scsi1: HDD:vm-107-disk-0,size=100G,aio=threads

With the settings above, the backup is running right now. It has not finished yet, but it did not stop at the HDD either.

Zelario · Jul 19, 2021

Hi, I have been getting this exact same error since upgrading to v7. It happens on two different backup jobs. Both back up the same VMs (one Ubuntu, one Win Server), but one goes to a local USB HDD and the other goes to a mounted network share.

I am trying again now with the aio=threads option added, and initially it seems to be working.

fiona · Jul 19, 2021

Tealk said:
Code:

scsi0: data:vm-107-disk-0,size=50G
scsi1: HDD:vm-107-disk-0,size=100G,aio=threads

With the settings above, the backup is running right now. It has not finished yet, but it did not stop at the HDD either.

Zelario said:
Hi, I have been getting this exact same error since upgrading to v7. It happens on two different backup jobs. Both back up the same VMs (one Ubuntu, one Win Server), but one goes to a local USB HDD and the other goes to a mounted network share.

I am trying again now with the aio=threads option added, and initially it seems to be working.

Thanks to both of you for reporting! @Zelario could you also share the storage configuration and the VM configuration for the affected VMs? Is there any LVM involved?
