Backup fails when a VM disk is on a specific storage

j5boot

I've been testing Proxmox Backup Server for a couple of days now, and I've found that some machines are not backing up.

On Proxmox VE I have two different storages: one called Smith (on mechanical disks with hardware RAID) and another called MaquinasSSD, backed by an SSD.

Any machine that has a disk in the MaquinasSSD storage fails to back up; it gives the following error:

Code:
2021-07-23T06:38:59+02:00: starting new backup on datastore 'Data': "vm/1308/2021-07-23T04:38:58Z"
2021-07-23T06:38:59+02:00: GET /previous: 400 Bad Request: no valid previous backup
2021-07-23T06:38:59+02:00: created new fixed index 1 ("vm/1308/2021-07-23T04:38:58Z/drive-virtio0.img.fidx")
2021-07-23T06:38:59+02:00: created new fixed index 2 ("vm/1308/2021-07-23T04:38:58Z/drive-virtio1.img.fidx")
2021-07-23T06:38:59+02:00: add blob "/mnt/datastore/Data/vm/1308/2021-07-23T04:38:58Z/qemu-server.conf.blob" (375 bytes, comp: 375)
2021-07-23T06:39:00+02:00: backup failed: connection error: bytes remaining on stream
2021-07-23T06:39:00+02:00: removing failed backup
2021-07-23T06:39:00+02:00: POST /fixed_chunk: 400 Bad Request: error reading a body from connection: broken pipe
2021-07-23T06:39:00+02:00: POST /fixed_chunk: 400 Bad Request: error reading a body from connection: broken pipe
2021-07-23T06:39:00+02:00: POST /fixed_chunk: 400 Bad Request: backup already marked as finished.
2021-07-23T06:39:00+02:00: TASK ERROR: connection error: bytes remaining on stream
2021-07-23T06:39:00+02:00: POST /fixed_chunk: 400 Bad Request: backup already marked as finished.
2021-07-23T06:39:00+02:00: POST /fixed_chunk: 400 Bad Request: backup already marked as finished.
2021-07-23T06:39:00+02:00: POST /fixed_chunk: 400 Bad Request: backup already marked as finished.
2021-07-23T06:39:00+02:00: POST /fixed_chunk: 400 Bad Request: backup already marked as finished.

If I move the disk from MaquinasSSD to Smith, the backup is successful.
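
In case it helps, a move like this can be done with qm move_disk (VMID and disk name here are illustrative; the target is the LVM storage backed by the Smith VG, called "Maquinas" in the storage.cfg further down):

Code:
# Move the disk from MaquinasSSD to the storage backed by the Smith VG
# --delete removes the old volume on MaquinasSSD once the move has finished
qm move_disk 1308 virtio0 Maquinas --delete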

Any idea what could be happening? Both the Smith and MaquinasSSD storages have the same configuration; they are LVM-based.

It only happens with machines that have at least one disk in the MaquinasSSD storage, regardless of whether they are powered on or off and regardless of the guest operating system.

Thank you very much

pvs command
Code:
  PV         VG          Fmt  Attr PSize    PFree 
  /dev/sda   Smith       lvm2 a--    <7,28t <875,58g
  /dev/sdb2  MaquinasSSD lvm2 a--  <417,13g <209,13g
(PVE itself is installed on sdb1.)

vgs command
Code:
  VG          #PV #LV #SN Attr   VSize    VFree 
  MaquinasSSD   1   5   0 wz--n- <417,13g <209,13g
  Smith         1  72   0 wz--n-   <7,28t <875,58g

/etc/pve/storage.cfg

Code:
dir: local
        path /var/lib/vz
        content rootdir,images,snippets
        prune-backups keep-last=1
        shared 0

lvm: Maquinas
        vgname Smith
        content images,rootdir
        shared 0

lvm: MaquinasSSD
        vgname MaquinasSSD
        content images,rootdir
        shared 0

nfs: Seraph
        export /volume1/VZ
        path /mnt/pve/Seraph
        server 10.0.1.100
        content vztmpl,iso,backup
        prune-backups keep-last=4

dir: LaboratorioVirtual
        path /mnt/pve/Seraph/LaboratorioVirtual
        content vztmpl,iso
        shared 1

pbs: Sion
        datastore Data
        server 10.0.1.99
        content backup
        fingerprint xx:xx:xx:xx:xx:xx:xx
        prune-backups keep-all=1
        username root@pam

PVE version is 7.0-10
Code:
proxmox-ve: 7.0-2 (running kernel: 5.11.22-2-pve)
pve-manager: 7.0-10 (running version: 7.0-10/d2f465d3)
pve-kernel-5.11: 7.0-5
pve-kernel-helper: 7.0-5
pve-kernel-5.4: 6.4-4
pve-kernel-5.11.22-2-pve: 5.11.22-3
pve-kernel-5.4.124-1-pve: 5.4.124-1
pve-kernel-5.4.65-1-pve: 5.4.65-1
ceph-fuse: 14.2.21-1
corosync: 3.1.2-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: 0.8.36
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.21-pve1
libproxmox-acme-perl: 1.2.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.0-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-5
libpve-guest-common-perl: 4.0-2
libpve-http-server-perl: 4.0-2
libpve-storage-perl: 7.0-9
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.9-4
lxcfs: 4.0.8-pve2
novnc-pve: 1.2.0-3
openvswitch-switch: 2.15.0+ds1-2
proxmox-backup-client: 2.0.4-1
proxmox-backup-file-restore: 2.0.4-1
proxmox-mini-journalreader: 1.2-1
proxmox-widget-toolkit: 3.3-5
pve-cluster: 7.0-3
pve-container: 4.0-8
pve-docs: 7.0-5
pve-edk2-firmware: 3.20200531-1
pve-firewall: 4.2-2
pve-firmware: 3.2-4
pve-ha-manager: 3.3-1
pve-i18n: 2.4-1
pve-qemu-kvm: 6.0.0-2
pve-xtermjs: 4.12.0-1
qemu-server: 7.0-10
smartmontools: 7.2-pve2
spiceterm: 3.2-2
vncterm: 1.7-1
zfsutils-linux: 2.0.5-pve1

PBS version is 2.0-4, installed from the ISO
Code:
proxmox-backup-server 2.0.6-1 running version: 2.0.4


Edit (11:53), additional info:

I updated proxmox-backup-client to 2.0.7-1 and the same thing happens.
 
Can you post the relevant PVE backup task log too?
 
Yes, this is the log:

Code:
INFO: starting new backup job: vzdump 1501 --remove 0 --node smith --storage Sion --mode snapshot
INFO: Starting Backup of VM 1501 (qemu)
INFO: Backup started at 2021-07-23 13:42:26
INFO: status = running
INFO: VM Name: matlab.lab.inf.uva.es
INFO: include disk 'scsi0' 'MaquinasSSD:vm-1501-disk-0' 70G
INFO: exclude disk 'scsi1' 'Maquinas:vm-1501-disk-0' (backup=no)
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/1501/2021-07-23T11:42:26Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task 'bc0fbd71-0940-4c92-bfde-0ba6380f6998'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: created new
INFO:   0% (244.0 MiB of 70.0 GiB) in 3s, read: 81.3 MiB/s, write: 81.3 MiB/s
...
INFO:  98% (68.9 GiB of 70.0 GiB) in 44m 5s, read: 22.0 MiB/s, write: 16.6 MiB/s
ERROR: job failed with err -11 - Resource temporarily unavailable
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 1501 failed - job failed with err -11 - Resource temporarily unavailable
INFO: Failed at 2021-07-23 14:26:37
INFO: Backup job finished with errors
TASK ERROR: job errors

I have been trying more things, and if I set the aio=threads parameter on the problematic disk, the backup completes.
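
In case someone else hits this: the aio option can be changed on the existing disk with qm set, roughly like below (VMID and volume name taken from the task log above; any other options already set on that disk, e.g. cache or discard, have to be repeated on the same line):

Code:
# Re-attach scsi0 with aio=threads instead of the io_uring default
# The change only takes effect after a full stop/start of the VM
qm set 1501 --scsi0 MaquinasSSD:vm-1501-disk-0,aio=threads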
 
I have been trying more things, and if I set the aio=threads parameter on the problematic disk, the backup completes.
OK, then that's the issue and the workaround.
There seem to be quite a few users who have problems with the new default of io_uring.
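
For completeness, the aio setting ends up as an option on the disk line in the VM config, roughly like this (volume and size taken from the task log above; other options may differ):

Code:
# /etc/pve/qemu-server/1501.conf (illustrative excerpt)
scsi0: MaquinasSSD:vm-1501-disk-0,aio=threads,size=70G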
 
