Backup of shutdown VM that uses assigned pci devices

Tobbe

Oct 4, 2021
Hi

I have run into a slightly odd problem. I can partly understand it, but it is still strange.
I have two VMs, an old one and a new one; the old one is from before an upgrade I did.
I wanted to back up the old VM, which is shut down.

So I did a one-time backup the normal way, and it failed.
BUT the bad part is that it ALSO screwed up the running VM quite badly, to the point that I had to forcibly stop it and restart it.

The backup log in question:
Code:
INFO: starting new backup job: vzdump 104 --storage backup --node pve1 --remove 0 --mode snapshot --compress zstd
INFO: Starting Backup of VM 104 (qemu)
INFO: Backup started at 2021-10-04 15:43:56
INFO: status = stopped
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: VM Name: asdf
INFO: include disk 'scsi0' 'local-ssd:104/vm-104-disk-0.qcow2' 8G
INFO: creating vzdump archive '/mnt/pve/backup/dump/vzdump-qemu-104-2021_10_04-15_43_56.vma.zst'
INFO: starting kvm to execute backup task
kvm: -device vfio-pci,host=0000:01:00.0,id=hostpci0,bus=pci.0,addr=0x10: vfio 0000:01:00.0: failed to open /dev/vfio/15: Device or resource busy
ERROR: Backup of VM 104 failed - start failed: QEMU exited with code 1
INFO: Failed at 2021-10-04 15:43:56
INFO: Backup job finished with errors
TASK ERROR: job errors

Note the line about vfio-pci.
Clearly, when doing a backup, Proxmox starts up the VM AND doesn't pay attention to the assigned PCI devices. It then probably steps right over the running VM, since the old VM is a direct copy of the running one, including the same assigned hardware devices (one of the reasons I have no intention of starting the old VM and am only doing a backup).

Several questions:
Why is the backup process touching any of the pci devices in this case?
Even if it had to, why doesn't it take into account that the device is already active in another vm and then fail gracefully while trying to do the backup?
 
Why is the backup process touching any of the pci devices in this case?
The backup must start the VM, and as part of the normal start sequence it resets the device, prepares the VFIO group, etc.

Even if it had to, why doesn't it take into account that the device is already active in another vm and then fail gracefully while trying to do the backup?
PCI assignment is per VM, and there are currently no safeguards against dual assignment etc. We know about this and want to improve it, but there is no time frame.

What you could do for now is add a VM hookscript that checks for this in the 'pre-start' phase and aborts.
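
A rough sketch of such a hookscript in Python, in case it helps. It is not an official Proxmox script, and several things in it are assumptions: that the hookscript is called with the VM ID and the phase as its two arguments, that a non-zero exit in 'pre-start' aborts the start, that the VM configs are readable under /etc/pve/qemu-server/, and that a running VM has a PID file at /var/run/qemu-server/<vmid>.pid. The helper names and the file name pci-guard.py are made up for illustration. It only handles plain addresses in hostpciN lines and compares them by domain:bus:slot. Whether the temporary start done by vzdump actually runs the hookscript should be verified on a test VM first.

```python
#!/usr/bin/env python3
"""Hypothetical pre-start guard: refuse to start a VM whose hostpci
devices are already assigned to another running VM on this node."""
import glob
import os
import re
import sys

CONF_DIR = "/etc/pve/qemu-server"  # assumed location of the VM configs
PID_DIR = "/var/run/qemu-server"   # assumed location of the QEMU PID files

HOSTPCI_RE = re.compile(r"^hostpci\d+:\s*(\S+)")


def normalize(addr):
    """Reduce a PCI address to domain:bus:slot, e.g. 01:00.0 -> 0000:01:00."""
    addr = addr.lower()
    if addr.count(":") == 1:      # no PCI domain given, assume 0000
        addr = "0000:" + addr
    return addr.split(".")[0]     # drop the function number


def parse_hostpci(config_text):
    """Return the set of passed-through PCI slots in a VM config."""
    devices = set()
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("["):  # snapshot sections follow, ignore them
            break
        match = HOSTPCI_RE.match(line)
        if not match:
            continue
        first = match.group(1).split(",")[0]
        if first.startswith("host="):
            first = first[len("host="):]
        if "=" in first:          # anything else is not handled here
            continue
        for addr in first.split(";"):
            devices.add(normalize(addr))
    return devices


def find_conflicts(vmid, configs, running):
    """Map each other running VM to the PCI slots it shares with vmid."""
    mine = parse_hostpci(configs[vmid])
    conflicts = {}
    for other, text in configs.items():
        if other == vmid or other not in running:
            continue
        shared = mine & parse_hostpci(text)
        if shared:
            conflicts[other] = shared
    return conflicts


def is_running(vmid):
    """Treat a VM as running if its PID file points at a live process."""
    try:
        with open(os.path.join(PID_DIR, f"{vmid}.pid")) as fh:
            pid = fh.read().strip()
    except OSError:
        return False
    return pid.isdigit() and os.path.exists(f"/proc/{pid}")


def main(vmid, phase):
    if phase != "pre-start":
        return 0
    configs = {}
    for path in glob.glob(os.path.join(CONF_DIR, "*.conf")):
        with open(path) as fh:
            configs[os.path.basename(path)[:-len(".conf")]] = fh.read()
    if vmid not in configs:
        return 0
    running = {v for v in configs if is_running(v)}
    conflicts = find_conflicts(vmid, configs, running)
    for other, devs in sorted(conflicts.items()):
        print(f"VM {vmid}: PCI device(s) {sorted(devs)} in use by running VM {other}",
              file=sys.stderr)
    return 1 if conflicts else 0


if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv[1], sys.argv[2]))
```

To try it, the script would have to be executable and live on a storage that allows snippets, then be attached with something like `qm set 104 --hookscript local:snippets/pci-guard.py` (storage and file name are examples).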
 
OK, thanks.
It was the lack of locking around the use of PCI devices that was a bit unexpected.

I came from a setup using libvirt and virt-manager, which will refuse to start a VM if its resources are already in use, so I assumed (incorrectly) that Proxmox would also have some kind of locking.
 
