I have a backup job set up in my datacenter that backs up 4 of my VMs to a Proxmox Backup Server. 3 of those 4 VMs are configured to pass through the same GPU (so they can't run in parallel). This causes a problem during backup: the backup of the first VM with that GPU succeeds, but the backups of the other VMs fail because, according to the logs, the GPU is still in use by the VM that was backed up first.
This seems like a bug to me. Why would the backup need the GPU or any other PCI or USB device? I am running the latest Proxmox 8.2.2, by the way.
Here's an excerpt from the backup log showing the transition from the first, successful VM to the second, failing VM (0000:03:00 is the shared GPU). Also note that the successful VM had to be killed with SIGKILL, which might contribute to the issue. I do not know why SIGTERM has no effect. When the VM is booted normally, I can shut down this Debian VM without issues.
INFO: backup is sparse: 496.21 GiB (96%) total zero data
INFO: backup was done incrementally, reused 512.00 GiB (100%)
INFO: transferred 512.00 GiB in 190 seconds (2.7 GiB/s)
INFO: stopping kvm after backup task
VM quit/powerdown failed - terminating now with SIGTERM
VM still running - terminating now with SIGKILL
INFO: adding notes to backup
INFO: prune older backups with retention: keep-weekly=3
INFO: running 'proxmox-backup-client prune' for 'vm/200'
INFO: pruned 1 backup(s) not covered by keep-retention policy
INFO: Finished Backup of VM 200 (00:03:31)
INFO: Backup finished at 2024-05-29 23:04:06
INFO: Starting Backup of VM 300 (qemu)
INFO: Backup started at 2024-05-29 23:04:06
INFO: status = stopped
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: VM Name: xxx
INFO: include disk 'scsi0' 'consumer-pool:vm-300-disk-0' 250G
INFO: include disk 'efidisk0' 'consumer-pool:vm-300-disk-1' 1M
INFO: include disk 'tpmstate0' 'consumer-pool:vm-300-disk-2' 4M
INFO: creating Proxmox Backup Server archive 'vm/300/2024-05-29T21:04:06Z'
INFO: starting kvm to execute backup task
swtpm_setup: Not overwriting existing state file.
kvm: -device vfio-pci,host=0000:03:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,multifunction=on: vfio 0000:03:00.0: failed to open /dev/vfio/73: Device or resource busy
stopping swtpm instance (pid 138346) due to QEMU startup error
ERROR: Backup of VM 300 failed - start failed: QEMU exited with code 1
INFO: Failed at 2024-05-29 23:04:08
Addition: If I run one of the three VMs with the GPU while the backup job is running, only the VM that does not have the GPU succeeds in creating a backup, which seems like a logical consequence.
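In case it helps anyone check for the same situation, here is a minimal Python sketch that lists which VMs reference the same passthrough device. It assumes the usual `hostpciN: <address>,...` lines in the VM config files under /etc/pve/qemu-server/. The function name and the sample config lines are made up for illustration; only the VM IDs 200 and 300 and the address 0000:03:00 come from my log, and VM 400 is a hypothetical VM without passthrough.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches config lines such as "hostpci0: 0000:03:00,pcie=1" and captures
# the first value (a PCI address, or something like "mapping=<name>").
HOSTPCI_RE = re.compile(r"^hostpci\d+:\s*([^,\s]+)", re.MULTILINE)


def shared_passthrough_devices(configs):
    """Return {device: [vmids]} for devices referenced by more than one VM.

    configs maps a VM ID to the text of its config file.
    """
    users = defaultdict(list)
    for vmid, text in configs.items():
        # Only look at the current config, not at snapshot sections,
        # which start with a "[name]" header further down the file.
        current = text.split("\n[", 1)[0]
        for device in HOSTPCI_RE.findall(current):
            users[device].append(vmid)
    return {dev: sorted(ids) for dev, ids in users.items() if len(ids) > 1}


# Illustrative input only: these config snippets are assumptions,
# not copies of my real VM configs.
example_configs = {
    200: "cores: 4\nhostpci0: 0000:03:00,pcie=1\n",
    300: "cores: 8\nhostpci0: 0000:03:00,pcie=1\n",
    400: "cores: 2\nmemory: 2048\n",
}
example_result = shared_passthrough_devices(example_configs)

if __name__ == "__main__":
    # On a Proxmox node, read the real configs instead (assumed path).
    conf_dir = Path("/etc/pve/qemu-server")
    real = {int(p.stem): p.read_text() for p in conf_dir.glob("*.conf")}
    for device, vmids in shared_passthrough_devices(real).items():
        print(f"{device} is used by VMs {vmids}")
```

Any device reported here is one that a stop-mode backup could trip over in the way shown in the log above, since the backup starts each stopped VM with its full device configuration.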