Backup hangs indefinitely since upgrade from 13 to 14.1

Feb 4, 2024
Hello All,

Since this morning, backups have been hanging, mostly on VMs with GPU passthrough and on VMs that have had a GPU passed through at some point.

1764592640776.png
1764592655170.png
Are there any known limitations in 4.1 with that constellation?

We are still running Proxmox VE 8.4:
proxmox-ve: 8.4.0 (running kernel: 6.8.12-15-pve)
pve-manager: 8.4.14 (running version: 8.4.14/b502d23c55afcba1)
proxmox-kernel-helper: 8.1.4
proxmox-kernel-6.8: 6.8.12-15
proxmox-kernel-6.8.12-15-pve-signed: 6.8.12-15
proxmox-kernel-6.8.12-13-pve-signed: 6.8.12-13
proxmox-kernel-6.5.13-6-pve-signed: 6.5.13-6
proxmox-kernel-6.5: 6.5.13-6
proxmox-kernel-6.5.11-8-pve-signed: 6.5.11-8
ceph-fuse: 19.2.1-pve3
corosync: 3.1.9-pve1
criu: 3.17.1-2+deb12u2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libknet1: 1.30-pve2
libproxmox-acme-perl: 1.6.0
libproxmox-backup-qemu0: 1.5.2
libproxmox-rs-perl: 0.3.5
libpve-access-control: 8.2.2
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.1.2
libpve-cluster-perl: 8.1.2
libpve-common-perl: 8.3.4
libpve-guest-common-perl: 5.2.2
libpve-http-server-perl: 5.2.2
libpve-network-perl: 0.11.2
libpve-rs-perl: 0.9.4
libpve-storage-perl: 8.3.7
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.6.0-2
proxmox-backup-client: 3.4.6-1
proxmox-backup-file-restore: 3.4.6-1
proxmox-backup-restore-image: 0.7.0
proxmox-firewall: 0.7.1
proxmox-kernel-helper: 8.1.4
proxmox-mail-forward: 0.3.3
proxmox-mini-journalreader: 1.5
proxmox-offline-mirror-helper: 0.6.8
proxmox-widget-toolkit: 4.3.13
pve-cluster: 8.1.2
pve-container: 5.3.3
pve-docs: 8.4.1
pve-edk2-firmware: 4.2025.02-4~bpo12+1
pve-esxi-import-tools: 0.7.4
pve-firewall: 5.1.2
pve-firmware: 3.16-3
pve-ha-manager: 4.0.7
pve-i18n: 3.4.5
pve-qemu-kvm: 9.2.0-7+vitastor2
pve-xtermjs: 5.5.0-2
qemu-server: 8.4.3
smartmontools: 7.3-pve1
spiceterm: 3.3.1
swtpm: 0.8.0+pve1
vncterm: 1.8.1
zfsutils-linux: 2.2.8-pve1
 
Same here...

Probably the same issue as reported here:

 
Strangely, only a limited number of VMs are affected; on 5 out of 6 nodes the backups went through fine. So it does not seem to be a general problem with all VMs. None of the Linux-based VMs had a problem; it is mostly Win11 VMs with an EFI disk and GPU passthrough.
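To narrow down which guests match that pattern, something like the following could be run on each PVE node. It is only a rough sketch: the function name `has_passthrough_and_efi` is made up for illustration, and it assumes the VM configs live in the usual `/etc/pve/qemu-server/` directory. It also matches `hostpci` lines inside snapshot sections, so treat the output as a list of candidates, not a diagnosis.

```shell
#!/bin/sh
# Hypothetical helper: prints "yes" if a VM config file contains both a
# hostpci entry (PCI/GPU passthrough) and an efidisk0 entry, else "no".
# Note: lines inside snapshot sections of the config are matched as well.
has_passthrough_and_efi() {
    if grep -q '^hostpci[0-9]*:' "$1" && grep -q '^efidisk0:' "$1"; then
        echo yes
    else
        echo no
    fi
}

# Walk the local node's VM configs; silently skip if the directory is empty
# or does not exist (for example when run on a non-PVE machine).
for conf in /etc/pve/qemu-server/*.conf; do
    [ -e "$conf" ] || continue
    if [ "$(has_passthrough_and_efi "$conf")" = "yes" ]; then
        echo "candidate: $conf"
    fi
done
```

Comparing the printed candidates with the list of VMs whose backups hang might show whether passthrough plus EFI disk is really the common factor.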
 
I also tried the q35 machine type with version 9.0 instead of 9.2, as that seemed to have more success, but the backup still got stuck:




INFO: starting new backup job: vzdump 180 --mode snapshot --storage PBS1 --node pve1 --remove 0 --notification-mode auto --notes-template '{{guestname}}, {{node}}, {{vmid}}'
INFO: Starting Backup of VM 180 (qemu)
INFO: Backup started at 2025-12-01 13:51:06
INFO: status = stopped
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: VM Name: laser-pve1
INFO: include disk 'virtio0' 'vita_vm:vm-180-disk-1' 120G
INFO: include disk 'efidisk0' 'vita_vm:vm-180-disk-0' 528K
INFO: creating Proxmox Backup Server archive 'vm/180/2025-12-01T12:51:06Z'
INFO: starting kvm to execute backup task
[OSD 0] RDMA initialized successfully
[OSD 0] RDMA initialized successfully
INFO: started backup task '86b14375-286e-4316-84f1-04cafa93edaf'
INFO: efidisk0: dirty-bitmap status: created new
INFO: virtio0: dirty-bitmap status: created new
INFO: 0% (624.0 MiB of 120.0 GiB) in 3s, read: 208.0 MiB/s, write: 166.7 MiB/s
INFO: 1% (2.1 GiB of 120.0 GiB) in 6s, read: 505.3 MiB/s, write: 334.7 MiB/s
ERROR: interrupted by signal
INFO: aborting backup job
INFO: stopping kvm after backup task
trying to acquire lock...
OK
ERROR: Backup of VM 180 failed - interrupted by signal
INFO: Failed at 2025-12-01 13:55:10
ERROR: Backup job failed - interrupted by signal
INFO: notified via target `DMZ`
TASK ERROR: interrupted by signal
 
After restarting PBS, its RAM usage is very low again, but that does not solve the hanging backups. Strangely, it now also happens on VMs with QEMU version 9.0.
Revert to kernel 6.14.11-4-pve on the PBS with:

proxmox-boot-tool kernel pin 6.14.11-4-pve --next-boot

and backups will not hang anymore.

Currently this is the only temporary fix for this issue.
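For anyone applying this, a small check after the reboot may help confirm that the PBS host really came up on the older kernel. This is a minimal sketch, not an official procedure: `kernel_state` is a made-up helper name, and the version string is simply the one from the post above. It only reads state and changes nothing.

```shell
#!/bin/sh
# Kernel version named above as the working one on the PBS host.
WANTED="6.14.11-4-pve"

# Hypothetical helper: prints "match" if the two version strings are equal,
# otherwise "mismatch".
kernel_state() {
    if [ "$1" = "$2" ]; then
        echo match
    else
        echo mismatch
    fi
}

# Show the installed and pinned kernels, but only if the tool is present.
if command -v proxmox-boot-tool >/dev/null 2>&1; then
    proxmox-boot-tool kernel list
fi

RUNNING="$(uname -r)"
echo "running: $RUNNING, wanted: $WANTED, state: $(kernel_state "$RUNNING" "$WANTED")"
```

Note that `--next-boot` pins the kernel for the next boot only. Dropping that option keeps the pin until it is removed again with `proxmox-boot-tool kernel unpin`, which may be the more practical choice until a fixed kernel is available.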