Backups failing for some VMs, while working on others. All have qemu-guest-agent installed.

ajtatum

New Member
Apr 20, 2023
Fairfax, VA
For reasons I can't figure out, when I run a backup of all my VMs and containers, some VMs fail and others work fine. All containers back up fine.

The only thing that's different, from what I can parse, is that VMs that were created in Proxmox with a new drive succeed. However, VMs that were imported from Windows Hyper-V fail, even though all the VMs are running Ubuntu 22.04.

Here's an example of a failed backup in the log for VM 102:
Code:
INFO: Starting Backup of VM 102 (qemu)
INFO: Backup started at 2023-05-10 09:00:08
INFO: status = running
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: VM Name: umbreon
INFO: include disk 'scsi0' 'local-2tb:vm-102-disk-0' 64G
INFO: stopping virtual guest
INFO: creating vzdump archive '/mnt/pve/USBBACKUP/dump/vzdump-qemu-102-2023_05_10-09_00_08.vma.zst'
INFO: starting kvm to execute backup task
INFO: restarting vm
INFO: guest is online again after 13 seconds
ERROR: Backup of VM 102 failed - start failed: org.freedesktop.DBus.Error.Disconnected: Connection is closed

Here's the log for a successful VM backup for VM 101:
Code:
INFO: Starting Backup of VM 101 (qemu)
INFO: Backup started at 2023-05-10 08:57:49
INFO: status = running
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: VM Name: easypanel
INFO: include disk 'scsi0' 'local-2tb:vm-101-disk-0' 128G
INFO: stopping virtual guest
INFO: creating vzdump archive '/mnt/pve/USBBACKUP/dump/vzdump-qemu-101-2023_05_10-08_57_49.vma.zst'
INFO: starting kvm to execute backup task
INFO: started backup task '938d066d-c325-4fed-8703-413077a90b7f'
INFO: resuming VM again after 10 seconds
INFO:   0% (1.2 GiB of 128.0 GiB) in 3s, read: 426.5 MiB/s, write: 391.1 MiB/s
INFO:   1% (2.3 GiB of 128.0 GiB) in 6s, read: 351.6 MiB/s, write: 339.8 MiB/s
INFO:   2% (3.2 GiB of 128.0 GiB) in 9s, read: 308.1 MiB/s, write: 306.7 MiB/s
INFO:   3% (4.2 GiB of 128.0 GiB) in 12s, read: 340.1 MiB/s, write: 338.7 MiB/s
INFO:   4% (5.2 GiB of 128.0 GiB) in 16s, read: 251.5 MiB/s, write: 250.3 MiB/s
INFO:   5% (6.5 GiB of 128.0 GiB) in 21s, read: 283.5 MiB/s, write: 282.3 MiB/s
INFO:   6% (7.9 GiB of 128.0 GiB) in 26s, read: 269.7 MiB/s, write: 266.7 MiB/s
INFO:   7% (9.1 GiB of 128.0 GiB) in 30s, read: 304.5 MiB/s, write: 304.1 MiB/s
INFO:   8% (10.3 GiB of 128.0 GiB) in 34s, read: 309.9 MiB/s, write: 308.9 MiB/s
INFO:   9% (11.8 GiB of 128.0 GiB) in 39s, read: 314.6 MiB/s, write: 310.2 MiB/s
INFO:  10% (12.9 GiB of 128.0 GiB) in 43s, read: 293.6 MiB/s, write: 288.3 MiB/s
INFO:  11% (14.4 GiB of 128.0 GiB) in 48s, read: 293.6 MiB/s, write: 281.2 MiB/s
INFO:  12% (15.5 GiB of 128.0 GiB) in 52s, read: 290.0 MiB/s, write: 283.5 MiB/s
INFO:  13% (16.7 GiB of 128.0 GiB) in 56s, read: 293.8 MiB/s, write: 292.7 MiB/s
INFO:  14% (18.1 GiB of 128.0 GiB) in 1m 1s, read: 295.7 MiB/s, write: 288.4 MiB/s
INFO:  15% (19.3 GiB of 128.0 GiB) in 1m 5s, read: 302.7 MiB/s, write: 301.3 MiB/s
INFO:  16% (20.5 GiB of 128.0 GiB) in 1m 9s, read: 316.3 MiB/s, write: 311.6 MiB/s
INFO:  17% (22.0 GiB of 128.0 GiB) in 1m 14s, read: 300.9 MiB/s, write: 299.9 MiB/s
INFO:  18% (23.4 GiB of 128.0 GiB) in 1m 18s, read: 355.2 MiB/s, write: 355.0 MiB/s
INFO:  19% (24.4 GiB of 128.0 GiB) in 1m 21s, read: 343.2 MiB/s, write: 338.8 MiB/s
INFO:  20% (25.9 GiB of 128.0 GiB) in 1m 25s, read: 387.9 MiB/s, write: 365.0 MiB/s
INFO:  21% (27.1 GiB of 128.0 GiB) in 1m 28s, read: 421.7 MiB/s, write: 380.8 MiB/s
INFO:  22% (28.2 GiB of 128.0 GiB) in 1m 32s, read: 282.8 MiB/s, write: 281.3 MiB/s
INFO:  23% (29.5 GiB of 128.0 GiB) in 1m 36s, read: 314.5 MiB/s, write: 313.1 MiB/s
INFO:  24% (31.0 GiB of 128.0 GiB) in 1m 40s, read: 382.4 MiB/s, write: 272.9 MiB/s
INFO:  25% (32.1 GiB of 128.0 GiB) in 1m 44s, read: 280.0 MiB/s, write: 234.9 MiB/s
INFO:  26% (33.3 GiB of 128.0 GiB) in 1m 48s, read: 325.3 MiB/s, write: 298.7 MiB/s
INFO:  34% (44.3 GiB of 128.0 GiB) in 1m 51s, read: 3.6 GiB/s, write: 145.7 MiB/s
INFO:  46% (59.0 GiB of 128.0 GiB) in 1m 54s, read: 4.9 GiB/s, write: 0 B/s
INFO:  57% (74.2 GiB of 128.0 GiB) in 1m 57s, read: 5.1 GiB/s, write: 0 B/s
INFO:  70% (90.3 GiB of 128.0 GiB) in 2m, read: 5.4 GiB/s, write: 0 B/s
INFO:  81% (104.1 GiB of 128.0 GiB) in 2m 3s, read: 4.6 GiB/s, write: 0 B/s
INFO:  94% (120.6 GiB of 128.0 GiB) in 2m 6s, read: 5.5 GiB/s, write: 0 B/s
INFO: 100% (128.0 GiB of 128.0 GiB) in 2m 8s, read: 3.7 GiB/s, write: 4.0 KiB/s
INFO: backup is sparse: 95.56 GiB (74%) total zero data
INFO: transferred 128.00 GiB in 128 seconds (1.0 GiB/s)
INFO: archive file size: 11.88GB
INFO: adding notes to backup
INFO: prune older backups with retention: keep-daily=7, keep-last=14, keep-monthly=2, keep-weekly=2
INFO: pruned 0 backup(s)
INFO: Finished Backup of VM 101 (00:02:19)
INFO: Backup finished at 2023-05-10 09:00:08

Here's the output of qm config 102:
Code:
agent: 1,fstrim_cloned_disks=1
boot: order=scsi0;net0
cores: 2
cpu: host,flags=+aes
memory: 4096
meta: creation-qemu=7.2.0,ctime=1680372939
name: umbreon
net0: virtio=08:92:04:EA:01:02,bridge=vmbr1,mtu=1
numa: 1
onboot: 1
ostype: l26
scsi0: local-2tb:vm-102-disk-0,cache=writeback,iothread=1,size=64G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=0384dbc7-61c5-41c8-bbac-23dde1785624
sockets: 2
startup: order=6
vmgenid: b81d121a-2b06-4110-93fe-4f7453f21b46

Here's the output of qm config 101:
Code:
agent: 1,fstrim_cloned_disks=1
boot: order=scsi0;net0
cores: 2
cpu: host,flags=+aes
memory: 12288
meta: creation-qemu=7.2.0,ctime=1680010904
name: easypanel
net0: virtio=08:92:04:EA:01:01,bridge=vmbr1,mtu=1
numa: 1
onboot: 1
ostype: l26
scsi0: local-2tb:vm-101-disk-0,cache=writeback,iothread=1,size=128G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=2b1e5cb6-f37b-419c-8996-5c8aceb9f9f0
sockets: 2
startup: order=5
vmgenid: c1320554-2119-4556-8886-4d91daa53f4b
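
To check whether anything in the two configs could explain the difference, here is a minimal Python sketch that compares them key by key. The helper names `parse_config` and `differing_keys` are made up for illustration, and the config text is pasted from the two outputs above rather than read from `qm` directly.

```python
# Config text copied from the `qm config` outputs above.
VM_102 = """
agent: 1,fstrim_cloned_disks=1
boot: order=scsi0;net0
cores: 2
cpu: host,flags=+aes
memory: 4096
meta: creation-qemu=7.2.0,ctime=1680372939
name: umbreon
net0: virtio=08:92:04:EA:01:02,bridge=vmbr1,mtu=1
numa: 1
onboot: 1
ostype: l26
scsi0: local-2tb:vm-102-disk-0,cache=writeback,iothread=1,size=64G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=0384dbc7-61c5-41c8-bbac-23dde1785624
sockets: 2
startup: order=6
vmgenid: b81d121a-2b06-4110-93fe-4f7453f21b46
"""

VM_101 = """
agent: 1,fstrim_cloned_disks=1
boot: order=scsi0;net0
cores: 2
cpu: host,flags=+aes
memory: 12288
meta: creation-qemu=7.2.0,ctime=1680010904
name: easypanel
net0: virtio=08:92:04:EA:01:01,bridge=vmbr1,mtu=1
numa: 1
onboot: 1
ostype: l26
scsi0: local-2tb:vm-101-disk-0,cache=writeback,iothread=1,size=128G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=2b1e5cb6-f37b-419c-8996-5c8aceb9f9f0
sockets: 2
startup: order=5
vmgenid: c1320554-2119-4556-8886-4d91daa53f4b
"""


def parse_config(text):
    """Parse 'key: value' lines into a dict, splitting on the first ': ' only."""
    config = {}
    for line in text.strip().splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            config[key.strip()] = value.strip()
    return config


def differing_keys(a, b):
    """Return the sorted keys whose values differ or that exist in only one config."""
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


failing, working = parse_config(VM_102), parse_config(VM_101)
for key in differing_keys(failing, working):
    print(f"{key}: 102={failing.get(key)!r} 101={working.get(key)!r}")
```

With the configs above, only memory, meta, name, net0, scsi0, smbios1, startup and vmgenid differ, and those are per-VM values (size, identity, boot order) rather than settings that change how the VM is started.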

For both VMs, I can successfully shut down and start them, and the guest agent runs just fine. I've tried backing up to different targets: the logs above are from an external USB drive, but I've also tried backing up to my Synology over NFS and I get the same errors.

Looking at the server summary during the backup process, the server load doesn't go above 4, there's roughly 100GB of RAM free, and CPU usage is ~7% with a spike to 10% here and there... so, I'm a little confused about what the issue could be.

The only thing that's consistent is the error message for the VMs that fail:

Code:
ERROR: Backup of VM ### failed - start failed: org.freedesktop.DBus.Error.Disconnected: Connection is closed
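
To see at a glance which guests fail in a given run, a rough Python sketch like the following could summarize a saved backup log. It only assumes the two line formats that appear in the logs above ("Finished Backup of VM ..." and "ERROR: Backup of VM ... failed - ..."). The function name `summarize` and the shortened sample log are made up for illustration.

```python
import re

# Per-guest result lines, in the format shown in the logs above.
FAILED = re.compile(r"^ERROR: Backup of VM (\d+) failed - (.*)$")
FINISHED = re.compile(r"^INFO: Finished Backup of VM (\d+) \((\d+:\d+:\d+)\)$")


def summarize(log_text):
    """Map each VM ID to ("ok", duration) or ("failed", reason)."""
    results = {}
    for line in log_text.splitlines():
        line = line.strip()
        match = FAILED.match(line)
        if match:
            results[int(match.group(1))] = ("failed", match.group(2))
            continue
        match = FINISHED.match(line)
        if match:
            results[int(match.group(1))] = ("ok", match.group(2))
    return results


# Shortened sample built from lines in the two logs above.
sample = """\
INFO: Starting Backup of VM 101 (qemu)
INFO: Finished Backup of VM 101 (00:02:19)
INFO: Starting Backup of VM 102 (qemu)
ERROR: Backup of VM 102 failed - start failed: org.freedesktop.DBus.Error.Disconnected: Connection is closed
"""

for vmid, (status, detail) in sorted(summarize(sample).items()):
    print(vmid, status, detail)
```

Running this over the logs of several nights would show whether the same VM IDs fail every time or whether the failures move around.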

Any ideas? If more information is needed, please let me know. I sincerely appreciate your help!

Thanks,
AJ
 
Interestingly, if I go to VM 102, which fails in the "mass" or "auto" backup, and go to Backup, I can successfully back up that machine individually.
 
Hi,
please post the output of pveversion -v and the contents of /var/log/syslog from around the time the issue happens.

Do you have the package udisks2 installed by chance? That has been known to cause issues in the past.

You could also monitor the dbus around the time the issue happens, with dbus-monitor --system --profile &> /tmp/dbuslogfile. But best to keep track of the size of the log file, because it might get large quickly if there is a lot going on on the dbus.
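
If watching the log file size by hand is inconvenient, a small wrapper can do it. This is only a sketch of one way to do that in Python, not something from the Proxmox tooling: the function name `capture_with_size_cap`, the polling interval and any size limit are arbitrary choices. It redirects both stdout and stderr to the file, like `&>` in the command above, and stops the command once the file reaches the limit.

```python
import os
import subprocess
import time

# The monitoring command suggested above.
DBUS_CMD = ["dbus-monitor", "--system", "--profile"]


def capture_with_size_cap(cmd, log_path, max_bytes, poll_seconds=1.0):
    """Run cmd with stdout and stderr written to log_path.

    The command is stopped once the file reaches max_bytes.
    Returns the final size of the log file in bytes.
    """
    with open(log_path, "wb") as log:
        proc = subprocess.Popen(cmd, stdout=log, stderr=subprocess.STDOUT)
        try:
            # Poll until the command exits by itself or the log hits the cap.
            while proc.poll() is None:
                if os.path.getsize(log_path) >= max_bytes:
                    break
                time.sleep(poll_seconds)
        finally:
            if proc.poll() is None:
                proc.terminate()
                try:
                    proc.wait(timeout=5)
                except subprocess.TimeoutExpired:
                    proc.kill()
                    proc.wait()
    return os.path.getsize(log_path)
```

For example, `capture_with_size_cap(DBUS_CMD, "/tmp/dbuslogfile", 50 * 1024 * 1024)` started shortly before the backup job would keep at most roughly 50 MiB of output. The cap is checked once per polling interval, so the file can overshoot it slightly.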
 
