ProxMox backup job errors / qmp command 'query-backup' failed

AreUSirius

New Member
May 28, 2021
11
0
1
28
Hey there!

since a few days i have problems to do a backup from a vm. Everyday at 10 p.m. it autostarts the backup job but it ends most of the time with an error. I tried snapshot and stopped mode and lzo or zstd compression - backup is stored via nfs to a NAS (TrueNAS). Do you guys know how to fix this ?

found on another forum that open files limit can cause this error. changed the size but still the same error

backup job errors:

Code:
INFO: starting new backup job: vzdump 102 --compress zstd --mode stop --mailnotification always --storage st1-proxmox-backup --quiet 1
INFO: Starting Backup of VM 102 (qemu)
INFO: Backup started at 2021-12-14 22:00:02
INFO: status = running
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: VM Name: xx
INFO: include disk 'virtio0' 'st1-proxmox-vms:102/vm-102-disk-0.qcow2' 50G
INFO: stopping virtual guest
INFO: creating vzdump archive '/mnt/pve/st1-proxmox-backup/dump/vzdump-qemu-102-2021_12_14-22_00_02.vma.zst'
INFO: starting kvm to execute backup task
INFO: started backup task 'e5d1d2d9-18f3-428e-aedc-776b25a2b55b'
INFO: resuming VM again after 16 seconds
INFO:   1% (781.3 MiB of 50.0 GiB) in 3s, read: 260.4 MiB/s, write: 107.9 MiB/s
INFO:   3% (1.9 GiB of 50.0 GiB) in 6s, read: 404.1 MiB/s, write: 27.5 MiB/s
INFO:   4% (2.3 GiB of 50.0 GiB) in 9s, read: 119.6 MiB/s, write: 113.5 MiB/s
INFO:   5% (2.7 GiB of 50.0 GiB) in 12s, read: 120.8 MiB/s, write: 117.6 MiB/s
INFO:   6% (3.5 GiB of 50.0 GiB) in 15s, read: 276.7 MiB/s, write: 74.6 MiB/s
INFO:   9% (4.6 GiB of 50.0 GiB) in 18s, read: 381.8 MiB/s, write: 26.5 MiB/s
INFO:  11% (5.7 GiB of 50.0 GiB) in 21s, read: 397.5 MiB/s, write: 29.5 MiB/s
INFO:  12% (6.2 GiB of 50.0 GiB) in 24s, read: 160.7 MiB/s, write: 151.7 MiB/s
INFO:  13% (6.6 GiB of 50.0 GiB) in 27s, read: 132.9 MiB/s, write: 129.1 MiB/s
INFO:  14% (7.0 GiB of 50.0 GiB) in 31s, read: 108.4 MiB/s, write: 103.0 MiB/s
INFO:  15% (7.6 GiB of 50.0 GiB) in 37s, read: 91.5 MiB/s, write: 87.4 MiB/s
INFO:  16% (8.1 GiB of 50.0 GiB) in 42s, read: 107.3 MiB/s, write: 89.0 MiB/s
INFO:  17% (8.5 GiB of 50.0 GiB) in 47s, read: 89.9 MiB/s, write: 84.4 MiB/s
INFO:  18% (9.0 GiB of 50.0 GiB) in 53s, read: 88.2 MiB/s, write: 84.8 MiB/s
INFO:  19% (9.6 GiB of 50.0 GiB) in 57s, read: 130.5 MiB/s, write: 96.8 MiB/s
INFO:  20% (10.1 GiB of 50.0 GiB) in 1m 1s, read: 132.3 MiB/s, write: 128.4 MiB/s
INFO:  21% (10.5 GiB of 50.0 GiB) in 1m 4s, read: 149.8 MiB/s, write: 136.8 MiB/s
INFO:  22% (11.1 GiB of 50.0 GiB) in 1m 8s, read: 145.0 MiB/s, write: 144.6 MiB/s
INFO:  23% (11.6 GiB of 50.0 GiB) in 1m 11s, read: 175.2 MiB/s, write: 150.6 MiB/s
INFO:  24% (12.1 GiB of 50.0 GiB) in 1m 14s, read: 174.0 MiB/s, write: 157.7 MiB/s
INFO:  25% (12.6 GiB of 50.0 GiB) in 1m 17s, read: 157.1 MiB/s, write: 154.9 MiB/s
INFO:  26% (13.1 GiB of 50.0 GiB) in 1m 21s, read: 141.2 MiB/s, write: 139.7 MiB/s
INFO:  27% (13.6 GiB of 50.0 GiB) in 1m 24s, read: 152.4 MiB/s, write: 111.9 MiB/s
INFO:  28% (14.1 GiB of 50.0 GiB) in 1m 36s, read: 46.5 MiB/s, write: 46.3 MiB/s
INFO:  29% (14.6 GiB of 50.0 GiB) in 1m 40s, read: 118.7 MiB/s, write: 118.5 MiB/s
INFO:  30% (15.0 GiB of 50.0 GiB) in 1m 44s, read: 120.5 MiB/s, write: 117.1 MiB/s
INFO:  31% (15.5 GiB of 50.0 GiB) in 1m 48s, read: 129.6 MiB/s, write: 105.1 MiB/s
INFO:  32% (16.4 GiB of 50.0 GiB) in 1m 51s, read: 297.2 MiB/s, write: 46.0 MiB/s
INFO:  35% (17.5 GiB of 50.0 GiB) in 1m 54s, read: 383.0 MiB/s, write: 27.0 MiB/s
INFO:  37% (18.8 GiB of 50.0 GiB) in 1m 57s, read: 437.4 MiB/s, write: 8.9 MiB/s
INFO:  40% (20.2 GiB of 50.0 GiB) in 2m, read: 466.4 MiB/s, write: 45.3 KiB/s
INFO:  43% (21.7 GiB of 50.0 GiB) in 2m 3s, read: 500.2 MiB/s, write: 184.0 KiB/s
INFO:  46% (23.1 GiB of 50.0 GiB) in 2m 6s, read: 502.7 MiB/s, write: 0 B/s
INFO:  48% (24.4 GiB of 50.0 GiB) in 2m 9s, read: 439.8 MiB/s, write: 16.2 MiB/s
INFO:  49% (24.8 GiB of 50.0 GiB) in 2m 12s, read: 117.2 MiB/s, write: 107.5 MiB/s
INFO:  50% (25.1 GiB of 50.0 GiB) in 2m 15s, read: 108.8 MiB/s, write: 100.6 MiB/s
INFO:  51% (25.5 GiB of 50.0 GiB) in 2m 19s, read: 113.3 MiB/s, write: 105.9 MiB/s
INFO:  52% (26.1 GiB of 50.0 GiB) in 2m 24s, read: 124.0 MiB/s, write: 104.7 MiB/s
INFO:  53% (26.5 GiB of 50.0 GiB) in 2m 27s, read: 140.2 MiB/s, write: 100.8 MiB/s
INFO:  54% (27.1 GiB of 50.0 GiB) in 2m 32s, read: 110.9 MiB/s, write: 107.7 MiB/s
INFO:  55% (27.6 GiB of 50.0 GiB) in 2m 37s, read: 103.7 MiB/s, write: 92.6 MiB/s
INFO:  56% (28.0 GiB of 50.0 GiB) in 2m 41s, read: 110.6 MiB/s, write: 104.1 MiB/s
INFO:  57% (28.6 GiB of 50.0 GiB) in 2m 45s, read: 138.7 MiB/s, write: 93.5 MiB/s
ERROR: VM 102 qmp command 'query-backup' failed - client closed connection
INFO: aborting backup job
ERROR: VM 102 not running
INFO: resuming VM again
trying to acquire lock...
 OK
no lock found trying to remove 'backup'  lock
INFO: restarting vm
ERROR: Backup of VM 102 failed - VM 102 not running
INFO: Failed at 2021-12-14 22:28:33
INFO: Backup job finished with errors
TASK ERROR: job errors

logs from another test:

Code:
Dec 15 10:20:21 hwdeka3 pvedaemon[1222123]: INFO: starting new backup job: vzdump 102 --compress zstd --remove 0 --mode snapshot --node hwdeka3 --storage st1-proxmox-backup
Dec 15 10:20:21 hwdeka3 pvedaemon[1222123]: INFO: Starting Backup of VM 102 (qemu)
Dec 15 10:20:21 hwdeka3 pvedaemon[3996390]: worker exit
Dec 15 10:20:21 hwdeka3 pvedaemon[13710]: worker 3996390 finished
Dec 15 10:20:21 hwdeka3 pvedaemon[13710]: starting 1 worker(s)
Dec 15 10:20:21 hwdeka3 pvedaemon[13710]: worker 1222139 started
Dec 15 10:20:50 hwdeka3 filebeat[12794]: 2021-12-15T10:20:50.484+0100#011INFO#011[monitoring]#011log/log.go:184#011Non-zero metrics in the last 30s#011{"monitoring": {"metrics": {"beat":{"cgroup":{"memory":{"mem":{"usage":{"bytes":106905>
Dec 15 10:21:20 hwdeka3 filebeat[12794]: 2021-12-15T10:21:20.483+0100#011INFO#011[monitoring]#011log/log.go:184#011Non-zero metrics in the last 30s#011{"monitoring": {"metrics": {"beat":{"cgroup":{"memory":{"mem":{"usage":{"bytes":61440}>
Dec 15 10:21:47 hwdeka3 postfix/qmgr[851724]: DCFF1608B4: from=<root@>, size=10706, nrcpt=1 (queue active)
Dec 15 10:21:47 hwdeka3 nslcd[12897]: [b860a5] <group/member="postfix"> ldap_result() failed: Can't contact LDAP server
Dec 15 10:21:47 hwdeka3 postfix/error[1222451]: DCFF1608B4: to=<>, relay=none, delay=393263, delays=393263/0.02/0/0.01, dsn=4.4.1, status=deferred (delivery temporarily suspended: connect to [10.10.>
Dec 15 10:21:50 hwdeka3 filebeat[12794]: 2021-12-15T10:21:50.484+0100#011INFO#011[monitoring]#011log/log.go:184#011Non-zero metrics in the last 30s#011{"monitoring": {"metrics": {"beat":{"cgroup":{"memory":{"mem":{"usage":{"bytes":-98304>
Dec 15 10:22:20 hwdeka3 filebeat[12794]: 2021-12-15T10:22:20.484+0100#011INFO#011[monitoring]#011log/log.go:184#011Non-zero metrics in the last 30s#011{"monitoring": {"metrics": {"beat":{"cgroup":{"memory":{"mem":{"usage":{"bytes":139264>
Dec 15 10:22:45 hwdeka3 icinga2[1213134]: [2021-12-15 10:22:45 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
Dec 15 10:22:50 hwdeka3 filebeat[12794]: 2021-12-15T10:22:50.484+0100#011INFO#011[monitoring]#011log/log.go:184#011Non-zero metrics in the last 30s#011{"monitoring": {"metrics": {"beat":{"cgroup":{"memory":{"mem":{"usage":{"bytes":-66764>
Dec 15 10:23:20 hwdeka3 filebeat[12794]: 2021-12-15T10:23:20.484+0100#011INFO#011[monitoring]#011log/log.go:184#011Non-zero metrics in the last 30s#011{"monitoring": {"metrics": {"beat":{"cgroup":{"memory":{"mem":{"usage":{"bytes":-12206>
Dec 15 10:23:25 hwdeka3 icinga2[1213134]: [2021-12-15 10:23:25 +0100] information/ConfigObject: Dumping program state to file '/var/lib/icinga2/icinga2.state'
Dec 15 10:23:39 hwdeka3 kernel: [2831711.135848] kvm[1221898]: segfault at 10 ip 0000557cd7339d10 sp 00007f2f25be16a8 error 4 in qemu-system-x86_64[557cd6e31000+545000]
Dec 15 10:23:39 hwdeka3 kernel: [2831711.135883] Code: ff ff ff e8 e2 7c af ff 48 8b 0d 8b a1 56 00 ba 21 00 00 00 be 01 00 00 00 48 8d 3d 7a 1c 1d 00 e8 a5 92 af ff e8 d0 a8 af ff <48> 83 7f 10 00 0f 95 c0 c3 0f 1f 80 00 00 00 00 48 8b >
Dec 15 10:23:39 hwdeka3 pvedaemon[1222123]: VM 102 qmp command failed - VM 102 qmp command 'query-backup' failed - client closed connection
Dec 15 10:23:39 hwdeka3 kernel: [2831711.184245] fwbr102i0: port 2(tap102i0) entered disabled state
Dec 15 10:23:39 hwdeka3 kernel: [2831711.184397] fwbr102i0: port 2(tap102i0) entered disabled state
Dec 15 10:23:39 hwdeka3 systemd[1]: 102.scope: Succeeded.
Dec 15 10:23:39 hwdeka3 systemd[1]: 102.scope: Consumed 2min 169ms CPU time.
Dec 15 10:23:39 hwdeka3 pvedaemon[1222123]: VM 102 qmp command failed - VM 102 not running
Dec 15 10:23:39 hwdeka3 pvedaemon[1222123]: VM 102 qmp command failed - VM 102 not running
Dec 15 10:23:40 hwdeka3 pvedaemon[1222123]: ERROR: Backup of VM 102 failed - VM 102 not running
Dec 15 10:23:40 hwdeka3 pvedaemon[1222123]: INFO: Backup job finished with errors
Dec 15 10:23:40 hwdeka3 pvedaemon[1222123]: job errors
Dec 15 10:23:40 hwdeka3 qmeventd[1222835]: Starting cleanup for 102
Dec 15 10:23:40 hwdeka3 kernel: [2831711.860561] fwbr102i0: port 1(fwln102i0) entered disabled state
Dec 15 10:23:40 hwdeka3 kernel: [2831711.860669] vmbr0: port 2(fwpr102p0) entered disabled state
Dec 15 10:23:40 hwdeka3 kernel: [2831711.862125] device fwln102i0 left promiscuous mode
Dec 15 10:23:40 hwdeka3 kernel: [2831711.862128] fwbr102i0: port 1(fwln102i0) entered disabled state
Dec 15 10:23:40 hwdeka3 kernel: [2831711.884456] device fwpr102p0 left promiscuous mode
Dec 15 10:23:40 hwdeka3 kernel: [2831711.884459] vmbr0: port 2(fwpr102p0) entered disabled state
Dec 15 10:23:40 hwdeka3 qmeventd[1222835]: Finished cleanup for 102

pveversion:

Code:
proxmox-ve: 7.1-1 (running kernel: 5.11.22-4-pve)
pve-manager: 7.1-7 (running version: 7.1-7/df5740ad)
pve-kernel-helper: 7.1-6
pve-kernel-5.13: 7.1-5
pve-kernel-5.11: 7.0-10
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.13.19-1-pve: 5.13.19-3
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.11.22-4-pve: 5.11.22-9
ceph-fuse: 15.2.14-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-5
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-14
libpve-guest-common-perl: 4.0-3
libpve-http-server-perl: 4.0-4
libpve-storage-perl: 7.0-15
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.9-4
lxcfs: 4.0.8-pve2
novnc-pve: 1.2.0-3
proxmox-backup-client: 2.1.2-1
proxmox-backup-file-restore: 2.1.2-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-4
pve-cluster: 7.1-2
pve-container: 4.1-2
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-3
pve-ha-manager: 3.3-1
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.0-3
pve-xtermjs: 4.12.0-1
qemu-server: 7.1-4
smartmontools: 7.2-1
spiceterm: 3.2-2
swtpm: 0.7.0~rc1+2
vncterm: 1.7-1
zfsutils-linux: 2.1.1-pve3

vm (runs debian buster):

Code:
qm status 102 --verbose

balloon: 12886999040
ballooninfo:
        actual: 12886999040
        free_mem: 10927874048
        last_update: 1639561440
        major_page_faults: 3117
        max_mem: 12886999040
        mem_swapped_in: 0
        mem_swapped_out: 0
        minor_page_faults: 3014327
        total_mem: 12602740736
blockstat:
        ide2:
                account_failed: 0
                account_invalid: 0
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 0
                flush_total_time_ns: 0
                idle_time_ns: 917614785791
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 152
                rd_merged: 0
                rd_operations: 4
                rd_total_time_ns: 80527
                timed_stats:
                unmap_bytes: 0
                unmap_merged: 0
                unmap_operations: 0
                unmap_total_time_ns: 0
                wr_bytes: 0
                wr_highest_offset: 0
                wr_merged: 0
                wr_operations: 0
                wr_total_time_ns: 0
        virtio0:
                account_failed: 1
                account_invalid: 1
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 6666
                flush_total_time_ns: 181085740
                idle_time_ns: 1081979725
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 635087872
                rd_merged: 5393
                rd_operations: 24446
                rd_total_time_ns: 13961023686
                timed_stats:
                unmap_bytes: 0
                unmap_merged: 0
                unmap_operations: 0
                unmap_total_time_ns: 0
                wr_bytes: 293543936
                wr_highest_offset: 44884332544
                wr_merged: 17
                wr_operations: 21619
                wr_total_time_ns: 99755817049
cpus: 8
disk: 0
diskread: 635088024
diskwrite: 293543936
freemem: 10927874048
maxdisk: 53687091200
maxmem: 12886999040
mem: 1674866688
name: xx
netin: 7243210
netout: 163066439
nics:
        tap102i0:
                netin: 7243210
                netout: 163066439
pid: 1223965
proxmox-support:
        pbs-dirty-bitmap: 1
        pbs-dirty-bitmap-migration: 1
        pbs-dirty-bitmap-savevm: 1
        pbs-library-version: 1.2.0 (6e555bc73a7dcfb4d0b47355b958afd101ad27b5)
        pbs-masterkey: 1
        query-bitmap-info: 1
qmpstatus: running
running-machine: pc-i440fx-6.1+pve0
running-qemu: 6.1.0
status: running
uptime: 931
vmid: 102
 
Last edited:
It seems your VM crashes:
Code:
Dec 15 10:23:39 hwdeka3 kernel: [2831711.135848] kvm[1221898]: segfault at 10 ip 0000557cd7339d10 sp 00007f2f25be16a8 error 4 in qemu-system-x86_64[557cd6e31000+545000]

Is it always the same VM?
 
It seems your VM crashes:
Code:
Dec 15 10:23:39 hwdeka3 kernel: [2831711.135848] kvm[1221898]: segfault at 10 ip 0000557cd7339d10 sp 00007f2f25be16a8 error 4 in qemu-system-x86_64[557cd6e31000+545000]

Is it always the same VM?

yes its always the same vm. i got an information from another forum that moving the virtual disk to another storage and back can fix the problem. since that the last backup was successful.
 

About

The Proxmox community has been around for many years and offers help and support for Proxmox VE, Proxmox Backup Server, and Proxmox Mail Gateway.
We think our community is one of the best thanks to people like you!

Get your subscription!

The Proxmox team works very hard to make sure you are running the best software and getting stable updates and security enhancements, as well as quick enterprise support. Tens of thousands of happy customers have a Proxmox subscription. Get yours easily in our online shop.

Buy now!