[SOLVED] Wrong command id?

christophe

Renowned Member
Mar 31, 2011
199
8
83
Hi,

Before a storage rebuild, I launched backups manually.

PVE 7.0-11, PBS 2.0-9.

A VM with 2 disks.

Got this :
INFO: starting new backup job: vzdump 31107 --storage Backup-pbs-r --remove 0 --mode snapshot --node px1
INFO: Starting Backup of VM 31107 (qemu)
INFO: Backup started at 2021-09-15 10:24:52
INFO: status = running
INFO: VM Name: *****-s-st4***7
INFO: include disk 'scsi0' 'Disques-VMs-2:31107/vm-31107-disk-0.raw' 150G
INFO: include disk 'scsi1' 'Disques-VMs-1:vm-31107-disk-1' 500G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/31107/2021-09-15T08:24:52Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task 'c10b2f5f-e249-4f6b-aaff-7124c4b13dc3'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: created new
INFO: scsi1: dirty-bitmap status: created new
INFO: 0% (360.0 MiB of 650.0 GiB) in 3s, read: 120.0 MiB/s, write: 9.3 MiB/s
INFO: 1% (6.5 GiB of 650.0 GiB) in 1m 35s, read: 68.7 MiB/s, write: 1.2 MiB/s
INFO: 2% (13.0 GiB of 650.0 GiB) in 3m 16s, read: 65.7 MiB/s, write: 0 B/s
...
INFO: 81% (526.7 GiB of 650.0 GiB) in 3h 8m 20s, read: 109.4 MiB/s, write: 61.7 MiB/s
INFO: 82% (533.5 GiB of 650.0 GiB) in 3h 9m 24s, read: 108.1 MiB/s, write: 16.2 MiB/s
INFO: 83% (539.9 GiB of 650.0 GiB) in 3h 10m 25s, read: 107.9 MiB/s, write: 67.4 MiB/s
INFO: 84% (546.6 GiB of 650.0 GiB) in 3h 11m 28s, read: 108.5 MiB/s, write: 7.0 MiB/s
INFO: 85% (552.9 GiB of 650.0 GiB) in 3h 12m 28s, read: 108.3 MiB/s, write: 25.2 MiB/s
ERROR: VM 31107 qmp command 'query-backup' failed - got wrong command id '2226:7052374' (expected 7748:22720)
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 31107 failed - VM 31107 qmp command 'query-backup' failed - got wrong command id '2226:7052374' (expected 7748:22720)
INFO: Failed at 2021-09-15 13:38:13
INFO: Backup job finished with errors
TASK ERROR: job errors

Where is my mistake??

Christophe.
 
Here :

[root@px1:~]# pveversion -v
proxmox-ve: 7.0-2 (running kernel: 5.4.78-2-pve)
pve-manager: 7.0-11 (running version: 7.0-11/63d82f4e)
pve-kernel-5.11: 7.0-6
pve-kernel-helper: 7.0-6
pve-kernel-5.4: 6.4-5
pve-kernel-5.11.22-3-pve: 5.11.22-7
pve-kernel-5.4.128-1-pve: 5.4.128-2
pve-kernel-5.4.78-2-pve: 5.4.78-2
ceph-fuse: 14.2.21-1
corosync: 3.1.2-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.21-pve1
libproxmox-acme-perl: 1.3.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.0-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-6
libpve-guest-common-perl: 4.0-2
libpve-http-server-perl: 4.0-2
libpve-storage-perl: 7.0-10
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.9-4
lxcfs: 4.0.8-pve2
novnc-pve: 1.2.0-3
proxmox-backup-client: 2.0.9-2
proxmox-backup-file-restore: 2.0.9-2
proxmox-mini-journalreader: 1.2-1
proxmox-widget-toolkit: 3.3-6
pve-cluster: 7.0-3
pve-container: 4.0-9
pve-docs: 7.0-5
pve-edk2-firmware: 3.20200531-1
pve-firewall: 4.2-2
pve-firmware: 3.2-4
pve-ha-manager: 3.3-1
pve-i18n: 2.4-1
pve-qemu-kvm: 6.0.0-3
pve-xtermjs: 4.12.0-1
qemu-server: 7.0-13
smartmontools: 7.2-pve2
spiceterm: 3.2-2
vncterm: 1.7-1
zfsutils-linux: 2.0.5-pve1
[root@px1:~]#

Christophe.
 
pve-qemu-kvm: 6.0.0-3
Can you try upgrading that to 6.0.0-4 (available on pve-no-subscription repository at time of writing)? That version includes a fix for a race inside QEMU that could trigger the error you encountered.
 
Hi,

Problem occured one more after a "backup now", and again last night with a scheduled backup :

INFO: Starting Backup of VM 31107 (qemu)
INFO: Backup started at 2021-09-15 22:23:12
INFO: status = running
INFO: VM Name: *****-s-st4***7
INFO: include disk 'scsi0' 'Disques-VMs-2:31107/vm-31107-disk-0.raw' 150G
INFO: include disk 'scsi1' 'Disques-VMs-1:vm-31107-disk-1' 500G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/31107/2021-09-15T20:23:12Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task 'd1c24ae1-a04f-49da-b9ef-4084eaa38272'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: existing bitmap was invalid and has been cleared
INFO: scsi1: dirty-bitmap status: existing bitmap was invalid and has been cleared
INFO: 0% (296.0 MiB of 650.0 GiB) in 3s, read: 98.7 MiB/s, write: 9.3 MiB/s
INFO: 1% (6.5 GiB of 650.0 GiB) in 1m 43s, read: 63.9 MiB/s, write: 1.1 MiB/s
...
INFO: 77% (500.6 GiB of 650.0 GiB) in 2h 18m 39s, read: 70.6 MiB/s, write: 862.3 KiB/s
INFO: 78% (507.5 GiB of 650.0 GiB) in 2h 19m 44s, read: 109.2 MiB/s, write: 59.8 MiB/s
ERROR: VM 31107 qmp command 'query-backup' failed - got wrong command id '2226:143880' (expected 30185:26557)
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 31107 failed - VM 31107 qmp command 'query-backup' failed - got wrong command id '2226:143880' (expected 30185:26557)
INFO: Failed at 2021-09-16 00:43:58

Other VMs are backuped without any error.
This VM has a disk on an nfs shared storage, underlying fs is btrfs on a Synolody DSM 7.0.
Other VMs have similar setup.

Christophe.
 
And a move disk fails in a similar manner :

create full clone of drive scsi0 (Disques-VMs-2:31107/vm-31107-disk-0.raw)
Wiping dos signature on /dev/Disques-VMs-1/vm-31107-disk-0.
Logical volume "vm-31107-disk-0" created.
drive mirror is starting for drive-scsi0
drive-scsi0: transferred 358.0 MiB of 150.0 GiB (0.23%) in 4s
drive-scsi0: transferred 754.0 MiB of 150.0 GiB (0.49%) in 8s
drive-scsi0: transferred 1.0 GiB of 150.0 GiB (0.68%) in 10s
drive-scsi0: transferred 1.5 GiB of 150.0 GiB (1.01%) in 16s
...
drive-scsi0: transferred 47.0 GiB of 150.0 GiB (31.31%) in 8m 35s
drive-scsi0: transferred 47.3 GiB of 150.0 GiB (31.53%) in 8m 38s
drive-scsi0: Cancelling block job
drive-scsi0: Done.
Logical volume "vm-31107-disk-0" successfully removed
TASK ERROR: storage migration failed: block job (mirror) error: VM 31107 qmp command 'query-block-jobs' failed - got wrong command id '5776:227' (expected 19437:539)

Christophe.
 
DId you fully restart the VMs after the upgrade, or live-migrated them to a PVE node that already had the new version? As else the VMs still run with the old QEMU...
 
A new bakup is in progress, so looking at the logs from PBS side.

Errors are arising since sept. 6 :

2021-09-06T21:29:03+02:00: starting new backup on datastore 'data': "vm/31107/2021-09-06T19:28:53Z"
2021-09-06T21:29:03+02:00: download 'index.json.blob' from previous backup.
2021-09-06T21:29:03+02:00: register chunks in 'drive-scsi0.img.fidx' from previous backup.
2021-09-06T21:29:03+02:00: download 'drive-scsi0.img.fidx' from previous backup.
2021-09-06T21:29:03+02:00: created new fixed index 1 ("vm/31107/2021-09-06T19:28:53Z/drive-scsi0.img.fidx")
2021-09-06T21:29:03+02:00: register chunks in 'drive-scsi1.img.fidx' from previous backup.
2021-09-06T21:29:03+02:00: download 'drive-scsi1.img.fidx' from previous backup.
2021-09-06T21:29:03+02:00: created new fixed index 2 ("vm/31107/2021-09-06T19:28:53Z/drive-scsi1.img.fidx")
2021-09-06T21:29:03+02:00: add blob "/mnt/datastore/data/vm/31107/2021-09-06T19:28:53Z/qemu-server.conf.blob" (1185 bytes, comp: 1185)
2021-09-06T23:30:16+02:00: backup ended and finish failed: backup ended but finished flag is not set.
2021-09-06T23:30:16+02:00: removing unfinished backup
2021-09-06T23:30:16+02:00: TASK ERROR: backup ended but finished flag is not set.
2021-09-06T23:30:17+02:00: POST /fixed_chunk: 400 Bad Request: backup already marked as finished.
2021-09-06T23:30:17+02:00: POST /fixed_chunk: 400 Bad Request: backup already marked as finished.
...


Christophe.
 
Yep, seems better now :

INFO: starting new backup job: vzdump 31107 --mode snapshot --storage Backup-pbs-r --remove 0 --node px2
INFO: Starting Backup of VM 31107 (qemu)
INFO: Backup started at 2021-09-16 10:09:48
INFO: status = running
INFO: VM Name: *****-s-st4***7
INFO: include disk 'scsi0' 'Disques-VMs-2:31107/vm-31107-disk-0.raw' 150G
INFO: include disk 'scsi1' 'Disques-VMs-1:vm-31107-disk-1' 500G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/31107/2021-09-16T08:09:48Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '6ae6917c-310d-46aa-b5ee-1866a1f72007'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: existing bitmap was invalid and has been cleared
INFO: scsi1: dirty-bitmap status: existing bitmap was invalid and has been cleared
INFO: 0% (244.0 MiB of 650.0 GiB) in 3s, read: 81.3 MiB/s, write: 10.7 MiB/s
INFO: 1% (6.6 GiB of 650.0 GiB) in 1m 37s, read: 69.1 MiB/s, write: 7.4 MiB/s
...
INFO: 98% (637.7 GiB of 650.0 GiB) in 2h 52m 12s, read: 108.8 MiB/s, write: 0 B/s
INFO: 99% (644.3 GiB of 650.0 GiB) in 2h 53m 14s, read: 109.0 MiB/s, write: 0 B/s
INFO: 100% (650.0 GiB of 650.0 GiB) in 2h 54m 9s, read: 106.1 MiB/s, write: 0 B/s
INFO: backup is sparse: 218.79 GiB (33%) total zero data
INFO: backup was done incrementally, reused 594.45 GiB (91%)
INFO: transferred 650.00 GiB in 10449 seconds (63.7 MiB/s)
INFO: Finished Backup of VM 31107 (02:54:20)
INFO: Backup finished at 2021-09-16 13:04:08
INFO: Backup job finished successfully
TASK OK

apt-get update && apt-get dist-upgrade did the job!

Thank you,

Christophe.