One server always failing the backup

ednt

Active Member
Mar 16, 2017
96
7
28
Hi,

we are on the latest version (2.1-2) and latest kernel (5.3.19-2) and we make many backups to this PBS.
But one server from a Proxmox cluster 7.1-7 always fails.
Ok, it's a big server .

disk-0 100G
disk-1 512G
disk-2 512G
disk-3 512G
disk-4 512G
disk-5 512G

Disks are on CEPH.
Network is 40Gb

All other servers of this cluster are backuped fine.

The result of this server is always:

Code:
2021-12-06T01:14:48+01:00: starting new backup on datastore 'backup-41': "vm/1025701/2021-12-06T00:14:44Z"
2021-12-06T01:14:48+01:00: download 'index.json.blob' from previous backup.
2021-12-06T01:14:48+01:00: register chunks in 'drive-virtio0.img.fidx' from previous backup.
2021-12-06T01:14:48+01:00: download 'drive-virtio0.img.fidx' from previous backup.
2021-12-06T01:14:48+01:00: created new fixed index 1 ("vm/1025701/2021-12-06T00:14:44Z/drive-virtio0.img.fidx")
2021-12-06T01:15:10+01:00: register chunks in 'drive-virtio1.img.fidx' from previous backup.
2021-12-06T01:15:10+01:00: download 'drive-virtio1.img.fidx' from previous backup.
2021-12-06T01:15:42+01:00: created new fixed index 2 ("vm/1025701/2021-12-06T00:14:44Z/drive-virtio1.img.fidx")
2021-12-06T01:16:10+01:00: register chunks in 'drive-virtio2.img.fidx' from previous backup.
2021-12-06T01:16:10+01:00: download 'drive-virtio2.img.fidx' from previous backup.
2021-12-06T01:19:59+01:00: created new fixed index 3 ("vm/1025701/2021-12-06T00:14:44Z/drive-virtio2.img.fidx")
2021-12-06T01:20:29+01:00: register chunks in 'drive-virtio3.img.fidx' from previous backup.
2021-12-06T01:20:29+01:00: download 'drive-virtio3.img.fidx' from previous backup.
2021-12-06T01:20:30+01:00: created new fixed index 4 ("vm/1025701/2021-12-06T00:14:44Z/drive-virtio3.img.fidx")
2021-12-06T01:20:57+01:00: register chunks in 'drive-virtio4.img.fidx' from previous backup.
2021-12-06T01:20:57+01:00: download 'drive-virtio4.img.fidx' from previous backup.
2021-12-06T01:20:57+01:00: created new fixed index 5 ("vm/1025701/2021-12-06T00:14:44Z/drive-virtio4.img.fidx")
2021-12-06T01:20:57+01:00: register chunks in 'drive-virtio5.img.fidx' from previous backup.
2021-12-06T01:20:57+01:00: download 'drive-virtio5.img.fidx' from previous backup.
2021-12-06T01:20:57+01:00: created new fixed index 6 ("vm/1025701/2021-12-06T00:14:44Z/drive-virtio5.img.fidx")
2021-12-06T01:21:23+01:00: add blob "/mnt/datastore/data-1/backup-41/vm/1025701/2021-12-06T00:14:44Z/qemu-server.conf.blob" (497 bytes, comp: 497)
2021-12-06T01:21:23+01:00: add blob "/mnt/datastore/data-1/backup-41/vm/1025701/2021-12-06T00:14:44Z/fw.conf.blob" (103 bytes, comp: 103)
2021-12-06T01:21:23+01:00: backup ended and finish failed: backup ended but finished flag is not set.
2021-12-06T01:21:23+01:00: removing unfinished backup
2021-12-06T01:21:23+01:00: TASK ERROR: backup ended but finished flag is not set.

Up to now the 512GB disks are nearly empty. They are used as data directory for nextclouds.
3 of them are not in use, but will be start soon and on 2 of them are around 200MB of data

It is planned to add further 512GB disks.
But first the backup needs to be working.

Any ideas?
 
Last edited:

fiona

Proxmox Staff Member
Staff member
Aug 1, 2019
2,676
560
118
Hi,
please also share the corresponding backup log from the Proxmox VE side and the output of
Code:
qm config 1025701
qm status 1025701 --verbose
 

ednt

Active Member
Mar 16, 2017
96
7
28
Hi,

here we go:

Backup log:
Code:
INFO: Starting Backup of VM 1025701 (qemu)
INFO: Backup started at 2021-12-07 01:25:23
INFO: status = running
INFO: VM Name: ispconfig-10257
INFO: include disk 'virtio0' 'hdd-1:vm-1025701-disk-0' 100G
INFO: include disk 'virtio1' 'hdd-1:vm-1025701-disk-1' 512G
INFO: include disk 'virtio2' 'hdd-1:vm-1025701-disk-2' 512G
INFO: include disk 'virtio3' 'hdd-1:vm-1025701-disk-3' 512G
INFO: include disk 'virtio4' 'hdd-1:vm-1025701-disk-4' 512G
INFO: include disk 'virtio5' 'hdd-1:vm-1025701-disk-5' 512G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/1025701/2021-12-07T00:25:23Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
ERROR: VM 1025701 qmp command 'backup' failed - got timeout
INFO: aborting backup job
ERROR: VM 1025701 qmp command 'backup-cancel' failed - got timeout
INFO: resuming VM again
ERROR: Backup of VM 1025701 failed - VM 1025701 qmp command 'cont' failed - got timeout
INFO: Failed at 2021-12-07 01:38:16
INFO: Starting Backup of VM 1025702 (qemu)
INFO: Backup started at 2021-12-07 01:38:16
INFO: status = running
INFO: VM Name: fw001rz-10257
INFO: include disk 'scsi0' 'hdd-1:vm-1025702-disk-0' 16G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/1025702/2021-12-07T00:38:16Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '464924bf-f2af-4c59-a550-ab332a11cfff'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: OK (68.0 MiB of 16.0 GiB dirty)
INFO: using fast incremental mode (dirty-bitmap), 68.0 MiB dirty of 16.0 GiB total
INFO: 100% (68.0 MiB of 68.0 MiB) in 3s, read: 22.7 MiB/s, write: 21.3 MiB/s
INFO: backup was done incrementally, reused 15.94 GiB (99%)
INFO: transferred 68.00 MiB in 38 seconds (1.8 MiB/s)
INFO: Finished Backup of VM 1025702 (00:01:34)
INFO: Backup finished at 2021-12-07 01:39:50
INFO: Backup job finished with errors


Config:
Code:
agent: 1
boot: order=ide2;virtio0
cores: 4
description: ispconfig RF Computer  %0A192.168.249.122  %0A192.168.200.162  %0A45.148.216.122  %0Ahttps://ispconfig-10257.ednt.eu%3A8080
ide2: none,media=cdrom
memory: 16348
name: ispconfig-10257
net0: virtio=2A:90:70:47:2F:95,bridge=vmbr0,firewall=1
net1: virtio=4A:58:8A:81:61:F0,bridge=vmbr1,tag=30
net2: virtio=FE:5C:20:41:CE:FA,bridge=vmbr1,tag=2010
numa: 0
onboot: 1
ostype: l26
scsihw: virtio-scsi-pci
smbios1: uuid=e9b4841c-5e98-492f-9d05-9be47879d14a
sockets: 2
virtio0: hdd-1:vm-1025701-disk-0,cache=writethrough,size=100G
virtio1: hdd-1:vm-1025701-disk-1,cache=writethrough,size=512G
virtio2: hdd-1:vm-1025701-disk-2,cache=writethrough,size=512G
virtio3: hdd-1:vm-1025701-disk-3,cache=writethrough,size=512G
virtio4: hdd-1:vm-1025701-disk-4,cache=writethrough,size=512G
virtio5: hdd-1:vm-1025701-disk-5,cache=writethrough,size=512G
vmgenid: 7ef46e4e-bd92-4102-8d78-63c67a87d28

Status:
Code:
balloon: 17142120448
ballooninfo:
        actual: 17142120448
        free_mem: 1258987520
        last_update: 1638861988
        major_page_faults: 13643
        max_mem: 17142120448
        mem_swapped_in: 0
        mem_swapped_out: 0
        minor_page_faults: 6912206081
        total_mem: 16783081472
blockstat:
        ide2:
                account_failed: 0
                account_invalid: 0
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 0
                flush_total_time_ns: 0
                idle_time_ns: 5151877831603555
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 608
                rd_merged: 0
                rd_operations: 16
                rd_total_time_ns: 880722
                timed_stats:
                unmap_bytes: 0
                unmap_merged: 0
                unmap_operations: 0
                unmap_total_time_ns: 0
                wr_bytes: 0
                wr_highest_offset: 0
                wr_merged: 0
                wr_operations: 0
                wr_total_time_ns: 0
        virtio0:
                account_failed: 1
                account_invalid: 1
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 0
                flush_total_time_ns: 0
                idle_time_ns: 2073964910
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 6307854336
                rd_merged: 52330
                rd_operations: 302901
                rd_total_time_ns: 1382167463214
                timed_stats:
                unmap_bytes: 0
                unmap_merged: 0
                unmap_operations: 0
                unmap_total_time_ns: 0
                wr_bytes: 691474313216
                wr_highest_offset: 90195365888
                wr_merged: 126093
                wr_operations: 30955088
                wr_total_time_ns: 2386140102787542
        virtio1:
                account_failed: 1
                account_invalid: 1
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 0
                flush_total_time_ns: 0
                idle_time_ns: 399911476088
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 237410304
                rd_merged: 581
                rd_operations: 8538
                rd_total_time_ns: 26033412921
                timed_stats:
                unmap_bytes: 0
                unmap_merged: 0
                unmap_operations: 0
                unmap_total_time_ns: 0
                wr_bytes: 10275365888
                wr_highest_offset: 549755813888
                wr_merged: 8669
                wr_operations: 78719
                wr_total_time_ns: 15338806058970
        virtio2:
                account_failed: 1
                account_invalid: 1
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 0
                flush_total_time_ns: 0
                idle_time_ns: 1622031284502
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 134835712
                rd_merged: 392
                rd_operations: 5187
                rd_total_time_ns: 14593710610
                timed_stats:
                unmap_bytes: 0
                unmap_merged: 0
                unmap_operations: 0
                unmap_total_time_ns: 0
                wr_bytes: 15757280256
                wr_highest_offset: 549755813888
                wr_merged: 11670
                wr_operations: 44848
                wr_total_time_ns: 10093889005498
        virtio3:
                account_failed: 1
                account_invalid: 1
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 0
                flush_total_time_ns: 0
                idle_time_ns: 1056532381317
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 130866688
                rd_merged: 375
                rd_operations: 5075
                rd_total_time_ns: 8896805246
                timed_stats:
                unmap_bytes: 0
                unmap_merged: 0
                unmap_operations: 0
                unmap_total_time_ns: 0
                wr_bytes: 9917043712
                wr_highest_offset: 549755813888
                wr_merged: 8332
                wr_operations: 29197
                wr_total_time_ns: 7074804820331
        virtio4:
                account_failed: 1
                account_invalid: 1
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 0
                flush_total_time_ns: 0
                idle_time_ns: 25139331844851
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 130853376
                rd_merged: 364
                rd_operations: 4948
                rd_total_time_ns: 11844432165
                timed_stats:
                unmap_bytes: 0
                unmap_merged: 0
                unmap_operations: 0
                unmap_total_time_ns: 0
                wr_bytes: 9855333376
                wr_highest_offset: 549755813888
                wr_merged: 8387
                wr_operations: 24188
                wr_total_time_ns: 6475066593945
        virtio5:
                account_failed: 1
                account_invalid: 1
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                flush_operations: 0
                flush_total_time_ns: 0
                idle_time_ns: 25139249818753
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                rd_bytes: 129505792
                rd_merged: 335
                rd_operations: 4873
                rd_total_time_ns: 6079870294
                timed_stats:
                unmap_bytes: 0
                unmap_merged: 0
                unmap_operations: 0
                unmap_total_time_ns: 0
                wr_bytes: 9818969088
                wr_highest_offset: 549755813888
                wr_merged: 8293
                wr_operations: 23064
                wr_total_time_ns: 6161838283013
cpus: 8
disk: 0
diskread: 7071326816
diskwrite: 747098305536
freemem: 1258987520
maxdisk: 107374182400
maxmem: 17142120448
mem: 15524093952
name: ispconfig-10257
netin: 20227465408
netout: 7871185988
nics:
        tap1025701i0:
                netin: 1926393925
                netout: 5082169078
        tap1025701i1:
                netin: 18295711176
                netout: 2785361926
        tap1025701i2:
                netin: 5360307
                netout: 3654984
pid: 3720344
proxmox-support:
        pbs-dirty-bitmap: 1
        pbs-dirty-bitmap-migration: 1
        pbs-dirty-bitmap-savevm: 1
        pbs-library-version: 1.2.0 (6e555bc73a7dcfb4d0b47355b958afd101ad27b5)
        pbs-masterkey: 1
        query-bitmap-info: 1
qmpstatus: running
running-machine: pc-i440fx-6.0+pve0
running-qemu: 6.0.0
status: running
uptime: 5160614
vmid: 1025701

And the qemu guest agent is installed:
Code:
ii  qemu-guest-agent 1:3.1+dfsg-8+deb10u8 amd64        Guest-side qemu-system agent
 
Last edited:

fiona

Proxmox Staff Member
Staff member
Aug 1, 2019
2,676
560
118
Code:
ERROR: VM 1025701 qmp command 'backup' failed - got timeout
INFO: aborting backup job
The reason the backup fails is here, but the question is why the qmp command for setting up the backup job times out. There's two minutes of timeout to connect to the backup server. Does the backup also fail if you trigger it manually?

Code:
running-qemu: 6.0.0
Not sure at all if it would help, but you could try and start/stop or migrate the VM, for it to use QEMU 6.1.0 (assuming that's installed).
 

About

The Proxmox community has been around for many years and offers help and support for Proxmox VE, Proxmox Backup Server, and Proxmox Mail Gateway.
We think our community is one of the best thanks to people like you!

Get your subscription!

The Proxmox team works very hard to make sure you are running the best software and getting stable updates and security enhancements, as well as quick enterprise support. Tens of thousands of happy customers have a Proxmox subscription. Get your own in 60 seconds.

Buy now!