Hey there!
since a few days i have problems to do a backup from a vm. Everyday at 10 p.m. it autostarts the backup job but it ends most of the time with an error. I tried snapshot and stopped mode and lzo or zstd compression - backup is stored via nfs to a NAS (TrueNAS). Do you guys know how to fix this ?
found on another forum that open files limit can cause this error. changed the size but still the same error
backup job errors:
logs from another test:
pveversion:
vm (runs debian buster):
since a few days i have problems to do a backup from a vm. Everyday at 10 p.m. it autostarts the backup job but it ends most of the time with an error. I tried snapshot and stopped mode and lzo or zstd compression - backup is stored via nfs to a NAS (TrueNAS). Do you guys know how to fix this ?
found on another forum that open files limit can cause this error. changed the size but still the same error
backup job errors:
Code:
INFO: starting new backup job: vzdump 102 --compress zstd --mode stop --mailnotification always --storage st1-proxmox-backup --quiet 1
INFO: Starting Backup of VM 102 (qemu)
INFO: Backup started at 2021-12-14 22:00:02
INFO: status = running
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: VM Name: xx
INFO: include disk 'virtio0' 'st1-proxmox-vms:102/vm-102-disk-0.qcow2' 50G
INFO: stopping virtual guest
INFO: creating vzdump archive '/mnt/pve/st1-proxmox-backup/dump/vzdump-qemu-102-2021_12_14-22_00_02.vma.zst'
INFO: starting kvm to execute backup task
INFO: started backup task 'e5d1d2d9-18f3-428e-aedc-776b25a2b55b'
INFO: resuming VM again after 16 seconds
INFO: 1% (781.3 MiB of 50.0 GiB) in 3s, read: 260.4 MiB/s, write: 107.9 MiB/s
INFO: 3% (1.9 GiB of 50.0 GiB) in 6s, read: 404.1 MiB/s, write: 27.5 MiB/s
INFO: 4% (2.3 GiB of 50.0 GiB) in 9s, read: 119.6 MiB/s, write: 113.5 MiB/s
INFO: 5% (2.7 GiB of 50.0 GiB) in 12s, read: 120.8 MiB/s, write: 117.6 MiB/s
INFO: 6% (3.5 GiB of 50.0 GiB) in 15s, read: 276.7 MiB/s, write: 74.6 MiB/s
INFO: 9% (4.6 GiB of 50.0 GiB) in 18s, read: 381.8 MiB/s, write: 26.5 MiB/s
INFO: 11% (5.7 GiB of 50.0 GiB) in 21s, read: 397.5 MiB/s, write: 29.5 MiB/s
INFO: 12% (6.2 GiB of 50.0 GiB) in 24s, read: 160.7 MiB/s, write: 151.7 MiB/s
INFO: 13% (6.6 GiB of 50.0 GiB) in 27s, read: 132.9 MiB/s, write: 129.1 MiB/s
INFO: 14% (7.0 GiB of 50.0 GiB) in 31s, read: 108.4 MiB/s, write: 103.0 MiB/s
INFO: 15% (7.6 GiB of 50.0 GiB) in 37s, read: 91.5 MiB/s, write: 87.4 MiB/s
INFO: 16% (8.1 GiB of 50.0 GiB) in 42s, read: 107.3 MiB/s, write: 89.0 MiB/s
INFO: 17% (8.5 GiB of 50.0 GiB) in 47s, read: 89.9 MiB/s, write: 84.4 MiB/s
INFO: 18% (9.0 GiB of 50.0 GiB) in 53s, read: 88.2 MiB/s, write: 84.8 MiB/s
INFO: 19% (9.6 GiB of 50.0 GiB) in 57s, read: 130.5 MiB/s, write: 96.8 MiB/s
INFO: 20% (10.1 GiB of 50.0 GiB) in 1m 1s, read: 132.3 MiB/s, write: 128.4 MiB/s
INFO: 21% (10.5 GiB of 50.0 GiB) in 1m 4s, read: 149.8 MiB/s, write: 136.8 MiB/s
INFO: 22% (11.1 GiB of 50.0 GiB) in 1m 8s, read: 145.0 MiB/s, write: 144.6 MiB/s
INFO: 23% (11.6 GiB of 50.0 GiB) in 1m 11s, read: 175.2 MiB/s, write: 150.6 MiB/s
INFO: 24% (12.1 GiB of 50.0 GiB) in 1m 14s, read: 174.0 MiB/s, write: 157.7 MiB/s
INFO: 25% (12.6 GiB of 50.0 GiB) in 1m 17s, read: 157.1 MiB/s, write: 154.9 MiB/s
INFO: 26% (13.1 GiB of 50.0 GiB) in 1m 21s, read: 141.2 MiB/s, write: 139.7 MiB/s
INFO: 27% (13.6 GiB of 50.0 GiB) in 1m 24s, read: 152.4 MiB/s, write: 111.9 MiB/s
INFO: 28% (14.1 GiB of 50.0 GiB) in 1m 36s, read: 46.5 MiB/s, write: 46.3 MiB/s
INFO: 29% (14.6 GiB of 50.0 GiB) in 1m 40s, read: 118.7 MiB/s, write: 118.5 MiB/s
INFO: 30% (15.0 GiB of 50.0 GiB) in 1m 44s, read: 120.5 MiB/s, write: 117.1 MiB/s
INFO: 31% (15.5 GiB of 50.0 GiB) in 1m 48s, read: 129.6 MiB/s, write: 105.1 MiB/s
INFO: 32% (16.4 GiB of 50.0 GiB) in 1m 51s, read: 297.2 MiB/s, write: 46.0 MiB/s
INFO: 35% (17.5 GiB of 50.0 GiB) in 1m 54s, read: 383.0 MiB/s, write: 27.0 MiB/s
INFO: 37% (18.8 GiB of 50.0 GiB) in 1m 57s, read: 437.4 MiB/s, write: 8.9 MiB/s
INFO: 40% (20.2 GiB of 50.0 GiB) in 2m, read: 466.4 MiB/s, write: 45.3 KiB/s
INFO: 43% (21.7 GiB of 50.0 GiB) in 2m 3s, read: 500.2 MiB/s, write: 184.0 KiB/s
INFO: 46% (23.1 GiB of 50.0 GiB) in 2m 6s, read: 502.7 MiB/s, write: 0 B/s
INFO: 48% (24.4 GiB of 50.0 GiB) in 2m 9s, read: 439.8 MiB/s, write: 16.2 MiB/s
INFO: 49% (24.8 GiB of 50.0 GiB) in 2m 12s, read: 117.2 MiB/s, write: 107.5 MiB/s
INFO: 50% (25.1 GiB of 50.0 GiB) in 2m 15s, read: 108.8 MiB/s, write: 100.6 MiB/s
INFO: 51% (25.5 GiB of 50.0 GiB) in 2m 19s, read: 113.3 MiB/s, write: 105.9 MiB/s
INFO: 52% (26.1 GiB of 50.0 GiB) in 2m 24s, read: 124.0 MiB/s, write: 104.7 MiB/s
INFO: 53% (26.5 GiB of 50.0 GiB) in 2m 27s, read: 140.2 MiB/s, write: 100.8 MiB/s
INFO: 54% (27.1 GiB of 50.0 GiB) in 2m 32s, read: 110.9 MiB/s, write: 107.7 MiB/s
INFO: 55% (27.6 GiB of 50.0 GiB) in 2m 37s, read: 103.7 MiB/s, write: 92.6 MiB/s
INFO: 56% (28.0 GiB of 50.0 GiB) in 2m 41s, read: 110.6 MiB/s, write: 104.1 MiB/s
INFO: 57% (28.6 GiB of 50.0 GiB) in 2m 45s, read: 138.7 MiB/s, write: 93.5 MiB/s
ERROR: VM 102 qmp command 'query-backup' failed - client closed connection
INFO: aborting backup job
ERROR: VM 102 not running
INFO: resuming VM again
trying to acquire lock...
OK
no lock found trying to remove 'backup' lock
INFO: restarting vm
ERROR: Backup of VM 102 failed - VM 102 not running
INFO: Failed at 2021-12-14 22:28:33
INFO: Backup job finished with errors
TASK ERROR: job errors
logs from another test:
Code:
Dec 15 10:20:21 hwdeka3 pvedaemon[1222123]: INFO: starting new backup job: vzdump 102 --compress zstd --remove 0 --mode snapshot --node hwdeka3 --storage st1-proxmox-backup
Dec 15 10:20:21 hwdeka3 pvedaemon[1222123]: INFO: Starting Backup of VM 102 (qemu)
Dec 15 10:20:21 hwdeka3 pvedaemon[3996390]: worker exit
Dec 15 10:20:21 hwdeka3 pvedaemon[13710]: worker 3996390 finished
Dec 15 10:20:21 hwdeka3 pvedaemon[13710]: starting 1 worker(s)
Dec 15 10:20:21 hwdeka3 pvedaemon[13710]: worker 1222139 started
Dec 15 10:20:50 hwdeka3 filebeat[12794]: 2021-12-15T10:20:50.484+0100#011INFO#011[monitoring]#011log/log.go:184#011Non-zero metrics in the last 30s#011{"monitoring": {"metrics": {"beat":{"cgroup":{"memory":{"mem":{"usage":{"bytes":106905>
Dec 15 10:21:20 hwdeka3 filebeat[12794]: 2021-12-15T10:21:20.483+0100#011INFO#011[monitoring]#011log/log.go:184#011Non-zero metrics in the last 30s#011{"monitoring": {"metrics": {"beat":{"cgroup":{"memory":{"mem":{"usage":{"bytes":61440}>
Dec 15 10:21:47 hwdeka3 postfix/qmgr[851724]: DCFF1608B4: from=<root@>, size=10706, nrcpt=1 (queue active)
Dec 15 10:21:47 hwdeka3 nslcd[12897]: [b860a5] <group/member="postfix"> ldap_result() failed: Can't contact LDAP server
Dec 15 10:21:47 hwdeka3 postfix/error[1222451]: DCFF1608B4: to=<>, relay=none, delay=393263, delays=393263/0.02/0/0.01, dsn=4.4.1, status=deferred (delivery temporarily suspended: connect to [10.10.>
Dec 15 10:21:50 hwdeka3 filebeat[12794]: 2021-12-15T10:21:50.484+0100#011INFO#011[monitoring]#011log/log.go:184#011Non-zero metrics in the last 30s#011{"monitoring": {"metrics": {"beat":{"cgroup":{"memory":{"mem":{"usage":{"bytes":-98304>
Dec 15 10:22:20 hwdeka3 filebeat[12794]: 2021-12-15T10:22:20.484+0100#011INFO#011[monitoring]#011log/log.go:184#011Non-zero metrics in the last 30s#011{"monitoring": {"metrics": {"beat":{"cgroup":{"memory":{"mem":{"usage":{"bytes":139264>
Dec 15 10:22:45 hwdeka3 icinga2[1213134]: [2021-12-15 10:22:45 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
Dec 15 10:22:50 hwdeka3 filebeat[12794]: 2021-12-15T10:22:50.484+0100#011INFO#011[monitoring]#011log/log.go:184#011Non-zero metrics in the last 30s#011{"monitoring": {"metrics": {"beat":{"cgroup":{"memory":{"mem":{"usage":{"bytes":-66764>
Dec 15 10:23:20 hwdeka3 filebeat[12794]: 2021-12-15T10:23:20.484+0100#011INFO#011[monitoring]#011log/log.go:184#011Non-zero metrics in the last 30s#011{"monitoring": {"metrics": {"beat":{"cgroup":{"memory":{"mem":{"usage":{"bytes":-12206>
Dec 15 10:23:25 hwdeka3 icinga2[1213134]: [2021-12-15 10:23:25 +0100] information/ConfigObject: Dumping program state to file '/var/lib/icinga2/icinga2.state'
Dec 15 10:23:39 hwdeka3 kernel: [2831711.135848] kvm[1221898]: segfault at 10 ip 0000557cd7339d10 sp 00007f2f25be16a8 error 4 in qemu-system-x86_64[557cd6e31000+545000]
Dec 15 10:23:39 hwdeka3 kernel: [2831711.135883] Code: ff ff ff e8 e2 7c af ff 48 8b 0d 8b a1 56 00 ba 21 00 00 00 be 01 00 00 00 48 8d 3d 7a 1c 1d 00 e8 a5 92 af ff e8 d0 a8 af ff <48> 83 7f 10 00 0f 95 c0 c3 0f 1f 80 00 00 00 00 48 8b >
Dec 15 10:23:39 hwdeka3 pvedaemon[1222123]: VM 102 qmp command failed - VM 102 qmp command 'query-backup' failed - client closed connection
Dec 15 10:23:39 hwdeka3 kernel: [2831711.184245] fwbr102i0: port 2(tap102i0) entered disabled state
Dec 15 10:23:39 hwdeka3 kernel: [2831711.184397] fwbr102i0: port 2(tap102i0) entered disabled state
Dec 15 10:23:39 hwdeka3 systemd[1]: 102.scope: Succeeded.
Dec 15 10:23:39 hwdeka3 systemd[1]: 102.scope: Consumed 2min 169ms CPU time.
Dec 15 10:23:39 hwdeka3 pvedaemon[1222123]: VM 102 qmp command failed - VM 102 not running
Dec 15 10:23:39 hwdeka3 pvedaemon[1222123]: VM 102 qmp command failed - VM 102 not running
Dec 15 10:23:40 hwdeka3 pvedaemon[1222123]: ERROR: Backup of VM 102 failed - VM 102 not running
Dec 15 10:23:40 hwdeka3 pvedaemon[1222123]: INFO: Backup job finished with errors
Dec 15 10:23:40 hwdeka3 pvedaemon[1222123]: job errors
Dec 15 10:23:40 hwdeka3 qmeventd[1222835]: Starting cleanup for 102
Dec 15 10:23:40 hwdeka3 kernel: [2831711.860561] fwbr102i0: port 1(fwln102i0) entered disabled state
Dec 15 10:23:40 hwdeka3 kernel: [2831711.860669] vmbr0: port 2(fwpr102p0) entered disabled state
Dec 15 10:23:40 hwdeka3 kernel: [2831711.862125] device fwln102i0 left promiscuous mode
Dec 15 10:23:40 hwdeka3 kernel: [2831711.862128] fwbr102i0: port 1(fwln102i0) entered disabled state
Dec 15 10:23:40 hwdeka3 kernel: [2831711.884456] device fwpr102p0 left promiscuous mode
Dec 15 10:23:40 hwdeka3 kernel: [2831711.884459] vmbr0: port 2(fwpr102p0) entered disabled state
Dec 15 10:23:40 hwdeka3 qmeventd[1222835]: Finished cleanup for 102
pveversion:
Code:
proxmox-ve: 7.1-1 (running kernel: 5.11.22-4-pve)
pve-manager: 7.1-7 (running version: 7.1-7/df5740ad)
pve-kernel-helper: 7.1-6
pve-kernel-5.13: 7.1-5
pve-kernel-5.11: 7.0-10
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.13.19-1-pve: 5.13.19-3
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.11.22-4-pve: 5.11.22-9
ceph-fuse: 15.2.14-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-5
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-14
libpve-guest-common-perl: 4.0-3
libpve-http-server-perl: 4.0-4
libpve-storage-perl: 7.0-15
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.9-4
lxcfs: 4.0.8-pve2
novnc-pve: 1.2.0-3
proxmox-backup-client: 2.1.2-1
proxmox-backup-file-restore: 2.1.2-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-4
pve-cluster: 7.1-2
pve-container: 4.1-2
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-3
pve-ha-manager: 3.3-1
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.0-3
pve-xtermjs: 4.12.0-1
qemu-server: 7.1-4
smartmontools: 7.2-1
spiceterm: 3.2-2
swtpm: 0.7.0~rc1+2
vncterm: 1.7-1
zfsutils-linux: 2.1.1-pve3
vm (runs debian buster):
Code:
qm status 102 --verbose
balloon: 12886999040
ballooninfo:
actual: 12886999040
free_mem: 10927874048
last_update: 1639561440
major_page_faults: 3117
max_mem: 12886999040
mem_swapped_in: 0
mem_swapped_out: 0
minor_page_faults: 3014327
total_mem: 12602740736
blockstat:
ide2:
account_failed: 0
account_invalid: 0
failed_flush_operations: 0
failed_rd_operations: 0
failed_unmap_operations: 0
failed_wr_operations: 0
flush_operations: 0
flush_total_time_ns: 0
idle_time_ns: 917614785791
invalid_flush_operations: 0
invalid_rd_operations: 0
invalid_unmap_operations: 0
invalid_wr_operations: 0
rd_bytes: 152
rd_merged: 0
rd_operations: 4
rd_total_time_ns: 80527
timed_stats:
unmap_bytes: 0
unmap_merged: 0
unmap_operations: 0
unmap_total_time_ns: 0
wr_bytes: 0
wr_highest_offset: 0
wr_merged: 0
wr_operations: 0
wr_total_time_ns: 0
virtio0:
account_failed: 1
account_invalid: 1
failed_flush_operations: 0
failed_rd_operations: 0
failed_unmap_operations: 0
failed_wr_operations: 0
flush_operations: 6666
flush_total_time_ns: 181085740
idle_time_ns: 1081979725
invalid_flush_operations: 0
invalid_rd_operations: 0
invalid_unmap_operations: 0
invalid_wr_operations: 0
rd_bytes: 635087872
rd_merged: 5393
rd_operations: 24446
rd_total_time_ns: 13961023686
timed_stats:
unmap_bytes: 0
unmap_merged: 0
unmap_operations: 0
unmap_total_time_ns: 0
wr_bytes: 293543936
wr_highest_offset: 44884332544
wr_merged: 17
wr_operations: 21619
wr_total_time_ns: 99755817049
cpus: 8
disk: 0
diskread: 635088024
diskwrite: 293543936
freemem: 10927874048
maxdisk: 53687091200
maxmem: 12886999040
mem: 1674866688
name: xx
netin: 7243210
netout: 163066439
nics:
tap102i0:
netin: 7243210
netout: 163066439
pid: 1223965
proxmox-support:
pbs-dirty-bitmap: 1
pbs-dirty-bitmap-migration: 1
pbs-dirty-bitmap-savevm: 1
pbs-library-version: 1.2.0 (6e555bc73a7dcfb4d0b47355b958afd101ad27b5)
pbs-masterkey: 1
query-bitmap-info: 1
qmpstatus: running
running-machine: pc-i440fx-6.1+pve0
running-qemu: 6.1.0
status: running
uptime: 931
vmid: 102
Last edited: