Backups not running properly

lfmwb

New Member
Jan 3, 2025
5
0
1
Bremerhaven, GER
Dear community,

we're running a setup with 3 nodes, and recently we've been having issues with our backups, they don't consistently complete as expected.

Some days everything runs fine, but on most days, multiple VMs fail with errors like this:
2044: 2025-05-06 23:40:05 ERROR: Backup of VM 2044 failed - VM 2044 qmp command 'backup' failed - got

Our PBS runs on dedicated hardware:
  • 2x E5-2603 v4
  • 32 GB RAM
  • 30 TB HDD (ZFS)
  • 10 GbE

Each of the 3 nodes is configured identically:
  • 2x EPYC 7763
  • 512 GB RAM
  • 30 TB SSD (CEPH)
  • 25 GbE

The backups from today shows this:

VMID Status Time Size Filename
1001 ok 1m 37s 64 GiB vm/1001/2025-05-06T16:30:06Z
2004 ok 3h 59m 46s 1.074 TiB vm/2004/2025-05-06T16:31:43Z
2007 err 3m 27s 0 B
2014 err 2m 22s 0 B
2015 err 3m 1s 0 B
2016 ok 47m 3s 6 GiB vm/2016/2025-05-06T20:40:19Z
2020 ok 38m 31s 32 GiB vm/2020/2025-05-06T21:27:22Z
2042 ok 4h 57m 7s 200 GiB vm/2042/2025-05-06T22:05:53Z
8012 ok 40m 19s 1.953 TiB vm/8012/2025-05-07T03:03:00Z
20003 err 4m 28s 0 B

I can’t find any clear reason for the errors. The only thing that stands out is that the PBS is using HDDs instead of SSDs. Still, I would expect it to just slow down the backups, not cause timeouts, or am I missing something?

Has anyone seen similar behavior or have any ideas what could be causing this?

Any help is greatly appreciated!

Best regards,
Lars Friedrich
 
Some days everything runs fine, but on most days, multiple VMs fail with errors like this:
2044: 2025-05-06 23:40:05 ERROR: Backup of VM 2044 failed - VM 2044 qmp command 'backup' failed - got
Please share the full backup task log... this message seems cut off

The backups from today shows this:

VMID Status Time Size Filename
1001 ok 1m 37s 64 GiB vm/1001/2025-05-06T16:30:06Z
2004 ok 3h 59m 46s 1.074 TiB vm/2004/2025-05-06T16:31:43Z
2007 err 3m 27s 0 B
2014 err 2m 22s 0 B
2015 err 3m 1s 0 B
2016 ok 47m 3s 6 GiB vm/2016/2025-05-06T20:40:19Z
2020 ok 38m 31s 32 GiB vm/2020/2025-05-06T21:27:22Z
2042 ok 4h 57m 7s 200 GiB vm/2042/2025-05-06T22:05:53Z
8012 ok 40m 19s 1.953 TiB vm/8012/2025-05-07T03:03:00Z
20003 err 4m 28s 0 B
Looks like an intermitten failure? Maybe a networking issue in between the PVE node and the PBS instance? Anything of interest in the systemd journal of either, the PVE or PBS instance?
 
Please share the full backup task log... this message seems cut off


Looks like an intermitten failure? Maybe a networking issue in between the PVE node and the PBS instance? Anything of interest in the systemd journal of either, the PVE or PBS instance?
Hey Chris,

thank you for the quick reply.

The full log is:

2044: 2025-05-06 23:37:22 INFO: Starting Backup of VM 2044 (qemu)
2044: 2025-05-06 23:37:22 INFO: status = running
2044: 2025-05-06 23:37:22 INFO: VM Name: XX
2044: 2025-05-06 23:37:22 INFO: include disk 'sata1' 'ceph-storage:vm-2044-disk-1' 192G
2044: 2025-05-06 23:37:22 INFO: include disk 'efidisk0' 'ceph-storage:vm-2044-disk-0' 128K
2044: 2025-05-06 23:37:22 INFO: backup mode: snapshot
2044: 2025-05-06 23:37:22 INFO: ionice priority: 7
2044: 2025-05-06 23:37:22 INFO: creating Proxmox Backup Server archive 'vm/2044/2025-05-06T21:37:22Z'
2044: 2025-05-06 23:37:22 INFO: issuing guest-agent 'fs-freeze' command
2044: 2025-05-06 23:39:31 INFO: issuing guest-agent 'fs-thaw' command
2044: 2025-05-06 23:39:31 ERROR: VM 2044 qmp command 'backup' failed - got timeout
2044: 2025-05-06 23:39:31 INFO: aborting backup job
2044: 2025-05-06 23:40:05 INFO: resuming VM again
2044: 2025-05-06 23:40:05 ERROR: Backup of VM 2044 failed - VM 2044 qmp command 'backup' failed - got timeout

Looks like an intermitten failure? Maybe a networking issue in between the PVE node and the PBS instance? Anything of interest in the systemd journal of either, the PVE or PBS instance?

I don't know, I can reach the server just fine from other devices via different protocols.
 
Is the backup storage/service provided by a guest? I've seen similar things happen when you back up a PBS guest to itself, for example.
 
Last edited:
Hi,

the referenced backup from VM2044 worked fine today.

Code:
2044: 2025-05-08 01:26:12 INFO: Starting Backup of VM 2044 (qemu)
2044: 2025-05-08 01:26:12 INFO: status = running
2044: 2025-05-08 01:26:12 INFO: VM Name: XX
2044: 2025-05-08 01:26:12 INFO: include disk 'sata1' 'ceph-storage:vm-2044-disk-1' 192G
2044: 2025-05-08 01:26:12 INFO: include disk 'efidisk0' 'ceph-storage:vm-2044-disk-0' 128K
2044: 2025-05-08 01:26:12 INFO: backup mode: snapshot
2044: 2025-05-08 01:26:12 INFO: ionice priority: 7
2044: 2025-05-08 01:26:12 INFO: creating Proxmox Backup Server archive 'vm/2044/2025-05-07T23:26:12Z'
2044: 2025-05-08 01:26:12 INFO: issuing guest-agent 'fs-freeze' command
2044: 2025-05-08 01:28:16 INFO: issuing guest-agent 'fs-thaw' command
2044: 2025-05-08 01:28:16 INFO: started backup task 'f7663109-1f60-43c7-b603-714282b9c14f'
2044: 2025-05-08 01:28:16 INFO: resuming VM again
2044: 2025-05-08 01:28:16 INFO: efidisk0: dirty-bitmap status: OK (drive clean)
2044: 2025-05-08 01:28:16 INFO: sata1: dirty-bitmap status: OK (10.5 GiB of 192.0 GiB dirty)
2044: 2025-05-08 01:28:16 INFO: using fast incremental mode (dirty-bitmap), 10.5 GiB dirty of 192.0 GiB total
2044: 2025-05-08 01:28:19 INFO:   3% (408.0 MiB of 10.5 GiB) in 3s, read: 136.0 MiB/s, write: 130.7 MiB/s
2044: 2025-05-08 01:29:11 INFO:   4% (476.0 MiB of 10.5 GiB) in 55s, read: 1.3 MiB/s, write: 1.3 MiB/s
2044: 2025-05-08 01:29:21 INFO:   5% (556.0 MiB of 10.5 GiB) in 1m 5s, read: 8.0 MiB/s, write: 8.0 MiB/s
2044: 2025-05-08 01:29:38 INFO:   6% (648.0 MiB of 10.5 GiB) in 1m 22s, read: 5.4 MiB/s, write: 5.4 MiB/s
2044: 2025-05-08 01:29:57 INFO:   7% (756.0 MiB of 10.5 GiB) in 1m 41s, read: 5.7 MiB/s, write: 5.7 MiB/s
2044: 2025-05-08 01:30:53 INFO:   8% (908.0 MiB of 10.5 GiB) in 2m 37s, read: 2.7 MiB/s, write: 2.7 MiB/s
2044: 2025-05-08 01:30:56 INFO:   9% (972.0 MiB of 10.5 GiB) in 2m 40s, read: 21.3 MiB/s, write: 21.3 MiB/s
2044: 2025-05-08 01:31:18 INFO:  10% (1.1 GiB of 10.5 GiB) in 3m 2s, read: 4.9 MiB/s, write: 4.7 MiB/s
2044: 2025-05-08 01:31:46 INFO:  11% (1.2 GiB of 10.5 GiB) in 3m 30s, read: 4.1 MiB/s, write: 4.0 MiB/s
2044: 2025-05-08 01:32:53 INFO:  12% (1.3 GiB of 10.5 GiB) in 4m 37s, read: 1.9 MiB/s, write: 1.9 MiB/s
2044: 2025-05-08 01:32:58 INFO:  13% (1.4 GiB of 10.5 GiB) in 4m 42s, read: 16.8 MiB/s, write: 16.8 MiB/s
2044: 2025-05-08 01:34:56 INFO:  14% (1.5 GiB of 10.5 GiB) in 6m 40s, read: 902.5 KiB/s, write: 833.1 KiB/s
2044: 2025-05-08 01:34:59 INFO:  17% (1.9 GiB of 10.5 GiB) in 6m 43s, read: 130.7 MiB/s, write: 129.3 MiB/s
2044: 2025-05-08 01:35:41 INFO:  18% (1.9 GiB of 10.5 GiB) in 7m 25s, read: 1.6 MiB/s, write: 1.6 MiB/s
2044: 2025-05-08 01:35:56 INFO:  19% (2.0 GiB of 10.5 GiB) in 7m 40s, read: 5.6 MiB/s, write: 5.6 MiB/s
2044: 2025-05-08 01:36:12 INFO:  20% (2.1 GiB of 10.5 GiB) in 7m 56s, read: 6.2 MiB/s, write: 6.2 MiB/s
2044: 2025-05-08 01:36:25 INFO:  21% (2.2 GiB of 10.5 GiB) in 8m 9s, read: 10.2 MiB/s, write: 9.8 MiB/s
2044: 2025-05-08 01:37:14 INFO:  22% (2.3 GiB of 10.5 GiB) in 8m 58s, read: 1.7 MiB/s, write: 1.7 MiB/s
2044: 2025-05-08 01:37:33 INFO:  23% (2.4 GiB of 10.5 GiB) in 9m 17s, read: 5.9 MiB/s, write: 5.9 MiB/s
2044: 2025-05-08 01:37:47 INFO:  24% (2.5 GiB of 10.5 GiB) in 9m 31s, read: 7.4 MiB/s, write: 7.4 MiB/s
2044: 2025-05-08 01:37:57 INFO:  25% (2.6 GiB of 10.5 GiB) in 9m 41s, read: 10.8 MiB/s, write: 10.8 MiB/s
2044: 2025-05-08 01:38:26 INFO:  26% (2.7 GiB of 10.5 GiB) in 10m 10s, read: 3.7 MiB/s, write: 3.7 MiB/s
2044: 2025-05-08 01:38:38 INFO:  27% (2.9 GiB of 10.5 GiB) in 10m 22s, read: 10.3 MiB/s, write: 10.3 MiB/s
2044: 2025-05-08 01:38:56 INFO:  28% (2.9 GiB of 10.5 GiB) in 10m 40s, read: 5.1 MiB/s, write: 5.1 MiB/s
2044: 2025-05-08 01:39:12 INFO:  29% (3.1 GiB of 10.5 GiB) in 10m 56s, read: 6.8 MiB/s, write: 6.8 MiB/s
2044: 2025-05-08 01:39:54 INFO:  30% (3.2 GiB of 10.5 GiB) in 11m 38s, read: 2.8 MiB/s, write: 2.8 MiB/s
2044: 2025-05-08 01:40:12 INFO:  31% (3.3 GiB of 10.5 GiB) in 11m 56s, read: 5.6 MiB/s, write: 5.6 MiB/s
2044: 2025-05-08 01:40:25 INFO:  32% (3.4 GiB of 10.5 GiB) in 12m 9s, read: 8.9 MiB/s, write: 8.9 MiB/s
2044: 2025-05-08 01:41:47 INFO:  33% (3.5 GiB of 10.5 GiB) in 13m 31s, read: 1.4 MiB/s, write: 1.3 MiB/s
2044: 2025-05-08 01:42:06 INFO:  34% (3.6 GiB of 10.5 GiB) in 13m 50s, read: 6.1 MiB/s, write: 6.1 MiB/s
2044: 2025-05-08 01:42:10 INFO:  35% (3.7 GiB of 10.5 GiB) in 13m 54s, read: 22.0 MiB/s, write: 21.0 MiB/s
2044: 2025-05-08 01:42:15 INFO:  36% (3.8 GiB of 10.5 GiB) in 13m 59s, read: 23.2 MiB/s, write: 23.2 MiB/s
2044: 2025-05-08 01:43:04 INFO:  37% (3.9 GiB of 10.5 GiB) in 14m 48s, read: 2.2 MiB/s, write: 2.2 MiB/s
2044: 2025-05-08 01:43:26 INFO:  38% (4.0 GiB of 10.5 GiB) in 15m 10s, read: 4.7 MiB/s, write: 4.7 MiB/s
2044: 2025-05-08 01:43:37 INFO:  39% (4.1 GiB of 10.5 GiB) in 15m 21s, read: 9.8 MiB/s, write: 9.5 MiB/s
2044: 2025-05-08 01:43:56 INFO:  40% (4.2 GiB of 10.5 GiB) in 15m 40s, read: 5.3 MiB/s, write: 5.1 MiB/s
2044: 2025-05-08 01:45:01 INFO:  41% (4.3 GiB of 10.5 GiB) in 16m 45s, read: 2.0 MiB/s, write: 2.0 MiB/s
2044: 2025-05-08 01:45:07 INFO:  42% (4.4 GiB of 10.5 GiB) in 16m 51s, read: 16.0 MiB/s, write: 16.0 MiB/s
2044: 2025-05-08 01:45:12 INFO:  43% (4.5 GiB of 10.5 GiB) in 16m 56s, read: 20.0 MiB/s, write: 20.0 MiB/s
2044: 2025-05-08 01:45:17 INFO:  44% (4.6 GiB of 10.5 GiB) in 17m 1s, read: 22.4 MiB/s, write: 22.4 MiB/s
2044: 2025-05-08 01:45:41 INFO:  45% (4.8 GiB of 10.5 GiB) in 17m 25s, read: 4.7 MiB/s, write: 4.7 MiB/s
2044: 2025-05-08 01:45:44 INFO:  46% (4.9 GiB of 10.5 GiB) in 17m 28s, read: 34.7 MiB/s, write: 34.7 MiB/s
2044: 2025-05-08 01:46:05 INFO:  47% (4.9 GiB of 10.5 GiB) in 17m 49s, read: 4.8 MiB/s, write: 4.8 MiB/s
2044: 2025-05-08 01:46:33 INFO:  48% (5.1 GiB of 10.5 GiB) in 18m 17s, read: 3.9 MiB/s, write: 3.9 MiB/s
2044: 2025-05-08 01:47:36 INFO:  49% (5.2 GiB of 10.5 GiB) in 19m 20s, read: 1.9 MiB/s, write: 1.8 MiB/s
2044: 2025-05-08 01:47:47 INFO:  50% (5.3 GiB of 10.5 GiB) in 19m 31s, read: 9.5 MiB/s, write: 9.5 MiB/s
2044: 2025-05-08 01:49:46 INFO:  54% (5.7 GiB of 10.5 GiB) in 21m 30s, read: 3.7 MiB/s, write: 3.6 MiB/s
2044: 2025-05-08 01:50:40 INFO:  55% (5.8 GiB of 10.5 GiB) in 22m 24s, read: 1.8 MiB/s, write: 1.7 MiB/s
2044: 2025-05-08 01:51:04 INFO:  56% (5.9 GiB of 10.5 GiB) in 22m 48s, read: 4.7 MiB/s, write: 4.7 MiB/s
2044: 2025-05-08 01:51:15 INFO:  57% (6.0 GiB of 10.5 GiB) in 22m 59s, read: 10.2 MiB/s, write: 9.1 MiB/s
2044: 2025-05-08 01:52:15 INFO:  58% (6.1 GiB of 10.5 GiB) in 23m 59s, read: 1.8 MiB/s, write: 1.8 MiB/s
2044: 2025-05-08 01:52:30 INFO:  59% (6.2 GiB of 10.5 GiB) in 24m 14s, read: 8.0 MiB/s, write: 8.0 MiB/s
2044: 2025-05-08 01:52:36 INFO:  60% (6.3 GiB of 10.5 GiB) in 24m 20s, read: 14.0 MiB/s, write: 13.3 MiB/s
2044: 2025-05-08 01:52:42 INFO:  61% (6.4 GiB of 10.5 GiB) in 24m 26s, read: 21.3 MiB/s, write: 21.3 MiB/s
2044: 2025-05-08 01:53:03 INFO:  62% (6.5 GiB of 10.5 GiB) in 24m 47s, read: 4.4 MiB/s, write: 4.2 MiB/s
2044: 2025-05-08 01:53:11 INFO:  63% (6.6 GiB of 10.5 GiB) in 24m 55s, read: 13.5 MiB/s, write: 13.5 MiB/s
2044: 2025-05-08 01:53:24 INFO:  64% (6.7 GiB of 10.5 GiB) in 25m 8s, read: 8.3 MiB/s, write: 8.3 MiB/s
2044: 2025-05-08 01:53:41 INFO:  65% (6.8 GiB of 10.5 GiB) in 25m 25s, read: 6.1 MiB/s, write: 6.1 MiB/s
2044: 2025-05-08 01:54:21 INFO:  66% (7.0 GiB of 10.5 GiB) in 26m 5s, read: 2.9 MiB/s, write: 2.9 MiB/s
2044: 2025-05-08 01:54:29 INFO:  67% (7.1 GiB of 10.5 GiB) in 26m 13s, read: 13.0 MiB/s, write: 13.0 MiB/s
2044: 2025-05-08 01:54:35 INFO:  68% (7.2 GiB of 10.5 GiB) in 26m 19s, read: 17.3 MiB/s, write: 17.3 MiB/s
2044: 2025-05-08 01:54:47 INFO:  69% (7.3 GiB of 10.5 GiB) in 26m 31s, read: 9.7 MiB/s, write: 9.7 MiB/s
2044: 2025-05-08 01:55:16 INFO:  70% (7.4 GiB of 10.5 GiB) in 27m, read: 4.4 MiB/s, write: 4.4 MiB/s
2044: 2025-05-08 01:55:24 INFO:  71% (7.5 GiB of 10.5 GiB) in 27m 8s, read: 11.0 MiB/s, write: 11.0 MiB/s
2044: 2025-05-08 01:55:50 INFO:  72% (7.6 GiB of 10.5 GiB) in 27m 34s, read: 4.0 MiB/s, write: 3.8 MiB/s
2044: 2025-05-08 01:56:03 INFO:  73% (7.7 GiB of 10.5 GiB) in 27m 47s, read: 8.3 MiB/s, write: 8.3 MiB/s
2044: 2025-05-08 01:57:11 INFO:  74% (7.8 GiB of 10.5 GiB) in 28m 55s, read: 1.6 MiB/s, write: 1.6 MiB/s
2044: 2025-05-08 01:57:18 INFO:  75% (7.9 GiB of 10.5 GiB) in 29m 2s, read: 17.7 MiB/s, write: 17.7 MiB/s
2044: 2025-05-08 01:57:26 INFO:  76% (8.0 GiB of 10.5 GiB) in 29m 10s, read: 11.5 MiB/s, write: 11.5 MiB/s
2044: 2025-05-08 01:57:35 INFO:  77% (8.1 GiB of 10.5 GiB) in 29m 19s, read: 14.2 MiB/s, write: 14.2 MiB/s
2044: 2025-05-08 01:57:48 INFO:  78% (8.2 GiB of 10.5 GiB) in 29m 32s, read: 6.8 MiB/s, write: 6.8 MiB/s
2044: 2025-05-08 01:57:58 INFO:  79% (8.3 GiB of 10.5 GiB) in 29m 42s, read: 11.2 MiB/s, write: 2.0 MiB/s
2044: 2025-05-08 01:58:07 INFO:  80% (8.4 GiB of 10.5 GiB) in 29m 51s, read: 11.6 MiB/s, write: 6.7 MiB/s
2044: 2025-05-08 01:58:53 INFO:  81% (8.5 GiB of 10.5 GiB) in 30m 37s, read: 2.3 MiB/s, write: 2.3 MiB/s
2044: 2025-05-08 01:59:09 INFO:  83% (8.8 GiB of 10.5 GiB) in 30m 53s, read: 15.2 MiB/s, write: 15.2 MiB/s
2044: 2025-05-08 01:59:22 INFO:  84% (8.9 GiB of 10.5 GiB) in 31m 6s, read: 8.0 MiB/s, write: 8.0 MiB/s
2044: 2025-05-08 01:59:49 INFO:  85% (9.0 GiB of 10.5 GiB) in 31m 33s, read: 3.1 MiB/s, write: 3.1 MiB/s
2044: 2025-05-08 01:59:55 INFO:  86% (9.1 GiB of 10.5 GiB) in 31m 39s, read: 18.0 MiB/s, write: 18.0 MiB/s
2044: 2025-05-08 02:00:02 INFO:  87% (9.2 GiB of 10.5 GiB) in 31m 46s, read: 14.9 MiB/s, write: 14.9 MiB/s
2044: 2025-05-08 02:00:06 INFO:  88% (9.3 GiB of 10.5 GiB) in 31m 50s, read: 27.0 MiB/s, write: 27.0 MiB/s
2044: 2025-05-08 02:00:28 INFO:  89% (9.4 GiB of 10.5 GiB) in 32m 12s, read: 4.9 MiB/s, write: 4.9 MiB/s
2044: 2025-05-08 02:00:44 INFO:  90% (9.5 GiB of 10.5 GiB) in 32m 28s, read: 7.8 MiB/s, write: 7.8 MiB/s
2044: 2025-05-08 02:00:51 INFO:  91% (9.6 GiB of 10.5 GiB) in 32m 35s, read: 13.1 MiB/s, write: 13.1 MiB/s
2044: 2025-05-08 02:01:29 INFO:  92% (9.7 GiB of 10.5 GiB) in 33m 13s, read: 2.8 MiB/s, write: 2.8 MiB/s
2044: 2025-05-08 02:01:54 INFO:  93% (9.8 GiB of 10.5 GiB) in 33m 38s, read: 4.5 MiB/s, write: 4.5 MiB/s
2044: 2025-05-08 02:02:09 INFO:  94% (9.9 GiB of 10.5 GiB) in 33m 53s, read: 6.9 MiB/s, write: 6.9 MiB/s
2044: 2025-05-08 02:02:20 INFO:  95% (10.0 GiB of 10.5 GiB) in 34m 4s, read: 11.6 MiB/s, write: 11.6 MiB/s
2044: 2025-05-08 02:02:58 INFO:  96% (10.1 GiB of 10.5 GiB) in 34m 42s, read: 2.7 MiB/s, write: 2.7 MiB/s
2044: 2025-05-08 02:03:01 INFO:  97% (10.2 GiB of 10.5 GiB) in 34m 45s, read: 37.3 MiB/s, write: 37.3 MiB/s
2044: 2025-05-08 02:03:15 INFO:  98% (10.3 GiB of 10.5 GiB) in 34m 59s, read: 6.6 MiB/s, write: 6.6 MiB/s
2044: 2025-05-08 02:03:32 INFO:  99% (10.4 GiB of 10.5 GiB) in 35m 16s, read: 6.4 MiB/s, write: 6.4 MiB/s
2044: 2025-05-08 02:04:05 INFO: 100% (10.5 GiB of 10.5 GiB) in 35m 49s, read: 3.0 MiB/s, write: 3.0 MiB/s
2044: 2025-05-08 02:05:24 INFO: backup was done incrementally, reused 181.71 GiB (94%)
2044: 2025-05-08 02:05:24 INFO: transferred 10.53 GiB in 2228 seconds (4.8 MiB/s)
2044: 2025-05-08 02:05:24 INFO: adding notes to backup
2044: 2025-05-08 02:05:25 INFO: prune older backups with retention: keep-daily=6, keep-monthly=3, keep-weekly=3, keep-yearly=1
2044: 2025-05-08 02:05:25 INFO: running 'proxmox-backup-client prune' for 'vm/2044'
2044: 2025-05-08 02:05:25 INFO: pruned 1 backup(s) not covered by keep-retention policy
2044: 2025-05-08 02:05:25 INFO: Finished Backup of VM 2044 (00:39:13)

From the table at the top of the threat only VM20003 didn't run as expected.

Regards,
Lars Friedrich