Backups Fail with Extremely Slow Speeds

greengolftee87

New Member
Feb 6, 2025
So I've been having an "ongoing" issue with PBS. I'm not virtualizing PBS within PVE; it is a separate machine.
Backups (especially of large VMs) usually fail with the error ERROR: backup write data failed: command error: protocol canceled
Smaller VMs I can retry and they will eventually succeed, but the larger ones never do. I thought it might be network related, but even during the backup I can run iperf3 and get near-gigabit speeds, which is the link speed.

Code:
INFO: starting new backup job: vzdump 107 --notes-template '{{guestname}}' --all 0 --node proxmox --mode snapshot --mailnotification always --storage PBS
INFO: Starting Backup of VM 107 (qemu)
INFO: Backup started at 2025-11-11 15:39:18
INFO: status = stopped
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: VM Name: UbuntuWebServices
INFO: include disk 'scsi0' 'local-lvm:vm-107-disk-0' 40G
INFO: include disk 'scsi1' 'local-lvm:vm-107-disk-1' 300G
INFO: creating Proxmox Backup Server archive 'vm/107/2025-11-11T20:39:18Z'
INFO: starting kvm to execute backup task
INFO: started backup task '97ea9034-67bc-41b6-85e3-ba3c81cdc1fc'
INFO: scsi0: dirty-bitmap status: created new
INFO: scsi1: dirty-bitmap status: created new
INFO:   0% (344.0 MiB of 340.0 GiB) in 3s, read: 114.7 MiB/s, write: 105.3 MiB/s
INFO:   1% (3.5 GiB of 340.0 GiB) in 34s, read: 104.5 MiB/s, write: 103.6 MiB/s
INFO:   1% (6.3 GiB of 340.0 GiB) in 16m 39s, read: 3.0 MiB/s, write: 3.0 MiB/s
ERROR: backup write data failed: command error: protocol canceled
INFO: aborting backup job
INFO: stopping kvm after backup task
ERROR: Backup of VM 107 failed - backup write data failed: command error: protocol canceled
INFO: Failed at 2025-11-11 15:55:59
INFO: Backup job finished with errors
INFO: notified via target `mail-to-root`
TASK ERROR: job errors

As you can see, this VM is 340 GB and the backup doesn't make it very far. Most Google results for this error point to something network related, but I haven't made any network changes from the defaults on PVE or PBS, and I have no other network issues. I've kind of run out of ideas.

The logs on PBS don't say much.
Code:
Nov 11 15:56:43 pbs proxmox-backup-proxy[760]: backup failed: connection error: connection reset
Nov 11 15:56:43 pbs proxmox-backup-proxy[760]: removing failed backup
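
More context might show up in the proxy's journal around the failure; something along these lines, with the time window taken from the log above:
Code:
# Run on the PBS host; shows everything the backup proxy logged around the failure.
journalctl -u proxmox-backup-proxy --since "2025-11-11 15:30" --until "2025-11-11 16:00"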

EDIT:
I should add that the backup disk seems fine.
Code:
root@pbs:~# hdparm -Tt /dev/sdb

/dev/sdb:
 Timing cached reads:   11842 MB in  1.99 seconds = 5941.73 MB/sec
 Timing buffered disk reads: 542 MB in  3.00 seconds = 180.61 MB/sec
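
Worth noting: hdparm only measures reads, while a backup hammers the datastore with writes. A quick write test on the datastore itself would look roughly like the sketch below; /mnt/datastore is a placeholder path, adjust it to the real datastore mount before running.
Code:
# Placeholder path -- point --directory at the actual datastore mount.
# Writes 2 GiB in 4 MiB blocks with direct I/O, roughly like PBS chunk writes.
fio --name=chunk-write --directory=/mnt/datastore --rw=randwrite \
    --bs=4M --size=2G --direct=1 --ioengine=libaio --group_reporting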
 
Hi,
please share the corresponding backup task log from the PBS host.
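
For reference, the task logs can also be pulled on the PBS command line, along these lines (the UPID placeholder is whatever task ID the list shows):
Code:
# List recent tasks on the PBS host, then print the full log of one by its UPID.
proxmox-backup-manager task list
proxmox-backup-manager task log <UPID>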
 
Hi,
please share the corresponding backup task log from the PBS host.
Code:
Nov 11 15:39:18 proxmox pvedaemon[1645579]: <root@pam> starting task UPID:proxmox:001993CA:0CEE881D:69139EF6:vzdump:107:root@pam:
Nov 11 15:39:18 proxmox pvedaemon[1676234]: INFO: starting new backup job: vzdump 107 --notes-template '{{guestname}}' --all 0 --node proxmox --mode snapshot --mailnotification always --storage PBS
Nov 11 15:39:18 proxmox pvedaemon[1676234]: INFO: Starting Backup of VM 107 (qemu)
Nov 11 15:39:19 proxmox systemd[1]: Started 107.scope.
Nov 11 15:39:19 proxmox kernel: tap107i0: entered promiscuous mode
Nov 11 15:39:19 proxmox kernel: vmbr1: port 2(fwpr107p0) entered blocking state
Nov 11 15:39:19 proxmox kernel: vmbr1: port 2(fwpr107p0) entered disabled state
Nov 11 15:39:19 proxmox kernel: fwpr107p0: entered allmulticast mode
Nov 11 15:39:19 proxmox kernel: fwpr107p0: entered promiscuous mode
Nov 11 15:39:19 proxmox kernel: vmbr1: port 2(fwpr107p0) entered blocking state
Nov 11 15:39:19 proxmox kernel: vmbr1: port 2(fwpr107p0) entered forwarding state
Nov 11 15:39:19 proxmox kernel: fwbr107i0: port 1(fwln107i0) entered blocking state
Nov 11 15:39:19 proxmox kernel: fwbr107i0: port 1(fwln107i0) entered disabled state
Nov 11 15:39:19 proxmox kernel: fwln107i0: entered allmulticast mode
Nov 11 15:39:19 proxmox kernel: fwln107i0: entered promiscuous mode
Nov 11 15:39:19 proxmox kernel: fwbr107i0: port 1(fwln107i0) entered blocking state
Nov 11 15:39:19 proxmox kernel: fwbr107i0: port 1(fwln107i0) entered forwarding state
Nov 11 15:39:19 proxmox kernel: fwbr107i0: port 2(tap107i0) entered blocking state
Nov 11 15:39:19 proxmox kernel: fwbr107i0: port 2(tap107i0) entered disabled state
Nov 11 15:39:19 proxmox kernel: tap107i0: entered allmulticast mode
Nov 11 15:39:19 proxmox kernel: fwbr107i0: port 2(tap107i0) entered blocking state
Nov 11 15:39:19 proxmox kernel: fwbr107i0: port 2(tap107i0) entered forwarding state
Nov 11 15:41:51 proxmox pvedaemon[1645579]: worker exit
Nov 11 15:41:51 proxmox pvedaemon[1481]: worker 1645579 finished
Nov 11 15:41:51 proxmox pvedaemon[1481]: starting 1 worker(s)
Nov 11 15:41:51 proxmox pvedaemon[1481]: worker 1677246 started
Nov 11 15:43:32 proxmox postfix/qmgr[1429]: E0106140EBB: from=<>, size=9810, nrcpt=1 (queue active)
Nov 11 15:43:32 proxmox postfix/qmgr[1429]: 0B197140EC6: from=<>, size=7137, nrcpt=1 (queue active)
Nov 11 15:43:32 proxmox postfix/qmgr[1429]: 517D2140EC1: from=<>, size=7195, nrcpt=1 (queue active)
Nov 11 15:43:32 proxmox postfix/local[1677824]: error: open database /etc/aliases.db: No such file or directory
Nov 11 15:43:32 proxmox postfix/local[1677824]: warning: hash:/etc/aliases is unavailable. open database /etc/aliases.db: No such file or directory
Nov 11 15:43:32 proxmox postfix/local[1677824]: warning: hash:/etc/aliases: lookup of 'root' failed
Nov 11 15:43:32 proxmox postfix/local[1677825]: error: open database /etc/aliases.db: No such file or directory
Nov 11 15:43:32 proxmox postfix/local[1677825]: warning: hash:/etc/aliases is unavailable. open database /etc/aliases.db: No such file or directory
Nov 11 15:43:32 proxmox postfix/local[1677825]: warning: hash:/etc/aliases: lookup of 'root' failed
Nov 11 15:43:32 proxmox postfix/local[1677824]: E0106140EBB: to=<root@proxmosx.local>, relay=local, delay=2362, delays=2362/0.02/0/0.01, dsn=4.3.0, status=deferred (alias database unavailable)
Nov 11 15:43:32 proxmox postfix/local[1677824]: warning: hash:/etc/aliases is unavailable. open database /etc/aliases.db: No such file or directory
Nov 11 15:43:32 proxmox postfix/local[1677824]: warning: hash:/etc/aliases: lookup of 'root' failed
Nov 11 15:43:32 proxmox postfix/local[1677825]: 0B197140EC6: to=<root@proxmosx.local>, relay=local, delay=1007, delays=1007/0.02/0/0, dsn=4.3.0, status=deferred (alias database unavailable)
Nov 11 15:43:32 proxmox postfix/local[1677824]: 517D2140EC1: to=<root@proxmosx.local>, relay=local, delay=1031, delays=1031/0.02/0/0, dsn=4.3.0, status=deferred (alias database unavailable)
Nov 11 15:45:38 proxmox pveproxy[1661287]: worker exit
Nov 11 15:45:38 proxmox pveproxy[1491]: worker 1661287 finished
Nov 11 15:45:38 proxmox pveproxy[1491]: starting 1 worker(s)
Nov 11 15:45:38 proxmox pveproxy[1491]: worker 1678569 started
Nov 11 15:46:54 proxmox pvedaemon[1668366]: <root@pam> starting task UPID:proxmox:00199E89:0CEF3A42:6913A0BE:vncshell::root@pam:
Nov 11 15:46:54 proxmox pvedaemon[1678985]: starting termproxy UPID:proxmox:00199E89:0CEF3A42:6913A0BE:vncshell::root@pam:
Nov 11 15:46:54 proxmox pveproxy[1662921]: worker exit
Nov 11 15:46:54 proxmox pveproxy[1491]: worker 1662921 finished
Nov 11 15:46:54 proxmox pveproxy[1491]: starting 1 worker(s)
Nov 11 15:46:54 proxmox pveproxy[1491]: worker 1678988 started
Nov 11 15:46:54 proxmox pvedaemon[1677246]: <root@pam> successful auth for user 'root@pam'
Nov 11 15:46:54 proxmox login[1678989]: pam_unix(login:session): session opened for user root(uid=0) by root(uid=0)
Nov 11 15:46:54 proxmox systemd-logind[1095]: New session 677 of user root.
Nov 11 15:46:54 proxmox systemd[1]: Created slice user-0.slice - User Slice of UID 0.
Nov 11 15:46:54 proxmox systemd[1]: Starting user-runtime-dir@0.service - User Runtime Directory /run/user/0...
Nov 11 15:46:54 proxmox systemd[1]: Finished user-runtime-dir@0.service - User Runtime Directory /run/user/0.
Nov 11 15:46:54 proxmox systemd[1]: Starting user@0.service - User Manager for UID 0...
Nov 11 15:46:54 proxmox (systemd)[1678995]: pam_unix(systemd-user:session): session opened for user root(uid=0) by (uid=0)
Nov 11 15:46:54 proxmox systemd[1678995]: Queued start job for default target default.target.
Nov 11 15:46:54 proxmox systemd[1678995]: Created slice app.slice - User Application Slice.
Nov 11 15:46:54 proxmox systemd[1678995]: Reached target paths.target - Paths.
Nov 11 15:46:54 proxmox systemd[1678995]: Reached target timers.target - Timers.
Nov 11 15:46:54 proxmox systemd[1678995]: Listening on dirmngr.socket - GnuPG network certificate management daemon.
Nov 11 15:46:54 proxmox systemd[1678995]: Listening on gpg-agent-browser.socket - GnuPG cryptographic agent and passphrase cache (access for web browsers).
Nov 11 15:46:54 proxmox systemd[1678995]: Listening on gpg-agent-extra.socket - GnuPG cryptographic agent and passphrase cache (restricted).
Nov 11 15:46:54 proxmox systemd[1678995]: Listening on gpg-agent-ssh.socket - GnuPG cryptographic agent (ssh-agent emulation).
Nov 11 15:46:54 proxmox systemd[1678995]: Listening on gpg-agent.socket - GnuPG cryptographic agent and passphrase cache.
Nov 11 15:46:54 proxmox systemd[1678995]: Reached target sockets.target - Sockets.
Nov 11 15:46:54 proxmox systemd[1678995]: Reached target basic.target - Basic System.
Nov 11 15:46:54 proxmox systemd[1678995]: Reached target default.target - Main User Target.
Nov 11 15:46:54 proxmox systemd[1678995]: Startup finished in 124ms.
Nov 11 15:46:54 proxmox systemd[1]: Started user@0.service - User Manager for UID 0.
Nov 11 15:46:54 proxmox systemd[1]: Started session-677.scope - Session 677 of User root.
Nov 11 15:46:54 proxmox login[1679010]: ROOT LOGIN  on '/dev/pts/0'
Nov 11 15:52:55 proxmox pvedaemon[1663213]: worker exit
Nov 11 15:52:55 proxmox pvedaemon[1481]: worker 1663213 finished
Nov 11 15:52:55 proxmox pvedaemon[1481]: starting 1 worker(s)
Nov 11 15:52:55 proxmox pvedaemon[1481]: worker 1681678 started
Nov 11 15:53:25 proxmox pvestatd[1451]: auth key pair too old, rotating..
Nov 11 15:53:32 proxmox postfix/qmgr[1429]: 3CAB8140E8A: from=<>, size=7261, nrcpt=1 (queue active)
Nov 11 15:53:32 proxmox postfix/local[1681853]: error: open database /etc/aliases.db: No such file or directory
Nov 11 15:53:32 proxmox postfix/local[1681853]: warning: hash:/etc/aliases is unavailable. open database /etc/aliases.db: No such file or directory
Nov 11 15:53:32 proxmox postfix/local[1681853]: warning: hash:/etc/aliases: lookup of 'root' failed
Nov 11 15:53:32 proxmox postfix/local[1681853]: 3CAB8140E8A: to=<root@proxmosx.local>, relay=local, delay=2239, delays=2239/0.03/0/0.01, dsn=4.3.0, status=deferred (alias database unavailable)
Nov 11 15:53:40 proxmox pvedaemon[1677246]: <root@pam> successful auth for user 'root@pam'
Nov 11 15:55:58 proxmox kernel: tap107i0: left allmulticast mode
Nov 11 15:55:58 proxmox kernel: fwbr107i0: port 2(tap107i0) entered disabled state
Nov 11 15:55:58 proxmox kernel: fwbr107i0: port 1(fwln107i0) entered disabled state
Nov 11 15:55:58 proxmox kernel: vmbr1: port 2(fwpr107p0) entered disabled state
Nov 11 15:55:58 proxmox kernel: fwln107i0 (unregistering): left allmulticast mode
Nov 11 15:55:58 proxmox kernel: fwln107i0 (unregistering): left promiscuous mode
Nov 11 15:55:58 proxmox kernel: fwbr107i0: port 1(fwln107i0) entered disabled state
Nov 11 15:55:58 proxmox kernel: fwpr107p0 (unregistering): left allmulticast mode
Nov 11 15:55:58 proxmox kernel: fwpr107p0 (unregistering): left promiscuous mode
Nov 11 15:55:58 proxmox kernel: vmbr1: port 2(fwpr107p0) entered disabled state
Nov 11 15:55:59 proxmox qmeventd[1096]: read: Connection reset by peer
Nov 11 15:55:59 proxmox pvedaemon[1676234]: ERROR: Backup of VM 107 failed - backup write data failed: command error: protocol canceled
Nov 11 15:55:59 proxmox pvedaemon[1676234]: INFO: Backup job finished with errors
Nov 11 15:55:59 proxmox pvedaemon[1676234]: job errors
Nov 11 15:55:59 proxmox postfix/pickup[1663694]: 3890E140EC7: uid=0 from=<root>
Nov 11 15:55:59 proxmox postfix/cleanup[1682799]: 3890E140EC7: message-id=<20251111205559.3890E140EC7@proxmosx.local>
Nov 11 15:55:59 proxmox postfix/qmgr[1429]: 3890E140EC7: from=<root@proxmosx.local>, size=5074, nrcpt=1 (queue active)
Nov 11 15:55:59 proxmox systemd[1]: 107.scope: Deactivated successfully.
Nov 11 15:55:59 proxmox systemd[1]: 107.scope: Consumed 30.605s CPU time.
Nov 11 15:55:59 proxmox qmeventd[1682795]: Starting cleanup for 107
Nov 11 15:55:59 proxmox qmeventd[1682795]: Finished cleanup for 107
 
Is the MTU the same along the whole network path?
MTU isn't specifically called out, so I assume it defaults to 1500?

Code:
auto lo
iface lo inet loopback

iface enp4s0 inet manual

auto enp5s0
iface enp5s0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.200/24
        gateway 192.168.1.1
        bridge-ports enp4s0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

auto vmbr1
iface vmbr1 inet static
        address 192.168.4.200/24
        gateway 192.168.4.1
        bridge-ports enp5s0
        bridge-stp off
        bridge-fd 0
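
Rather than assuming the default, the MTU actually in effect can be checked directly; run on both hosts:
Code:
# Look for the "mtu" value on each interface line; run on both PVE and PBS.
ip link show | grep mtu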
 
Since you wrote that PVE and PBS are separate, it can depend on other elements of the network.
I remember some posts in the forum showing the method to verify it. I'll probably find them again in a few minutes...
Edit: I can't find it now, and maybe it's not even needed, provided both servers have it at 1500...
Is it 1500 on PBS as well?
Edit 2: your message about the PBS MTU wasn't shown to me when I was editing my previous post. Sorry for the mess.
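
For what it's worth, the usual way to verify the path MTU end to end is a ping with the don't-fragment flag set; a sketch, run from PBS towards the PVE backup address posted above:
Code:
# 1472 bytes of payload + 28 bytes of ICMP/IP headers = a full 1500-byte frame.
# "message too long" here, while smaller sizes succeed, means a lower MTU in the path.
ping -M do -s 1472 -c 4 192.168.4.200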
 
Since you wrote that PVE and PBS are separate, it can depend on other elements of the network.
I remember some posts in the forum showing the method to verify it. I'll probably find them again in a few minutes...
Looks normal on the PBS as well.

Code:
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
        address 192.168.4.79/24
        gateway 192.168.4.1


source /etc/network/interfaces.d/*
 
So I don't believe this has anything to do with the network at this point. The backup starts off working fine and then dies. What network setting would allow expected performance at first and then basically lock up?
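
One way to narrow down where it stalls is to watch the datastore disk on PBS while a backup runs; a sketch, assuming the sysstat package is installed and the datastore sits on the /dev/sdb disk from the earlier post:
Code:
# Sample extended disk statistics every 5 seconds during a backup run.
# High %util with low MB/s would point at the disk rather than the network.
iostat -x 5 /dev/sdb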


 
OK, I dug some more into iperf3. The retries are raising a flag for me here. It seems to happen in only one direction.

Code:
iperf3 Forward
[SUM]   0.00-10.00  sec  1.10 GBytes   945 Mbits/sec  3696             sender
[SUM]   0.00-10.05  sec  1.08 GBytes   927 Mbits/sec                  receiver

iperf3 Reverse
[SUM]   0.00-10.00  sec  1.11 GBytes   950 Mbits/sec    4             sender
[SUM]   0.00-10.00  sec  1.09 GBytes   939 Mbits/sec                  receiver
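
For context, the Retr column counts sender-side TCP retransmissions, so roughly 3700 retransmits one way versus 4 the other points at packet loss in a single direction. The two runs above correspond to something like the following; the server address and stream count are assumptions:
Code:
# Assumes PBS runs the server side (iperf3 -s); the [SUM] lines imply parallel streams.
iperf3 -c 192.168.4.79 -P 4       # forward: client sends to PBS
iperf3 -c 192.168.4.79 -P 4 -R    # reverse: PBS sends back to the client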
 
OK, I dug some more into iperf3. The retries are raising a flag for me here. It seems to happen in only one direction.

Code:
iperf3 Forward
[SUM]   0.00-10.00  sec  1.10 GBytes   945 Mbits/sec  3696             sender
[SUM]   0.00-10.05  sec  1.08 GBytes   927 Mbits/sec                  receiver

iperf3 Reverse
[SUM]   0.00-10.00  sec  1.11 GBytes   950 Mbits/sec    4             sender
[SUM]   0.00-10.00  sec  1.09 GBytes   939 Mbits/sec                  receiver
Do you have another NIC port you can test with?
Is there a LAG in play on the switch side, or just a single 1 Gb connection?
Is the previously posted NIC config for PBS or PVE?

Sorry, I've read over the previous posts and am looking for clarity.

ta
 
Do you have another NIC port you can test with?
Is there a LAG in play on the switch side, or just a single 1 Gb connection?
Is the previously posted NIC config for PBS or PVE?

Sorry, I've read over the previous posts and am looking for clarity.

ta
The 1st NIC posting was PVE, the 2nd was PBS. It's just a single gig connection, and I tried a different Ethernet cable. PBS is a Dell server with onboard dual-gig Broadcom NICs. I'll dig out a PCIe NIC and see if that makes a difference, but I'm not optimistic.
 
The 1st NIC posting was PVE, the 2nd was PBS. It's just a single gig connection, and I tried a different Ethernet cable. PBS is a Dell server with onboard dual-gig Broadcom NICs. I'll dig out a PCIe NIC and see if that makes a difference, but I'm not optimistic.
It's a good route for troubleshooting to eliminate all possible hardware issues, and checking another NIC is a good place to start.
I would also check the ports on the switch and power cycle the switch if possible; if that's not possible and the switch is managed, bring the port down and back up or restart it.
If you are able to power cycle both servers, give that a go as well.
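
Error and drop counters on the NIC can also help confirm a hardware or cabling fault before swapping parts; roughly, with the interface name taken from the PVE config above (use the matching name on PBS):
Code:
# Kernel-side packet/error/drop counters for the interface.
ip -s link show enp5s0
# Driver/NIC-level counters; filter for anything that looks like an error.
ethtool -S enp5s0 | grep -iE 'err|drop|crc'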

What versions of PBS and PVE are you running?
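
For reference, something like this prints them:
Code:
pveversion -v                     # on the PVE host
proxmox-backup-manager versions   # on the PBS host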

""G