What NICs are used for the routes the backup traffic takes? There were similar reports where TCP segmentation offloading to the NIC firmware was at fault, see
https://forum.proxmox.com/threads/pbs-sync-failed-each-time.113921/#post-573939. In those cases, however, the backup failed with errors rather than stalling as you describe.
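To rule out offloading, the setting could be inspected and temporarily switched off with ethtool. This is only a rough sketch: the interface name eno1 is a placeholder (it does not come from this thread), and the helper just prints the commands rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# Dry-run helper: prints the ethtool commands for checking and temporarily
# disabling TCP segmentation offload (TSO) on one interface.
# The interface name is an assumption; substitute the NIC that carries
# the backup traffic.
tso_commands() {
    iface="$1"
    # Lowercase -k only reads the current offload settings.
    printf 'ethtool -k %s | grep tcp-segmentation-offload\n' "$iface"
    # Uppercase -K changes a setting; this is not persistent across reboots.
    printf 'ethtool -K %s tso off\n' "$iface"
}

tso_commands "${IFACE:-eno1}"
```

If the stall disappears with TSO off, that would point at the NIC or its firmware; if nothing changes, the setting can be turned back on with `tso on`.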
First off, please post the output of pveversion -v and proxmox-backup-manager version --verbose. Further, check the system journal on both sides for errors around the time the backup stalls. While the backup is stalled, can you ping the PBS from the PVE host?
Hi,
the NIC used on the PVE node is a BCM57412 NetXtreme-E 10Gb RDMA Ethernet Controller by Broadcom.
Unfortunately, I can't tell which one is used on the PBS side, as it is running inside a KVM virtual machine.
However, I can say that backing up VMs to the same PBS works from other PVE nodes, even ones with the same hardware specs.
A couple of days ago I already had this problem on another PVE node. I have since replaced that server with a completely new one running a freshly installed PVE instance. The only thing I did was migrate the VMs from the old node to the new one. I migrated the VM disks as well as the config files manually.
I've already tried so many things, nothing works.
I checked the journal on both the PVE and PBS nodes during the stalled backup – no obvious errors on either side, except for occasional connection resets on the PBS side.
While the backup is stalled, pinging the PBS from the PVE host fails for packet sizes above ~1200 bytes with -M do. It shows "message too long" errors, even though the MTU is set to 1500 on both ends. Smaller packets go through.
Regular traffic (SSH, web UI, etc.) works fine. Only during the backup does the connection appear to drop or degrade significantly.
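For anyone wanting to reproduce the ping test, here is a rough sketch of how the payload sizes can be worked out. It assumes IPv4 without IP options, so the ICMP payload passed to -s is the MTU minus 28 bytes (20 bytes IP header plus 8 bytes ICMP header). The host name pbs.example.com is a placeholder, and the script only prints the ping commands instead of running them.

```shell
#!/bin/sh
# Largest ICMP payload that fits in one unfragmented IPv4 packet of the
# given MTU: MTU - 20 (IPv4 header, no options) - 8 (ICMP header).
icmp_payload() {
    echo $(( $1 - 28 ))
}

# Print (do not run) a ping command that forbids fragmentation (-M do)
# and sends a payload sized for the given MTU.
# The host argument is a placeholder supplied by the caller.
probe_cmd() {
    mtu="$1"
    host="$2"
    printf 'ping -M do -c 3 -s %s %s\n' "$(icmp_payload "$mtu")" "$host"
}

# Step down from the configured MTU to see where packets start to pass.
for mtu in 1500 1400 1300 1200; do
    probe_cmd "$mtu" "${PBS_HOST:-pbs.example.com}"
done
```

With an MTU of 1500 on every hop, the 1472-byte payload should pass; if only the smaller sizes go through, some device on the path is presumably using a lower MTU than the endpoints.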
This is pveversion -v:
Code:
root@root428:~# pveversion -v
proxmox-ve: 8.4.0 (running kernel: 6.8.12-11-pve)
pve-manager: 8.4.1 (running version: 8.4.1/2a5fa54a8503f96d)
proxmox-kernel-helper: 8.1.1
proxmox-kernel-6.8.12-11-pve-signed: 6.8.12-11
proxmox-kernel-6.8: 6.8.12-11
ceph-fuse: 16.2.15+ds-0+deb12u1
corosync: 3.1.9-pve1
criu: 3.17.1-2+deb12u1
glusterfs-client: 10.3-5
ifupdown: residual config
ifupdown2: 3.2.0-1+pmx11
libjs-extjs: 7.0.0-5
libknet1: 1.30-pve2
libproxmox-acme-perl: 1.6.0
libproxmox-backup-qemu0: 1.5.1
libproxmox-rs-perl: 0.3.5
libpve-access-control: 8.2.2
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.1.0
libpve-cluster-perl: 8.1.0
libpve-common-perl: 8.3.1
libpve-guest-common-perl: 5.2.2
libpve-http-server-perl: 5.2.2
libpve-network-perl: 0.11.2
libpve-rs-perl: 0.9.4
libpve-storage-perl: 8.3.6
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.6.0-2
proxmox-backup-client: 3.4.1-1
proxmox-backup-file-restore: 3.4.1-1
proxmox-firewall: 0.7.1
proxmox-kernel-helper: 8.1.1
proxmox-mail-forward: 0.3.2
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.7
proxmox-widget-toolkit: 4.3.11
pve-cluster: 8.1.0
pve-container: 5.2.6
pve-docs: 8.4.0
pve-edk2-firmware: not correctly installed
pve-esxi-import-tools: 0.7.4
pve-firewall: 5.1.1
pve-firmware: 3.15-4
pve-ha-manager: 4.0.7
pve-i18n: 3.4.4
pve-qemu-kvm: 9.2.0-5
pve-xtermjs: 5.5.0-2
qemu-server: 8.3.12
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.7-pve2
And this is proxmox-backup-manager version --verbose:
Code:
root@pbs45:~# proxmox-backup-manager version --verbose
proxmox-backup 3.4.0 running kernel: 6.8.12-11-pve
proxmox-backup-server 3.4.1-1 running version: 3.4.1
proxmox-kernel-helper 8.1.1
proxmox-kernel-6.8.12-11-pve-signed 6.8.12-11
proxmox-kernel-6.8 6.8.12-11
proxmox-kernel-6.8.12-9-pve-signed 6.8.12-9
ifupdown2 3.2.0-1+pmx11
libjs-extjs 7.0.0-5
proxmox-backup-docs 3.4.1-1
proxmox-backup-client 3.4.1-1
proxmox-mail-forward 0.3.2
proxmox-mini-journalreader 1.4.0
proxmox-offline-mirror-helper 0.6.7
proxmox-widget-toolkit 4.3.11
pve-xtermjs 5.5.0-2
smartmontools 7.3-pve1
zfsutils-linux 2.2.7-pve2
Thanks in advance