VM Backup fails on a (Hetzner) SSHFS datastore - broken pipe

hraph

Member
Nov 17, 2020
Hello there,

I've got an issue with PBS 3.2 and PVE 8.2 VM Backups to SSHFS Storage.

I'm experiencing continuous backup failures when backing up VMs to a remote SSHFS datastore. Container backups work fine (4GB backups complete successfully), but VM backups always fail with a broken pipe error after transferring around 1-2GB of data.

Error message:
ERROR: backup write data failed: command error: write_data upload error: stream closed because of a broken pipe

Full trace:
INFO: starting new backup job: vzdump 200 --mode snapshot --notes-template '{{guestname}}' --remove 0 --notification-mode auto --storage backup-server --node xx
INFO: Starting Backup of VM 200 (qemu)
INFO: Backup started at 2025-01-11 00:46:17
INFO: status = running
INFO: VM Name: xxx
INFO: include disk 'scsi0' 'vms:vm-200-disk-0' 52G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/200/2025-01-10T23:46:17Z'
INFO: enabling encryption
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '5d1a0e26-c7ed-44bf-b991-620f04f5f081'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: existing bitmap was invalid and has been cleared
INFO:   0% (488.0 MiB of 52.0 GiB) in 3s, read: 162.7 MiB/s, write: 153.3 MiB/s
INFO:   1% (684.0 MiB of 52.0 GiB) in 10s, read: 28.0 MiB/s, write: 28.0 MiB/s
INFO:   1% (912.0 MiB of 52.0 GiB) in 13s, read: 76.0 MiB/s, write: 76.0 MiB/s
ERROR: backup write data failed: command error: write_data upload error: stream closed because of a broken pipe
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 200 failed - backup write data failed: command error: write_data upload error: stream closed because of a broken pipe
INFO: Failed at 2025-01-11 00:46:33
INFO: Backup job finished with errors
TASK ERROR: job errors

Current SSHFS mount configuration:
/usr/bin/sshfs -f -o nodev,rw,reconnect,allow_other,uid=0,gid=0,ServerAliveInterval=15,ServerAliveCountMax=3,cache=no hetzner:/ /mnt/hetzner

I've been monitoring the I/O on PBS with iftop, and I see a much higher input bandwidth from PVE than the actual upload bandwidth to the remote SSHFS server, so I suspect it's a cache issue.
I've tried:
- Different cache settings (cache=yes/no)
- Adding/removing kernel_cache
- Adding direct_io (generates an ENODEV: No such device error during a backup)

Has anyone encountered similar issues with SSHFS or have an idea?
Thanks in advance
 
Has anyone encountered similar issues with SSHFS or have an idea?
Personally, I don't have any experience with this, since I follow the PBS developers' recommendation not to use network storage at all.
Using a network file system together with PBS is not supported, even on a local network. Over the Internet it's even worse, see:


Please note the discussion about the validity of the benchmark approach (some Proxmox developers raised critical points in this thread: https://forum.proxmox.com/threads/developer-question-about-chunk-folder.148167/ ), but the picture is clear: network storage is a bad idea for PBS:

To me, the error message looks like sshfs and Hetzner's Storage Box don't play nicely together for large transfers. To test this, you could check whether a backup to a local share or to an SSH server on your local network works better.
But in the end this is more of an academic exercise than anything else: abusing sshfs/cifs or nfs for such things isn't recommended, so I wouldn't bother.
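If someone still wants to run that comparison, here is a minimal sketch of a transport-only check, independent of PBS. `write_test` and the file name `sshfs-write-test.bin` are made up for illustration. It only checks whether one large file survives the trip to the mount; PBS writes many separate chunk files, so a pass here doesn't prove the datastore will behave.

```shell
#!/bin/sh
# Hypothetical helper (not part of PBS or sshfs): copy SIZE_MIB MiB of random
# data into TARGET_DIR and compare the copy with the source afterwards.
write_test() {
    target_dir=$1
    size_mib=$2
    src=$(mktemp) || return 1
    dst="$target_dir/sshfs-write-test.bin"

    # Random data, so compression on the link cannot flatter the result.
    dd if=/dev/urandom of="$src" bs=1M count="$size_mib" status=none || {
        rm -f "$src"
        return 1
    }

    if cp "$src" "$dst" && cmp -s "$src" "$dst"; then
        echo "OK: wrote and verified ${size_mib} MiB in ${target_dir}"
        rc=0
    else
        echo "FAILED: could not write or verify ${size_mib} MiB in ${target_dir}"
        rc=1
    fi

    # Clean up both the local source file and the copy on the target.
    rm -f "$src" "$dst"
    return "$rc"
}
```

Usage would be something like `write_test /mnt/hetzner 4096` for the Storage Box mount and the same call against a local directory or a LAN sshfs mount for comparison. Note that the source file is created in the local temp directory first, so that needs enough free space.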
What I do is this: I have a netcup vServer running Debian and PBS. A sync job on it pulls the new backups from my local PBS VM.
I described more details of my setup (and why using rclone isn't an option either) in an older forum post:
 
Hello, thank you for your reply. I found another post on this forum, and it turns out I wasn't the only one experiencing this problem. The author suggested backing up locally and using a sync job to send the backups to the cloud. This works perfectly, and I am happy not to use external tools. Thanks for your suggestions.
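
For readers who land here with the same problem, the "back up locally, then sync" setup can be sketched roughly as follows. This is an assumption-laden outline, not the exact configuration used in this thread: it assumes a second PBS instance offsite that pulls from the local PBS, and every name, host, datastore, and credential below is a placeholder. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Run on the OFFSITE PBS, which pulls from the local one.
# All values are placeholders for illustration; replace them with your own.
REMOTE_NAME="home-pbs"            # label for the local PBS as seen from offsite
REMOTE_HOST="pbs.home.example"    # address of the local PBS
AUTH_ID="sync@pbs"                # user on the local PBS with read access
PASSWORD="CHANGE-ME"              # that user's password
FINGERPRINT="CHANGE-ME"           # certificate fingerprint of the local PBS
REMOTE_STORE="local-store"        # datastore on the local PBS
LOCAL_STORE="offsite-store"       # datastore on the offsite PBS
JOB_ID="pull-home"                # name of the sync job

DRY_RUN=${DRY_RUN:-1}

# Print each command (including the password placeholder);
# execute it only when DRY_RUN is not 1.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

setup_pull_sync() {
    # Register the local PBS as a remote on the offsite PBS.
    run proxmox-backup-manager remote create "$REMOTE_NAME" \
        --host "$REMOTE_HOST" --auth-id "$AUTH_ID" \
        --password "$PASSWORD" --fingerprint "$FINGERPRINT"

    # Create a scheduled job that pulls new snapshots from the remote datastore.
    run proxmox-backup-manager sync-job create "$JOB_ID" \
        --remote "$REMOTE_NAME" --remote-store "$REMOTE_STORE" \
        --store "$LOCAL_STORE" --schedule daily
}

setup_pull_sync
```

The point of this layout is that PVE only ever writes to a datastore on local disk, and the slow Internet link is handled by PBS's own sync mechanism instead of a FUSE mount. The same remote and sync job can also be created in the PBS web interface.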