Context:
I'm running two Proxmox VE nodes, PVE2 at my house and PVE1 at the data center, both backing up to a remote Proxmox Backup Server (PBS) on a VPS. Both nodes use the Cloudflare WARP Connector for connectivity.
PVE2 backups complete successfully.
PVE1 backups consistently fail partway through.
Network Setup
- PVE2 (Home):
  - UDM-Pro in front of pfSense
  - WARP MTU: 1280
  - No issues backing up
- PVE1 (DC):
  - Only pfSense at the edge
  - Same WARP MTU: 1280
  - Backups fail mid-transfer with the error "pipelined request failed: timed out"
Troubleshooting Done
- Verified permissions (identical between nodes)
- MTU on the WARP interface (CloudflareWARP) is 1280
- Disabled TSO/GSO/GRO offloads with ethtool (backups still fail)
- Logs from PBS show:
2025-07-21T22:06:45-06:00: starting new backup on datastore 'PTU-Backup1' from ::ffff:192.168.1.5: "vm/100/2025-07-22T04:06:45Z"
2025-07-21T22:06:46-06:00: GET /previous: 400 Bad Request: no valid previous backup
2025-07-21T22:06:46-06:00: created new fixed index 1 ("vm/100/2025-07-22T04:06:45Z/drive-sata0.img.fidx")
2025-07-21T22:06:46-06:00: add blob "/Backup/PVE1/vm/100/2025-07-22T04:06:45Z/qemu-server.conf.blob" (500 bytes, comp: 500)
2025-07-21T22:08:03-06:00: backup failed: connection error: bytes remaining on stream
2025-07-21T22:08:03-06:00: removing failed backup
2025-07-21T22:08:03-06:00: POST /fixed_chunk: 400 Bad Request: error reading a body from connection: bytes remaining on stream
2025-07-21T22:08:03-06:00: TASK ERROR: connection error: bytes remaining on stream
- Logs from PVE1 show:
INFO: starting new backup job: vzdump 100 --node pve --notes-template '{{guestname}}' --mode snapshot --notification-mode auto --remove 0 --storage PTU-Backup-Server
INFO: Starting Backup of VM 100 (qemu)
INFO: Backup started at 2025-07-21 22:06:45
INFO: status = running
INFO: VM Name: Optimus-WOO-Prod-Store
INFO: include disk 'sata0' 'local-lvm:vm-100-disk-0' 120G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/100/2025-07-22T04:06:45Z'
INFO: started backup task 'b78a7df9-60a8-4036-a41b-dc50047bd548'
INFO: resuming VM again
INFO: sata0: dirty-bitmap status: existing bitmap was invalid and has been cleared
INFO: 0% (160.0 MiB of 120.0 GiB) in 3s, read: 53.3 MiB/s, write: 33.3 MiB/s
INFO: 0% (160.0 MiB of 120.0 GiB) in 1m 17s, read: 0 B/s, write: 0 B/s
ERROR: backup write data failed: command error: protocol canceled
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 100 failed - backup write data failed: command error: protocol canceled
INFO: Failed at 2025-07-21 22:08:03
INFO: Backup job finished with errors
INFO: notified via target `mail-to-root`
TASK ERROR: job errors
- Adjusted pfSense timeouts, which made no difference
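The two progress lines in the PVE1 log show the transfer stuck at 160.0 MiB from 3s to 1m 17s before the job is canceled. Below is a minimal Python sketch for spotting that kind of stall in a saved vzdump task log. The function names `parse_progress` and `find_stalls` and the 30-second threshold are illustrative assumptions, not part of any Proxmox tool, and the regex is only checked against the line format shown above.

```python
import re

# Matches vzdump progress lines such as:
#   INFO: 0% (160.0 MiB of 120.0 GiB) in 1m 17s, read: 0 B/s, write: 0 B/s
PROGRESS_RE = re.compile(
    r"INFO:\s+\d+% \((?P<done>[\d.]+) (?P<unit>[KMGT]iB|B) of .*?\) "
    r"in (?P<elapsed>(?:\d+[hms]\s*)+),"
)

UNIT_BYTES = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}
UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_progress(line):
    """Return (elapsed_seconds, bytes_done) for a progress line, else None."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    done = int(float(match["done"]) * UNIT_BYTES[match["unit"]])
    elapsed = sum(
        int(number) * UNIT_SECONDS[unit]
        for number, unit in re.findall(r"(\d+)([hms])", match["elapsed"])
    )
    return elapsed, done


def find_stalls(lines, min_seconds=30):
    """Return (start_s, end_s, bytes_done) for spans with no progress."""
    stalls = []
    previous = None
    for line in lines:
        current = parse_progress(line)
        if current is None:
            continue  # not a progress line
        if (
            previous is not None
            and current[1] == previous[1]
            and current[0] - previous[0] >= min_seconds
        ):
            stalls.append((previous[0], current[0], current[1]))
        previous = current
    return stalls


sample = [
    "INFO: 0% (160.0 MiB of 120.0 GiB) in 3s, read: 53.3 MiB/s, write: 33.3 MiB/s",
    "INFO: 0% (160.0 MiB of 120.0 GiB) in 1m 17s, read: 0 B/s, write: 0 B/s",
]
for start, end, done in find_stalls(sample):
    print(f"no progress for {end - start}s at {done / 1024**2:.1f} MiB")
```

On the two lines from the log this reports a 74-second span with no progress at 160.0 MiB, which lines up with the PBS side giving up at 22:08:03.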
Hypothesis
It looks like packet fragmentation or MTU issues over Cloudflare WARP cause bulk transfers to stall silently, even though control-plane communication works. Oddly, PVE2 works fine while PVE1 fails every time, despite the same MTU and config.
My Questions:
- Has anyone else seen WARP-related fragmentation issues specifically with Proxmox backups?
- Would clamping MSS on the WARP interface help?
- Is this a known limitation of using WARP for large TCP streams (like vzdump image uploads)?
- Are there tunable buffer or TCP keepalive parameters that help with large backup uploads over WARP?
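For the MSS clamping question, the arithmetic can be sketched as follows. This assumes plain IPv4 or IPv6 headers with no IP or TCP options, and the function name `tcp_mss` is only illustrative. The PBS log shows the client as ::ffff:192.168.1.5, an IPv4-mapped address, so the IPv4 case is probably the relevant one here.

```python
IPV4_HEADER = 20  # bytes, no IP options
IPV6_HEADER = 40  # bytes, fixed header only
TCP_HEADER = 20   # bytes, no TCP options


def tcp_mss(mtu, ipv6=False):
    """Largest TCP payload per segment that fits in one packet of size mtu."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - TCP_HEADER


print(tcp_mss(1280))             # IPv4 over the 1280-byte WARP interface
print(tcp_mss(1280, ipv6=True))  # IPv6 over the same interface
print(tcp_mss(1500))             # what a standard 1500-byte LAN interface would advertise
```

With these assumptions a clamp for a 1280-byte tunnel would be 1240 for IPv4 (1220 for IPv6), while a host that derives its MSS from a 1500-byte interface would advertise 1460. If that larger value is what ends up on the connection and "fragmentation needed" ICMP messages are dropped somewhere on the path, small control requests would still fit while full-size data segments would not, which would match the symptoms.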
Bonus Info
- PBS server is reachable over WARP on both ends
- ip link confirms WARP MTU is 1280
- I can ping large packets with the DF bit set, but only up to a payload of 1272 bytes
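To make the DF ping result easier to compare between PVE1 and PVE2, here is a rough Python sketch that searches for the largest ICMP payload that gets through. It assumes Linux iputils `ping` (the `-4`, `-M do`, `-c`, `-W`, and `-s` options) and IPv4. The helper names are illustrative, and the search assumes that if a given size passes, every smaller size passes too.

```python
import subprocess

IPV4_HEADER = 20  # bytes, no IP options
ICMP_HEADER = 8   # bytes, echo request header


def packet_size(payload):
    """On-the-wire IPv4 packet size for a given ICMP echo payload."""
    return payload + ICMP_HEADER + IPV4_HEADER


def ping_df(host, payload, timeout=2):
    """Send one IPv4 echo request with DF set; True if a reply came back."""
    result = subprocess.run(
        ["ping", "-4", "-M", "do", "-c", "1", "-W", str(timeout),
         "-s", str(payload), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def largest_passing_payload(probe, low=0, high=1472):
    """Binary-search the largest payload in [low, high] for which probe() is True."""
    if not probe(low):
        return None  # even the smallest payload fails
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always shrinks
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


# Example call against a real host (hostname is a placeholder):
#   largest_passing_payload(lambda size: ping_df("pbs.example.internal", size))
print(packet_size(1272))  # packet size for the 1272-byte payload reported above
print(packet_size(1252))  # largest payload that fits a 1280-byte IPv4 MTU
```

If the 1272 figure is an IPv4 `-s` payload, it corresponds to a 1300-byte packet, whereas a 1280-byte MTU would normally cap the payload at 1252. Running the same probe from both nodes toward the PBS address might show whether the two paths really behave the same.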
Would really appreciate community input or a workaround. Would rather not ditch WARP if I can help it — it’s secure, simple, and works well except for this.
Thanks in advance!