Live Migration - Fails on a large disk

DamonOnYT

Hi all,

I am running Proxmox VE 9.0.10 with 2 servers in my cluster. I have a relatively large VM disk that fails on live migration every time. The source host uses LVM storage and the destination uses ZFS.

I have attached the full migration log, but the main error is:

channel 4: open failed: connect failed: open failed
2025-10-01 15:35:58 migration status error: failed - Unable to write to socket: Broken pipe
 


Hi,

it seems the error might be caused by the receiving server. Have a look at the logs there for any potential problems, e.g. with journalctl --since "2025-10-01 15:35:00" --until "2025-10-01 15:36:30"
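
For example (a minimal sketch; adjust the timestamps to your own failed attempt, and note that the ssh unit name can differ depending on your setup):

# All system messages on the destination node around the failure window
journalctl --since "2025-10-01 15:35:00" --until "2025-10-01 15:36:30"
# The sshd messages at that moment are often the most telling part
journalctl -u ssh --since "2025-10-01 15:35:00" --until "2025-10-01 15:36:30"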
 
The log shows the disk mirror starting over NBD via SSH, making progress for several minutes, and then the channel failing (“open failed … Broken pipe”). That is a classic sign of a transport drop under sustained load.
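One rough way to test that suspicion (a sketch only; the hostname pve-target is a placeholder for your destination node): stream a large amount of data over the same SSH path the migration uses and see whether the session survives sustained load.

# Pushes ~20 GiB of zeros through ssh to the destination and discards it there
dd if=/dev/zero bs=1M count=20480 status=progress | ssh root@pve-target 'cat > /dev/null'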

Some things you can check (example commands follow after the list):

Are MTUs identical end-to-end on the migration path (source NIC, switches, target NIC)? Can you ping -M do -s 8972 (or your MTU minus 28 bytes of headers) both ways?

What do iperf3 tests show between the exact migration interfaces? Any packet loss/duplex mismatch?

Any custom MaxSessions / MaxStartups in /etc/ssh/sshd_config on either node? Any sshd logs showing channel failures at the same timestamp?

Is the Proxmox firewall or an external firewall in play? If so, temporarily disable it on both nodes to test.

If you are using VLANs or jumbo frames, are offloads (TSO/LRO/GRO) causing issues? Try disabling them on both ends to test.
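
Example commands for the checks above (a sketch only; the address 192.0.2.10 and the interface name eno1 are placeholders for your own migration network):

# 1) Path MTU check, assuming a 9000-byte MTU (8972 = 9000 minus 28 bytes of IP/ICMP headers); run it in both directions
ping -M do -s 8972 -c 5 192.0.2.10

# 2) Throughput and loss between the exact migration interfaces; start `iperf3 -s` on the destination first
iperf3 -c 192.0.2.10 -t 30

# 3) Look for custom sshd limits on either node
grep -Ei 'maxsessions|maxstartups' /etc/ssh/sshd_config

# 4) Temporarily disable offloads on the migration NIC to test
ethtool -K eno1 tso off gro off lro off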

Hope these help you find some clues.