VM Migration hangs

jeffgott

New Member
Jan 12, 2026
I am trying to migrate a VM from one server to another in the same cluster (the cluster has two servers). I am running Proxmox VE 9.1.4. The VM in question has two disks: disk 0 is 120 GB, disk 1 is 350 GB. The migration appears to run (although it takes over a day) but hangs at the end:

1769966190829.png

The last message was at 10:16, and this screenshot was taken about 2 hours later.

I then tried to stop the migration by clicking the Stop button. The server then showed "Loading..." for quite some time.

1769966263995.png

Then I had a communication failure:

1769966329692.png

I rebooted the server and now get a "Connection refused (595)" message:
1769966509720.png

I cannot connect to the server even with SSH.

I'm concerned that Proxmox is not the answer for us.

Any thoughts on what is going on?
 
Connect a keyboard and monitor and check journalctl -kr.
I see some errors in the output. Is there something I should be looking for?

1769973438392.png

1769973492284.png

1769973523996.png

I am also using Veeam 13, and it successfully installed the Veeam worker VM on the new Proxmox server. When trying to install the worker on the older server, it fails to create the worker VM; that also hangs.

Are these errors telling me that Proxmox is not compatible with the older server? It is an HP DL380 Gen8, and it is the server I am migrating the VM to and where I am installing the Veeam worker.
 
With logs you rarely know exactly what to look for. Just share all of them (as text). What hardware do the servers use?
 
I only have a phone right now but it looks like it just booted? You might have to check the last boot's log. Something like journalctl -b -1 -kr.
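For example, something like this (assuming the journal is kept persistent across reboots, i.e. /var/log/journal exists; otherwise -b -1 will have nothing):
Bash:
# list the boots the journal knows about
journalctl --list-boots
# kernel messages (-k) from the previous boot (-b -1), newest first (-r)
journalctl -b -1 -kr
# or dump the whole previous boot to a file you can attach here
journalctl -b -1 > previous-boot.log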
 
I only have a phone right now but it looks like it just booted? You might have to check the last boot's log. Something like journalctl -b -1 -kr.
Sorry for the delay. I tried to add the Veeam worker again and it failed again. Logs attached.
 

Attachments

That is more interesting.
Bash:
Feb 01 19:28:12 pve3 kernel:  </TASK>
Feb 01 19:28:12 pve3 kernel: R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
Feb 01 19:28:12 pve3 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00006061e5e21080
Feb 01 19:28:12 pve3 kernel: RBP: 000000000000000b R08: 0000000000000000 R09: 0000000000000000
Feb 01 19:28:12 pve3 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000000000b
Feb 01 19:28:12 pve3 kernel: RAX: ffffffffffffffda RBX: 0000726d4cc67500 RCX: 0000726d4f0a59ee
Feb 01 19:28:12 pve3 kernel: RSP: 002b:00007ffdc481f778 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
Feb 01 19:28:12 pve3 kernel: RIP: 0033:0x726d4f0a59ee
Feb 01 19:28:12 pve3 kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
Feb 01 19:28:12 pve3 kernel:  ? exc_page_fault+0x90/0x1b0
Feb 01 19:28:12 pve3 kernel:  ? irqentry_exit+0x43/0x50
Feb 01 19:28:12 pve3 kernel:  ? irqentry_exit_to_user_mode+0x2e/0x290
Feb 01 19:28:12 pve3 kernel:  ? do_user_addr_fault+0x2f8/0x830
Feb 01 19:28:12 pve3 kernel:  ? handle_mm_fault+0x254/0x370
Feb 01 19:28:12 pve3 kernel:  ? count_memcg_events+0xd7/0x1a0
Feb 01 19:28:12 pve3 kernel:  ? __handle_mm_fault+0x62a/0xfd0
Feb 01 19:28:12 pve3 kernel:  ? numa_rebuild_single_mapping.isra.0+0x13f/0x1c0
Feb 01 19:28:12 pve3 kernel:  ? mpol_misplaced+0x69/0x230
Feb 01 19:28:12 pve3 kernel:  ? task_numa_fault+0x68/0xb90
Feb 01 19:28:12 pve3 kernel:  ? node_is_toptier+0x42/0x60
Feb 01 19:28:12 pve3 kernel:  do_syscall_64+0x80/0xa30
Feb 01 19:28:12 pve3 kernel:  x64_sys_call+0x1742/0x2330
Feb 01 19:28:12 pve3 kernel:  __x64_sys_close+0x3e/0x90
Feb 01 19:28:12 pve3 kernel:  fput_close_sync+0x3d/0xa0
Feb 01 19:28:12 pve3 kernel:  __fput+0xed/0x2d0
Feb 01 19:28:12 pve3 kernel:  blkdev_release+0x11/0x20
Feb 01 19:28:12 pve3 kernel:  bdev_release+0x171/0x1b0
Feb 01 19:28:12 pve3 kernel:  filemap_write_and_wait_range+0xd5/0x130
Feb 01 19:28:12 pve3 kernel:  __filemap_fdatawait_range+0x87/0xf0
Feb 01 19:28:12 pve3 kernel:  folio_wait_writeback+0x2b/0xa0
Feb 01 19:28:12 pve3 kernel:  folio_wait_bit+0x18/0x30
Feb 01 19:28:12 pve3 kernel:  ? __pfx_wake_page_function+0x10/0x10
Feb 01 19:28:12 pve3 kernel:  folio_wait_bit_common+0x124/0x2f0
Feb 01 19:28:12 pve3 kernel:  io_schedule+0x4c/0x80
Feb 01 19:28:12 pve3 kernel:  schedule+0x27/0xf0
Feb 01 19:28:12 pve3 kernel:  ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
Feb 01 19:28:12 pve3 kernel:  __schedule+0x468/0x1310
Feb 01 19:28:12 pve3 kernel:  <TASK>
Feb 01 19:28:12 pve3 kernel: Call Trace:
Feb 01 19:28:12 pve3 kernel: task:qemu-img        state:D stack:0     pid:54031 tgid:54031 ppid:53944  task_flags:0x400100 flags:0x00004002
Feb 01 19:28:12 pve3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 01 19:28:12 pve3 kernel:       Tainted: P          IO        6.17.4-2-pve #1
Feb 01 19:28:12 pve3 kernel: INFO: task qemu-img:54031 blocked for more than 122 seconds.

That trace shows qemu-img stuck in uninterruptible sleep (D state), waiting for writeback to a block device to finish, so the storage is the first suspect. What kind of storage does 105 use?
Bash:
cat /etc/pve/storage.cfg
lsblk -o+FSTYPE,LABEL,MODEL
qm config 105
 
Here are the results:

1770034904491.png

1770034974528.png

1770035096774.png

VM 105 was the VM I was migrating last week. It hung and never completed. It looks like the disk files were left there. It seems like the issue occurs when a VM is created through a migration or a third-party app (like Veeam). I can create a new VM from the browser menu without issue.
 
Hmm. I see no guest volumes on pve3. You might need to run them on the new host for 105. I'm assuming the disks were on local-lvm? I'm not sure why I didn't ask for this before, but can you share the migration task log too?
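If it is no longer visible in the GUI's task list, the node keeps finished task logs on disk; something like this should locate it (paths assume a default PVE install, and the UPID in the comment is only a placeholder):
Bash:
# finished tasks are listed in the index; migrations show up as qmigrate
grep qmigrate /var/log/pve/tasks/index
# each task has its own log file under /var/log/pve/tasks/, named after its UPID
# e.g. cat "/var/log/pve/tasks/E/UPID:pve3:...:qmigrate:105:root@pam:"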
 
I'm not sure what you mean by "run them on the new host".
Yes, disks were on local-lvm.

I attempted another deployment of the worker VM for Veeam and it failed again:
1770038124591.png
The new VM appears for a while and then disappears. The disks are left behind for VM 100 (the 105 files are from the failed migration):
1770038222605.png

I attached a zip file that contains a folder with the migration logs.
 

Attachments

With "new host" I meant the place where 105 is now, but maybe I misunderstood. I'm not familiar with Veeam, but maybe the connection to it has issues in some way? That's a lot of logs; it is hard to say which one needs to be looked at, especially because the ctime/mtime is the same on all of them.
Considering that you cannot SSH to the server either, this might just be a general networking issue, similar to this: https://forum.proxmox.com/threads/pve-network-kernel-tg3-issue-intermittent-lost-of-network.89485/
The e1000e driver is known for causing issues too, but you probably don't use that. You can check via ls -l /sys/class/net/*/device/driver.
I'd likely connect via keyboard and monitor, or IPMI if it works, and follow journalctl -f while triggering the issue. Maybe something of interest gets logged.
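Roughly like this, from a console session that survives the network dropping (the interface name in the ethtool line is just an example):
Bash:
# which kernel driver each NIC is bound to (tg3 / e1000e are the usual suspects)
ls -l /sys/class/net/*/device/driver
# driver and firmware details for one NIC, e.g. eno1
ethtool -i eno1
# follow the journal live while starting the migration / Veeam worker deployment
journalctl -f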
 
I am able to connect via SSH - it's only after the migration fails that I am unable to use SSH.

Here is the output from the device driver command - it doesn't look like a driver issue:
1770051707974.png

In one of the log files, I see this line that might be the issue:
1770051788954.png

Any idea how I can diagnose "broken pipe"?
 

Attachments