qm remote-migrate hung up on fstrim

robm

Member
Jan 9, 2020
Latest Proxmox 7.4; I was moving a VM from one cluster to another using the "qm remote-migrate" command. The first VM moved fine, the second one got hung up:

Code:
2023-11-09 09:29:28 stopping NBD storage migration server on target.
tunnel: -> sending command "nbdstop" to remote
tunnel: <- got reply
tunnel: -> sending command "resume" to remote
tunnel: <- got reply
2023-11-09 09:29:28 issuing guest fstrim
tunnel: -> sending command "fstrim" to remote
2023-11-09 09:39:28 ERROR: no reply to command '{"cmd":"fstrim"}': reading from tunnel failed: got timeout
2023-11-09 09:39:28 ERROR: migration finished with problems (duration 00:31:49)

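Worth noting: the timestamps show the tunnel giving up exactly ten minutes after the fstrim command was sent, which suggests a fixed timeout firing rather than the trim itself finishing slowly. A quick sanity check on the log timestamps (plain GNU date arithmetic, nothing Proxmox-specific):

```shell
# Timestamps copied from the migration log above; GNU date assumed.
start='2023-11-09 09:29:28'   # "issuing guest fstrim"
end='2023-11-09 09:39:28'     # "got timeout"
echo "$(( $(date -d "$end" +%s) - $(date -d "$start" +%s) ))s"
# -> 600s, i.e. the tunnel gave up after exactly 10 minutes
```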
Any known issues that might cause this? I just unlocked the VM on the new cluster and it seemed fine.
 
Hello

Can I see the entire journal please? You can create it with: journalctl --since '2023-11-07' > $(hostname)-journal.txt

Regards
Philipp
 
Old:
Code:
Nov 09 09:07:39 swtrading-pve qm[1298464]: <root@pam> starting task UPID:swtrading-pve:0013D041:00755213:654CF5BB:qmigrate:123:root@pam:
Nov 09 09:39:28 swtrading-pve qm[1298497]: migration problems
Nov 09 09:39:28 swtrading-pve qm[1298504]: migration aborted
Nov 09 09:39:28 swtrading-pve qm[1298464]: <root@pam> end task UPID:swtrading-pve:0013D041:00755213:654CF5BB:qmigrate:123:root@pam: migration problems
Nov 09 09:41:51 swtrading-pve pvedaemon[2937]: VM 123 qmp command failed - VM 123 qmp command 'guest-ping' failed - got timeout

New:
Code:
Nov 09 09:06:35 swtrading2-pve qmeventd[1365]: read: Connection reset by peer
Nov 09 09:06:35 swtrading2-pve pvestatd[1979]: VM 117 qmp command failed - VM 117 not running
Nov 09 09:06:36 swtrading2-pve qmeventd[668145]: Starting cleanup for 117
Nov 09 09:06:36 swtrading2-pve qmeventd[668145]: Finished cleanup for 117
Nov 09 09:36:38 swtrading2-pve pvedaemon[2056]: VM 117 qmp command failed - VM 117 qmp command 'guest-ping' failed - got timeout
Nov 09 09:39:29 swtrading2-pve pvedaemon[670466]: VM 117 qmp command failed - VM 117 qmp command 'guest-fstrim' failed - got timeout
Nov 09 09:39:29 swtrading2-pve pvedaemon[670466]: fstrim failed: VM 117 qmp command 'guest-fstrim' failed - got timeout

We did have an issue with the public network not working, so we had to change the MAC address for the VM manually for it to work. Maybe the public network (eth0/vmbr0) not working caused the timeout?
 
Could I see the entire task log?

task log UPID:swtrading-pve:0013D041:00755213:654CF5BB:qmigrate:123:root@pam:
 
Also, just did another test with a test VM, and the same sort of thing happened. It stopped at the fstrim, but this time it locked up the VM at 100% CPU (1 core), timed out after 10 minutes, and the VM was unresponsive on console. Had to stop/start it, at which point it was running on the new cluster.
 

Attachments

  • task-swtrading-pve-qmigrate-2023-11-09T15_07_39Z.log
    89.1 KB
Hi,
Also, just did another test with a test VM, and the same sort of thing happened. It stopped at the fstrim, but this time it locked up the VM at 100% CPU (1 core), timed out after 10 minutes, and the VM was unresponsive on console. Had to stop/start it, at which point it was running on the new cluster.
how long does a manually issued fstrim in the test VM take? What kind of storage is containerz?

EDIT: Please also share the VM configuration. What kind of physical CPUs do you have on source and target? You might want to try upgrading to kernel 6.2 to see if the issue is gone with that: https://forum.proxmox.com/threads/opt-in-linux-6-2-kernel-for-proxmox-ve-7-x-available.124189
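In the meantime, once the VM is unlocked, you could also check by hand whether the guest agent in the migrated VM still responds and re-issue the trim manually; a rough sketch, assuming VMID 123 from your task log, run on the target node:

Code:
qm unlock 123            # clear the migration lock if it is still set
qm agent 123 ping        # does the guest agent answer at all?
qm agent 123 fstrim      # re-issue the trim and watch how long it takes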
 
Last edited:
Hi,

how long does a manually issued fstrim in the test VM take? What kind of storage is containerz?

EDIT: Please also share the VM configuration. What kind of physical CPUs do you have on source and target? You might want to try upgrading to kernel 6.2 to see if the issue is gone with that: https://forum.proxmox.com/threads/opt-in-linux-6-2-kernel-for-proxmox-ve-7-x-available.124189
A manually issued fstrim takes a few seconds at most (SSD). Storage is ZFS. Each cluster has standalone VMs; both clusters are running the latest Proxmox 7.4 with all packages updated. Same CPU in both servers (E3-1230).
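Since both ends are ZFS, it may also be worth confirming that the VM's disks actually have discard enabled in the config; this is just a guess, but a trim against a disk without discard=on can behave differently than a manual fstrim inside the guest. A sketch, again assuming VMID 123:

Code:
qm config 123 | grep -E '^(scsi|virtio|sata|ide)[0-9]+:'   # look for discard=on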
 
