Live migrate of LXC continues to fail

voarsh

I've tried this multiple times.
It always fails at around 40% with no indication of why.
How can I find out why this happens?

drive-scsi0: transferred: 136469020672 bytes remaining: 198581747712 bytes total: 335050768384 bytes progression: 40.73 % busy: 1 ready: 0
drive-scsi0: Cancelling block job
drive-scsi0: Done.
2020-12-06 16:47:48 ERROR: online migrate failure - mirroring error: drive-scsi0: mirroring has been cancelled
2020-12-06 16:47:48 aborting phase 2 - cleanup resources
2020-12-06 16:47:48 migrate_cancel
2020-12-06 16:47:57 ERROR: migration finished with problems (duration 00:48:17)
TASK ERROR: migration problems

Target VM create:
WARNING: You have not turned on protection against thin pools running out of space.
WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
Logical volume "vm-106-disk-0" created.
WARNING: Sum of all thin volume sizes (368.00 GiB) exceeds the size of thin pool pve/data and the size of whole volume group (<232.33 GiB).
migration listens on unix:/run/qemu-server/106.migrate
storage migration listens on nbd:unix:/run/qemu-server/106_nbd.migrate:exportname=drive-scsi0 volume:local-lvm:vm-106-disk-0,format=raw,size=310G
TASK OK
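
The two logs above contain enough numbers for a quick sanity check. The sketch below (the helper name is illustrative, not part of any Proxmox tool) converts the byte counts from the mirror progress line to GiB and compares the disk size with the volume group size from the LVM warning. It only does arithmetic on the logged values; it does not establish why the block job was cancelled.

```python
GIB = 1024 ** 3  # bytes per GiB


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / GIB


# Values copied from the migration log above.
transferred = 136469020672
total = 335050768384
# LVM reports the volume group as "<232.33 GiB", i.e. slightly below this value.
vg_size_gib = 232.33

progress = 100 * transferred / total

print(f"disk size:   {to_gib(total):.2f} GiB")
print(f"transferred: {to_gib(transferred):.2f} GiB ({progress:.2f} %)")
print(f"fits in VG:  {to_gib(total) <= vg_size_gib}")
```

If the disk turns out to be larger than the target's volume group, the thin pool could not hold a fully written copy, so free space on the target (for example with lvs) is worth checking alongside the logs.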
 
I even tried a different migration (restart):




Task viewer: CT 100 - Migrate



2020-12-06 16:53:17 shutdown CT 100
2020-12-06 16:53:22 starting migration of CT 100 to node 'pvedell' (192.168.100.2)
2020-12-06 16:53:22 found local volume 'local-lvm:vm-100-disk-0' (in current VM config)
2020-12-06 16:55:56 # /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pvedell' root@192.168.100.2 pvesr set-state 100 \''{}'\'
Logical volume "vm-100-disk-0" successfully removed
2020-12-06 16:55:59 start final cleanup
2020-12-06 16:56:00 start container on target node
2020-12-06 16:56:00 # /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pvedell' root@192.168.100.2 pct start 100
2020-12-06 16:56:01 unable to open file '/var/lib/lxc/100/rules.seccomp.tmp.3974' - No such file or directory
2020-12-06 16:56:01 ERROR: command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pvedell' root@192.168.100.2 pct start 100' failed: exit code 255
2020-12-06 16:56:01 ERROR: migration finished with problems (duration 00:02:44)
TASK ERROR: migration problems


TASK ERROR: unable to open file '/var/lib/lxc/100/rules.seccomp.tmp.3974' - No such file or directory

--

I also get the same issue with an offline migration.
No migration (VM or LXC) seems to actually work.


--
Digging a bit further, I see that /var/lib/lxc/100/ doesn't contain anything on the migration target.
After I created these files manually, the container started on the new target.
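
To check for the same condition on a node before starting a container, something like the following could be used. It is only a sketch: describe_ct_dir is a hypothetical helper, and the /var/lib/lxc base path is taken from the error message above.

```python
from pathlib import Path


def describe_ct_dir(ctid: int, base: str = "/var/lib/lxc") -> str:
    """Return 'missing', 'empty', or 'populated' for a container's directory."""
    ct_dir = Path(base) / str(ctid)
    if not ct_dir.is_dir():
        return "missing"
    # any() stops at the first entry, so large directories are not fully listed.
    if not any(ct_dir.iterdir()):
        return "empty"
    return "populated"


if __name__ == "__main__":
    # CT 100 is the container from the failed migration above.
    print(describe_ct_dir(100))
```

Running it on both the source and the target node shows whether the directory was ever created on the target.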

I still don't know what to do about the VM that fails to migrate.
 
Do the logs (syslog/journal) or dmesg say anything (on both nodes)?
What's your pveversion -v?
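
One way to gather what is being asked for here is to look at the journal in a window around the failure timestamp from the task log. The sketch below only builds the command lines; diagnostic_commands and the 10-minute window are assumptions chosen for illustration, and the commands would need to be run as root on both nodes.

```python
import shlex
from datetime import datetime, timedelta


def diagnostic_commands(failed_at: str, window_minutes: int = 10) -> list:
    """Build the commands to run on each node around a failure time.

    failed_at uses the task log's timestamp format, e.g. "2020-12-06 16:47:48".
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    t = datetime.strptime(failed_at, fmt)
    since = (t - timedelta(minutes=window_minutes)).strftime(fmt)
    until = (t + timedelta(minutes=window_minutes)).strftime(fmt)
    return [
        # Journal entries in a window around the failure.
        ["journalctl", "--since", since, "--until", until, "--no-pager"],
        # Kernel messages with human-readable timestamps.
        ["dmesg", "-T"],
        # Installed Proxmox package versions.
        ["pveversion", "-v"],
    ]


# Timestamp of the "online migrate failure" line in the first log.
for cmd in diagnostic_commands("2020-12-06 16:47:48"):
    print(shlex.join(cmd))
```

The printed lines can be pasted into a shell on each node, and the output attached to the thread.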
 
