[SOLVED] LXC migration fails on ZFS

grobs

Active Member
Hi,

I'm trying to cold-migrate an LXC container from one host to another.
The container I want to migrate is located on a "local-zfs" pool that was created on the first node and is enabled on every cluster node.
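
For reference, this is how I double-check the storage definitions across the cluster (just a sanity check, output omitted):
Code:
# /etc/pve/storage.cfg holds the cluster-wide storage definitions
cat /etc/pve/storage.cfg
# show which storages are active on this node
pvesm status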

Here is the migration log:
Code:
2018-08-31 10:12:54 starting migration of CT 162 to node 'proxmox5-staging-02' (192.168.10.51)
2018-08-31 10:12:54 found local volume 'local-zfs-pm502:subvol-162-disk-1' (via storage)
2018-08-31 10:12:54 found local volume 'local-zfs:subvol-162-disk-1' (in current VM config)
full send of rpool/data/subvol-162-disk-1@__migration__ estimated size is 1.21G
total estimated size is 1.21G
TIME SENT SNAPSHOT
10:12:55 96.3M rpool/data/subvol-162-disk-1@__migration__
10:12:56 207M rpool/data/subvol-162-disk-1@__migration__
10:12:57 319M rpool/data/subvol-162-disk-1@__migration__
10:12:58 430M rpool/data/subvol-162-disk-1@__migration__
10:12:59 541M rpool/data/subvol-162-disk-1@__migration__
10:13:00 653M rpool/data/subvol-162-disk-1@__migration__
10:13:01 761M rpool/data/subvol-162-disk-1@__migration__
10:13:02 872M rpool/data/subvol-162-disk-1@__migration__
10:13:03 982M rpool/data/subvol-162-disk-1@__migration__
10:13:04 1.07G rpool/data/subvol-162-disk-1@__migration__
10:13:05 1.15G rpool/data/subvol-162-disk-1@__migration__
10:13:06 1.24G rpool/data/subvol-162-disk-1@__migration__
full send of rpool/data/subvol-162-disk-1@__migration__ estimated size is 1.21G
total estimated size is 1.21G
TIME SENT SNAPSHOT
rpool/data/subvol-162-disk-1 name rpool/data/subvol-162-disk-1 -
volume 'rpool/data/subvol-162-disk-1' already exists
command 'zfs send -Rpv -- rpool/data/subvol-162-disk-1@__migration__' failed: got signal 13
send/receive failed, cleaning up snapshot(s)..
2018-08-31 10:13:07 ERROR: command 'set -o pipefail && pvesm export local-zfs:subvol-162-disk-1 zfs - -with-snapshots 0 -snapshot __migration__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=proxmox5-staging-02' root@192.168.10.51 -- pvesm import local-zfs:subvol-162-disk-1 zfs - -with-snapshots 0 -delete-snapshot __migration__' failed: exit code 255
2018-08-31 10:13:07 aborting phase 1 - cleanup resources
2018-08-31 10:13:07 ERROR: found stale volume copy 'local-zfs-pm502:subvol-162-disk-1' on node 'proxmox5-staging-02'
2018-08-31 10:13:07 ERROR: found stale volume copy 'local-zfs:subvol-162-disk-1' on node 'proxmox5-staging-02'
2018-08-31 10:13:07 start final cleanup
2018-08-31 10:13:07 ERROR: migration aborted (duration 00:00:13): command 'set -o pipefail && pvesm export local-zfs:subvol-162-disk-1 zfs - -with-snapshots 0 -snapshot __migration__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=proxmox5-staging-02' root@192.168.10.51 -- pvesm import local-zfs:subvol-162-disk-1 zfs - -with-snapshots 0 -delete-snapshot __migration__' failed: exit code 255
TASK ERROR: migration aborted

It seems that the migration finds the container's disk through two different storages, which should not be the case (it only exists in local-zfs).
Code:
2018-08-31 10:12:54 found local volume 'local-zfs-pm502:subvol-162-disk-1' (via storage)
2018-08-31 10:12:54 found local volume 'local-zfs:subvol-162-disk-1' (in current VM config)

Do you have any clue?

Regards
 

Attachments

  • Capture d'écran de 2018-08-31 10-23-05.png (140.3 KB)
Hi,

you have two storages which point to the same dataset; this does not work.
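
You can see this in /etc/pve/storage.cfg. Based on the names in your log, the relevant entries probably look roughly like this (only an illustration, the exact options may differ):
Code:
zfspool: local-zfs
        pool rpool/data
        content images,rootdir

zfspool: local-zfs-pm502
        pool rpool/data
        content images,rootdir
Both entries point to the same dataset (rpool/data), so the migration picks up the same subvolume twice and the second send fails on the target with 'volume ... already exists'.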
 
Yes, you're right, I figured that out too.
I removed the unnecessary storages; now I only have local and local-zfs on both nodes.

Now, when I try to migrate, I get this error:
Code:
2018-08-31 11:30:22 starting migration of CT 162 to node 'proxmox5-staging-02' (192.168.10.51)
2018-08-31 11:30:23 found local volume 'local-zfs:subvol-162-disk-1' (in current VM config)
full send of rpool/data/subvol-162-disk-1@__migration__ estimated size is 1.22G
total estimated size is 1.22G
TIME SENT SNAPSHOT
rpool/data/subvol-162-disk-1 name rpool/data/subvol-162-disk-1 -
volume 'rpool/data/subvol-162-disk-1' already exists
command 'zfs send -Rpv -- rpool/data/subvol-162-disk-1@__migration__' failed: got signal 13
send/receive failed, cleaning up snapshot(s)..
2018-08-31 11:30:23 ERROR: command 'set -o pipefail && pvesm export local-zfs:subvol-162-disk-1 zfs - -with-snapshots 0 -snapshot __migration__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=proxmox5-staging-02' root@192.168.10.51 -- pvesm import local-zfs:subvol-162-disk-1 zfs - -with-snapshots 0 -delete-snapshot __migration__' failed: exit code 255
2018-08-31 11:30:23 aborting phase 1 - cleanup resources
2018-08-31 11:30:23 ERROR: found stale volume copy 'local-zfs:subvol-162-disk-1' on node 'proxmox5-staging-02'
2018-08-31 11:30:23 start final cleanup
2018-08-31 11:30:23 ERROR: migration aborted (duration 00:00:01): command 'set -o pipefail && pvesm export local-zfs:subvol-162-disk-1 zfs - -with-snapshots 0 -snapshot __migration__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=proxmox5-staging-02' root@192.168.10.51 -- pvesm import local-zfs:subvol-162-disk-1 zfs - -with-snapshots 0 -delete-snapshot __migration__' failed: exit code 255
TASK ERROR: migration aborted
 
If you try again, it should now be cleaned up and you can migrate.
If not, you must remove the image on the target side manually.
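
For example (check the exact volume name on the target node with 'pvesm list local-zfs' first):
Code:
# on the target node: remove the stale copy through the storage layer
pvesm free local-zfs:subvol-162-disk-1
# or remove it directly at the ZFS level
zfs destroy rpool/data/subvol-162-disk-1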
 
pvesm on the target node was indeed still showing the disk image:
Code:
root@proxmox5-staging-02:~# pvesm list local-zfs
local-zfs:subvol-162-disk-1 subvol 11811160064 162
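
The same stale subvolume can also be checked at the ZFS level (output omitted):
Code:
# list the datasets under rpool/data on the target node
zfs list -r rpool/data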

So I deleted the stale image:
Code:
root@proxmox5-staging-02:~# pvesm free local-zfs:subvol-162-disk-1

After that, the migration task succeeded.
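
To double-check that everything landed on the target node:
Code:
# on the target node: the container should now be listed here
pct list
# and its config should reference the migrated volume on local-zfs
pct config 162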

Thank you very much for your help, wolfgang. The support and the community are one of the reasons I love Proxmox!

Keep going!