Shared mount points: migration and replication

Figo

Member
Oct 27, 2018
Hi!

On a freshly installed cluster (Proxmox 5.2), I can't migrate a container between nodes when a mount point is configured.

I have a two-node cluster (so HA is not available to me; I can't wait for my company to let me buy a 3rd server!).
So I use replication + migration to switch LXC containers between the 2 nodes.

Here is the mount point configuration:
Code:
mp0: /var/bigfiles,mp=/var/bigfiles,shared=1,replicate=0
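For readers unfamiliar with the syntax, here is a minimal Python sketch of how such a line can be split into its volume and options. The `parse_mountpoint` helper is hypothetical, and the rule it uses to tell mount point types apart (a path under `/dev/` is a device, any other absolute path is a bind mount, anything else is a storage-backed volume) is my assumption about how PVE classifies them, not something taken from the PVE source.

```python
def parse_mountpoint(line):
    """Split a container config line such as 'mp0: <volume>,key=value,...'.

    Hypothetical helper for illustration; the classification below is an
    assumption about how PVE distinguishes mount point types.
    """
    key, _, value = line.partition(":")
    parts = [p.strip() for p in value.strip().split(",")]
    volume = parts[0]
    options = dict(p.split("=", 1) for p in parts[1:])

    if volume.startswith("/dev/"):
        kind = "device"
    elif volume.startswith("/"):
        kind = "bind"      # a plain host directory, not a configured storage
    else:
        kind = "volume"    # e.g. 'local-zfs:subvol-100-disk-1'

    return {"key": key.strip(), "volume": volume, "kind": kind, "options": options}


entry = parse_mountpoint("mp0: /var/bigfiles,mp=/var/bigfiles,shared=1,replicate=0")
print(entry["kind"], entry["options"])
```

Under that assumption, `mp0` comes out as a bind mount with `shared=1` and `replicate=0`, which matches the 'bind' wording in the migration log.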
Inside the container, I see these mount points:
Code:
rpool/data/subvol-100-disk-1 on / type zfs (rw,noatime,xattr,posixacl)
rpool/ROOT/pve-1 on /var/bigfiles type zfs (rw,relatime,xattr,noacl)

When I try to migrate between the 2 nodes, I get this:
Code:
2018-10-27 13:02:01 shutdown CT 100
2018-10-27 13:02:09 starting migration of CT 100 to node 'sheldon' (xx.xx.xx.xx)
2018-10-27 13:02:09 ignoring shared 'bind' mount point 'mp0' ('/var/bigfiles')
2018-10-27 13:02:09 found local volume 'local-zfs:subvol-100-disk-1' (in current VM config)
2018-10-27 13:02:09 start replication job
2018-10-27 13:02:09 end replication job with error: unable to replicate mountpoint type 'bind'
2018-10-27 13:02:09 ERROR: unable to replicate mountpoint type 'bind'
2018-10-27 13:02:09 aborting phase 1 - cleanup resources
2018-10-27 13:02:09 start final cleanup
2018-10-27 13:02:09 start container on source node
2018-10-27 13:02:10 ERROR: migration aborted (duration 00:00:09): unable to replicate mountpoint type 'bind'
TASK ERROR: migration aborted

I've read the documentation and forum threads: the "shared=1" flag was exactly what I expected to need... But it's not working...

NB: The source folder exists on both nodes, with the same content (rsync'ed).

Any ideas?

Thanks.
 
Your mount point is a local directory. PVE has no idea what it is or how to migrate it. Such things only work if you use a configured storage.
 
I'm surprised. The log says "ignoring shared 'bind' mount point 'mp0'", and that is exactly what I expected.
So the mount point is ignored, but the migration still fails?

Since then, I ran some tests.
With another LXC container, same config and same mount point (I copy/pasted the conf file), it works!

Here is the log:
Code:
2018-10-28 09:04:13 shutdown CT 500
2018-10-28 09:04:14 starting migration of CT 500 to node 'raj' (xx.xx.xx.xx)
2018-10-28 09:04:14 ignoring shared 'bind' mount point 'mp0' ('/var/bigfiles')
2018-10-28 09:04:14 found local volume 'local-zfs:subvol-500-disk-1' (in current VM config)
full send of rpool/data/subvol-500-disk-1@__migration__ estimated size is 638M
total estimated size is 638M
TIME        SENT   SNAPSHOT
09:04:15   95.8M   rpool/data/subvol-500-disk-1@__migration__
09:04:16    204M   rpool/data/subvol-500-disk-1@__migration__
09:04:17    314M   rpool/data/subvol-500-disk-1@__migration__
09:04:18    426M   rpool/data/subvol-500-disk-1@__migration__
09:04:19    535M   rpool/data/subvol-500-disk-1@__migration__
09:04:20    645M   rpool/data/subvol-500-disk-1@__migration__
2018-10-28 09:04:21 # /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=raj' xx.xx.xx.xx pvesr set-state 500 \''{}'\'
2018-10-28 09:04:21 start final cleanup
2018-10-28 09:04:22 start container on target node
2018-10-28 09:04:22 # /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=raj' xx.xx.xx.xx pct start 500
2018-10-28 09:04:23 migration finished successfully (duration 00:00:10)
TASK OK

So, I'm wondering why this is working for one container and not the other one...
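To narrow it down, here is a small Python sketch that pulls the replication-related lines out of both logs. The log excerpts are copied from the two tasks above; the `replication_lines` helper is just something I wrote for illustration. Reading the result as "CT 100 has a replication job configured and CT 500 does not" is only my guess from the logs, not something I have confirmed.

```python
import re

# Matches the 'YYYY-MM-DD HH:MM:SS ' prefix used by the migration task log.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ")


def replication_lines(log_text):
    """Return the log messages mentioning 'replication', without timestamps."""
    return [
        TIMESTAMP.sub("", line)
        for line in log_text.splitlines()
        if "replication" in line
    ]


# Excerpt from the failed migration of CT 100.
failed_log = """\
2018-10-27 13:02:09 ignoring shared 'bind' mount point 'mp0' ('/var/bigfiles')
2018-10-27 13:02:09 found local volume 'local-zfs:subvol-100-disk-1' (in current VM config)
2018-10-27 13:02:09 start replication job
2018-10-27 13:02:09 end replication job with error: unable to replicate mountpoint type 'bind'
"""

# Excerpt from the successful migration of CT 500.
ok_log = """\
2018-10-28 09:04:14 ignoring shared 'bind' mount point 'mp0' ('/var/bigfiles')
2018-10-28 09:04:14 found local volume 'local-zfs:subvol-500-disk-1' (in current VM config)
full send of rpool/data/subvol-500-disk-1@__migration__ estimated size is 638M
"""

print("CT 100:", replication_lines(failed_log))
print("CT 500:", replication_lines(ok_log))
```

Both migrations ignore the shared bind mount point, but only the CT 100 log contains a "start replication job" step, and that step is the one that fails. The CT 500 log goes straight to a full ZFS send instead.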

Thanks!