Quick migration using ZFS

docent

Hi,

I tried to implement a quick migration using ZFS snapshot/send/receive. It works, but not as fast as I would like: migrating a VM with 1 GB of RAM takes about a minute, and almost all of that time is spent saving, copying, and restoring the VM state.
Here is an example run; the number after each timestamp is the seconds that command took:
Code:
2015-03-18 10:01:10  0  qm suspend 100
2015-03-18 10:01:10  8  qm snapshot 100 s5 -vmstate
2015-03-18 10:01:18  1  zfs send -i rpool/vm-100-disk-1@s4 rpool/vm-100-disk-1@s5 | ssh vmc1-1 "zfs recv rpool/vm-100-disk-1"
2015-03-18 10:01:19  0  zfs snapshot rpool/vm-100-state-s5@s
2015-03-18 10:01:19 31  zfs send rpool/vm-100-state-s5@s | ssh vmc1-1 "zfs recv rpool/vm-100-state-s5"
2015-03-18 10:01:50  0  zfs destroy rpool/vm-100-state-s5@s
2015-03-18 10:01:50  0  ssh vmc1-1 "zfs destroy rpool/vm-100-state-s5@s"
2015-03-18 10:01:50  1  qm stop 100
2015-03-18 10:01:51  0  mv /etc/pve/nodes/vmc1-2/qemu-server/100.conf /etc/pve/nodes/vmc1-1/qemu-server/
2015-03-18 10:01:51 12  ssh vmc1-1 "qm rollback 100 s5"
2015-03-18 10:02:03 = 53 sec
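Most of those 53 seconds go to the 31-second full send of the 1.25G state dataset; unlike the disk, the state dataset has no earlier snapshot on the target, so it can never be sent incrementally. One untested idea: compress the stream in flight, e.g. with lz4, or use a cheaper ssh cipher if the link is CPU-bound:
Code:
# Untested: lz4 compresses/decompresses fast enough not to bottleneck a GbE link
zfs send rpool/vm-100-state-s5@s | lz4 -c | ssh vmc1-1 "lz4 -dc | zfs recv rpool/vm-100-state-s5"
# Or, if your OpenSSH build supports it, a faster cipher:
zfs send rpool/vm-100-state-s5@s | ssh -c aes128-gcm@openssh.com vmc1-1 "zfs recv rpool/vm-100-state-s5"
Whether this helps depends on where the time actually goes (the qm snapshot memory dump vs. the network vs. ssh encryption), so it is worth timing each stage separately.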
Code:
# zfs list -t all
NAME                     USED  AVAIL  REFER  MOUNTPOINT
rpool                   54.4G   219G   144K  /rpool
rpool/ROOT              6.76G   219G   144K  /rpool/ROOT
rpool/ROOT/pve-1        6.76G   219G  6.76G  /
rpool/swap              37.1G   256G    76K  -
rpool/vm-100-disk-1     4.39G   219G  3.82G  -
rpool/vm-100-disk-1@s1   295M      -  3.49G  -
rpool/vm-100-disk-1@s2   235M      -  3.68G  -
rpool/vm-100-disk-1@s3  1.26M      -  3.82G  -
rpool/vm-100-disk-1@s4   708K      -  3.82G  -
rpool/vm-100-disk-1@s5      0      -  3.82G  -
rpool/vm-100-state-s1   1001M   219G  1001M  -
rpool/vm-100-state-s2   1.22G   219G  1.22G  -
rpool/vm-100-state-s3   1.25G   219G  1.25G  -
rpool/vm-100-state-s4   1.25G   219G  1.25G  -
rpool/vm-100-state-s5   1.25G   219G  1.25G  -
Question: Is it possible to implement similar replication as part of the live migration process, along the following lines (a rough command sketch follows the list)?
1. Create a snapshot of the running VM's disk and replicate it to the other cluster node
2. Start the migration: the memory of the running VM is copied to the other node
3. Switch the VM to the "suspended" state
4. Snapshot the disk again and copy the (now small) delta to the other node
5. Move the VM configuration to the other node
6. Resume the VM on the new node
7. Remove the unneeded snapshots on both nodes
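For what it's worth, here is a rough sketch of steps 1 and 3-7 using the same commands as above (untested; snapshot names base/pre/final are illustrative, and a common @base snapshot is assumed to already exist on both nodes). Step 2, the live RAM pre-copy, happens inside QEMU during a real live migration and has no equivalent among these commands, so this sketch only shortens the suspend window on the disk side:
Code:
# Step 1: replicate the disk while the VM keeps running
zfs snapshot rpool/vm-100-disk-1@pre
zfs send -i rpool/vm-100-disk-1@base rpool/vm-100-disk-1@pre | ssh vmc1-1 "zfs recv rpool/vm-100-disk-1"

# Steps 3-4: suspend, capture state, send only the small remaining disk delta
qm suspend 100
qm snapshot 100 final -vmstate
zfs send -i rpool/vm-100-disk-1@pre rpool/vm-100-disk-1@final | ssh vmc1-1 "zfs recv rpool/vm-100-disk-1"
zfs snapshot rpool/vm-100-state-final@x
zfs send rpool/vm-100-state-final@x | ssh vmc1-1 "zfs recv rpool/vm-100-state-final"
zfs destroy rpool/vm-100-state-final@x
ssh vmc1-1 "zfs destroy rpool/vm-100-state-final@x"

# Steps 5-6: stop the source copy, move the config, resume on the target
qm stop 100
mv /etc/pve/nodes/vmc1-2/qemu-server/100.conf /etc/pve/nodes/vmc1-1/qemu-server/
ssh vmc1-1 "qm rollback 100 final"

# Step 7: drop the intermediate snapshots on both nodes
zfs destroy rpool/vm-100-disk-1@pre
ssh vmc1-1 "zfs destroy rpool/vm-100-disk-1@pre"
Note that the state dataset is still sent in full here; eliminating that full send is exactly what step 2's live RAM copy would buy.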
 
Hi,
theoretically yes.
 
Did you eventually manage to speed up the process? I am keen to try something similar with ZFS.