Hello,
I would like to know: does Proxmox support storage migration, including live migration? For example:
- Ceph <--> Ceph
- Ceph <--> local LVM
- local LVM <---> Local LVM
Thanks!
Hi, I haven't checked all the migration types you asked for, but in general it works flawlessly and online, though only for KVM-based VMs.
I did an online SAN switch from an old to a new system without any downtime.
Hi, VM live migration + storage migration at the same time is available, but only via the command line (and only with local storage as the source):
qm migrate <vmid> <target> --with-local-disks --targetstorage yourtargetstorageonremotehost
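For example (the VM ID, target node, and target storage name here are just placeholders for illustration, adapt them to your setup):
qm migrate 101 pve02 --online --with-local-disks --targetstorage local-lvm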
Hi,
but sparse files aren't sparse anymore after migration...
Udo
Hi,
I assume that, even with a different target storage, online migration will only work reliably with one disk?
Or has the "multiple disk migration" bug been resolved?
Udo
With live migration + storage migration, yes (because of the NBD protocol; maybe it will be fixed in QEMU 3.0).
For classic storage migration, it depends on the storage (Ceph, for example, isn't sparse after migration).
As a workaround, the latest Proxmox update has a new guest agent feature (agent: 1,fstrim_cloned_disks=1) to run fstrim through the QEMU agent after the migration (if you have virtio-scsi + discard).
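If it helps, a minimal sketch of enabling that from the CLI (assuming your qemu-server version already ships the fstrim_cloned_disks option and the guest agent is installed in the VM; <vmid> is a placeholder):
# enable the guest agent plus the post-clone/post-migration fstrim
qm set <vmid> --agent 1,fstrim_cloned_disks=1
The disk itself also needs discard support (e.g. virtio-scsi with the Discard option enabled), otherwise the trim cannot release space on the backing storage.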
If you run fstrim, there is no need for discard. fstrim does not rely on, nor does it use, discard; discard is only used in combination with a filesystem delete.
Hi Spirit, it's working with multiple disks, but without iothreads (I have tested it on more than 100 VM migrations).
I need to test it again with QEMU 3.0 for iothreads (it should be available soon in Proxmox).
root@pve01:~# cat /etc/pve/qemu-server/210.conf
boot: cd
bootdisk: scsi0
cores: 2
cpu: kvm64,flags=+pcid
hotplug: 1
lock: migrate
memory: 2048
name: vdb02
net0: virtio=6A:C6:68:EA:F8:8F,bridge=vmbr0,tag=2
numa: 0
onboot: 1
ostype: l26
scsi0: local-zfs:vm-210-disk-1,format=raw,size=25G
scsi1: local-zfs:vm-210-disk-2,format=raw,size=41G
scsihw: virtio-scsi-pci
serial0: socket
sockets: 1
root@pve01:~# qm migrate 210 pve02 --online --with-local-disks
2018-09-04 11:37:01 starting migration of VM 210 to node 'pve02' (10.x.x.12)
2018-09-04 11:37:01 found local disk 'local-zfs:vm-210-disk-1' (in current VM config)
2018-09-04 11:37:01 found local disk 'local-zfs:vm-210-disk-2' (in current VM config)
2018-09-04 11:37:01 copying disk images
2018-09-04 11:37:01 starting VM 210 on remote node 'pve02'
2018-09-04 11:37:05 start remote tunnel
2018-09-04 11:37:06 ssh tunnel ver 1
2018-09-04 11:37:06 starting storage migration
2018-09-04 11:37:06 scsi1: start migration to nbd:10.x.x.12:60001:exportname=drive-scsi1
drive mirror is starting for drive-scsi1
drive-scsi1: transferred: 0 bytes remaining: 44023414784 bytes total: 44023414784 bytes progression: 0.00 % busy: 1 ready: 0
drive-scsi1: transferred: 111149056 bytes remaining: 43912265728 bytes total: 44023414784 bytes progression: 0.25 % busy: 1 ready: 0
drive-scsi1: transferred: 228589568 bytes remaining: 43794825216 bytes total: 44023414784 bytes progression: 0.52 % busy: 1 ready: 0
drive-scsi1: transferred: 348127232 bytes remaining: 43675287552 bytes total: 44023414784 bytes progression: 0.79 % busy: 1 ready: 0
...
drive-scsi1: transferred: 43708841984 bytes remaining: 314572800 bytes total: 44023414784 bytes progression: 99.29 % busy: 1 ready: 0
drive-scsi1: transferred: 43825233920 bytes remaining: 198180864 bytes total: 44023414784 bytes progression: 99.55 % busy: 1 ready: 0
drive-scsi1: transferred: 43942674432 bytes remaining: 80740352 bytes total: 44023414784 bytes progression: 99.82 % busy: 1 ready: 0
drive-scsi1: transferred: 44023414784 bytes remaining: 0 bytes total: 44023414784 bytes progression: 100.00 % busy: 0 ready: 1
all mirroring jobs are ready
2018-09-04 11:45:35 scsi0: start migration to nbd:10.x.x.12:60001:exportname=drive-scsi0
drive mirror is starting for drive-scsi0
drive-scsi0: transferred: 0 bytes remaining: 26843545600 bytes total: 26843545600 bytes progression: 0.00 % busy: 1 ready: 0
drive-scsi1: transferred: 44023414784 bytes remaining: 0 bytes total: 44023414784 bytes progression: 100.00 % busy: 0 ready: 1
drive-scsi0: transferred: 61865984 bytes remaining: 26781679616 bytes total: 26843545600 bytes progression: 0.23 % busy: 1 ready: 0
drive-scsi1: transferred: 44023414784 bytes remaining: 0 bytes total: 44023414784 bytes progression: 100.00 % busy: 0 ready: 1
drive-scsi0: transferred: 166723584 bytes remaining: 26676822016 bytes total: 26843545600 bytes progression: 0.62 % busy: 1 ready: 0
drive-scsi1: transferred: 44023414784 bytes remaining: 0 bytes total: 44023414784 bytes progression: 100.00 % busy: 0 ready: 1
...
drive-scsi0: transferred: 26734493696 bytes remaining: 111017984 bytes total: 26845511680 bytes progression: 99.59 % busy: 1 ready: 0
drive-scsi1: transferred: 44023414784 bytes remaining: 0 bytes total: 44023414784 bytes progression: 100.00 % busy: 0 ready: 1
drive-scsi0: transferred: 26845380608 bytes remaining: 131072 bytes total: 26845511680 bytes progression: 100.00 % busy: 1 ready: 0
drive-scsi1: transferred: 44023414784 bytes remaining: 0 bytes total: 44023414784 bytes progression: 100.00 % busy: 0 ready: 1
drive-scsi0: transferred: 26845642752 bytes remaining: 0 bytes total: 26845642752 bytes progression: 100.00 % busy: 0 ready: 1
drive-scsi1: transferred: 44023414784 bytes remaining: 0 bytes total: 44023414784 bytes progression: 100.00 % busy: 0 ready: 1
all mirroring jobs are ready
2018-09-04 11:50:33 starting online/live migration on tcp:10.x.x.12:60000
2018-09-04 11:50:33 migrate_set_speed: 8589934592
2018-09-04 11:50:33 migrate_set_downtime: 0.1
2018-09-04 11:50:33 set migration_caps
2018-09-04 11:50:33 set cachesize: 268435456
2018-09-04 11:50:33 start migrate command to tcp:10.x.x.12:60000
2018-09-04 11:50:34 migration status: active (transferred 96142201, remaining 2062778368), total 2165121024)
2018-09-04 11:50:34 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2018-09-04 11:50:35 migration status: active (transferred 166236744, remaining 1989947392), total 2165121024)
2018-09-04 11:50:35 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2018-09-04 11:50:36 migration status: active (transferred 249148312, remaining 1900494848), total 2165121024)
2018-09-04 11:50:36 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2018-09-04 11:50:37 migration status: active (transferred 342631354, remaining 1802543104), total 2165121024)
2018-09-04 11:50:37 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2018-09-04 11:50:38 migration status: active (transferred 423618924, remaining 1701597184), total 2165121024)
2018-09-04 11:50:38 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2018-09-04 11:50:39 migration status: active (transferred 514885572, remaining 1596645376), total 2165121024)
2018-09-04 11:50:39 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2018-09-04 11:50:40 migration status: active (transferred 609658356, remaining 1491333120), total 2165121024)
2018-09-04 11:50:40 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
...
2018-09-04 11:50:54 migration status: active (transferred 1963964198, remaining 18128896), total 2165121024)
2018-09-04 11:50:54 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2018-09-04 11:50:54 migration speed: 2.47 MB/s - downtime 111 ms
2018-09-04 11:50:54 migration status: completed
drive-scsi0: transferred: 26845904896 bytes remaining: 0 bytes total: 26845904896 bytes progression: 100.00 % busy: 0 ready: 1
drive-scsi1: transferred: 44023414784 bytes remaining: 0 bytes total: 44023414784 bytes progression: 100.00 % busy: 0 ready: 1
all mirroring jobs are ready
drive-scsi0: Completing block job...
drive-scsi0: Completed successfully.
drive-scsi1: Completing block job...
drive-scsi1: Completed successfully.
drive-scsi0: Cancelling block job
drive-scsi1: Cancelling block job
drive-scsi0: Cancelling block job
drive-scsi1: Cancelling block job
2018-09-04 11:51:03 ERROR: command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve02' root@10.x.x.12 pvesm free local-zfs:vm-210-disk-2' failed: exit code 1
2018-09-04 11:51:10 ERROR: command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve02' root@10.x.x.12 pvesm free local-zfs:vm-210-disk-1' failed: exit code 1
2018-09-04 11:51:10 ERROR: Failed to completed storage migration
2018-09-04 11:51:10 ERROR: migration finished with problems (duration 00:14:10)
migration problems
root@pve01:~#
pveversion -v
proxmox-ve: 5.2-2 (running kernel: 4.15.18-2-pve)
pve-manager: 5.2-7 (running version: 5.2-7/8d88e66a)
pve-kernel-4.15: 5.2-5
pve-kernel-4.15.18-2-pve: 4.15.18-20
pve-kernel-4.15.17-1-pve: 4.15.17-9
ceph: 12.2.7-pve1
corosync: 2.4.2-pve5
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-38
libpve-guest-common-perl: 2.0-17
libpve-http-server-perl: 2.0-10
libpve-storage-perl: 5.0-24
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 3.0.2+pve1-1
lxcfs: 3.0.0-1
novnc-pve: 1.0.0-2
openvswitch-switch: 2.7.0-3
proxmox-widget-toolkit: 1.0-19
pve-cluster: 5.0-29
pve-container: 2.0-25
pve-docs: 5.2-8
pve-firewall: 3.0-13
pve-firmware: 2.0-5
pve-ha-manager: 2.0-5
pve-i18n: 1.0-6
pve-libspice-server1: 0.12.8-3
pve-qemu-kvm: 2.11.2-1
pve-xtermjs: 1.0-5
qemu-server: 5.0-32
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.9-pve1~bpo9
@udo That's strange. The code is in
/usr/share/perl5/PVE/QemuMigrate.pm, phase3_cleanup():
eval { PVE::QemuServer::qemu_drive_mirror_monitor($vmid, undef, $self->{storage_migration_jobs}); };
drive-scsi0: Completing block job...
drive-scsi0: Completed successfully.
drive-scsi1: Completing block job...
drive-scsi1: Completed successfully.
Then, if there is an error:
if (my $err = $@) {
eval { PVE::QemuServer::qemu_blockjobs_cancel($vmid, $self->{storage_migration_jobs}) };
drive-scsi0: Cancelling block job
drive-scsi1: Cancelling block job
drive-scsi0: Cancelling block job
drive-scsi1: Cancelling block job
eval { PVE::QemuMigrate::cleanup_remotedisks($self) };
2018-09-04 11:51:03 ERROR: command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve02' root@10.x.x.12 pvesm free local-zfs:vm-210-disk-2' failed: exit code 1
2018-09-04 11:51:10 ERROR: command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve02' root@10.x.x.12 pvesm free local-zfs:vm-210-disk-1' failed: exit code 1
I don't understand why it ends in an error if the block jobs completed successfully.
Maybe you can try to display the error:
if (my $err = $@) {
$self->log('err', "$err");
eval { PVE::QemuServer::qemu_blockjobs_cancel($vmid, $self->{storage_migration_jobs}) };
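It might also help to run the failing cleanup command from your log by hand on the source node; this is the exact command the migration used, so only the error output would be new:
/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve02' root@10.x.x.12 pvesm free local-zfs:vm-210-disk-2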
root@pve01:~# zfs list -t snapshot
NAME USED AVAIL REFER MOUNTPOINT
rpool/data/vm-210-disk-1@rep_vdb02_2017-11-28_16:15:01 3.62G - 6.89G -
rpool/data/vm-210-disk-2@rep_vdb02_2017-11-28_16:15:01 12.5G - 33.2G -
rpool/data/vm-210-disk-2@__migration__ 0B - 33.2G -
fstrim uses the FITRIM ioctl, and you need discard support on the device (I'm not talking about the discard option in /etc/fstab).
I was referring to the Proxmox 'discard' checkbox, to prevent confusion, not to whether the filesystem supports discard or not. Many users have 'discard' checked while running fstrim on a regular basis, which is foolish.
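For reference, the manual equivalent inside the guest would be something like the following (run in the guest, not on the Proxmox host; it only releases space on the backing storage if the virtual disk has discard support, e.g. virtio-scsi with the Discard checkbox enabled):
# trim all mounted filesystems that support it, and print how much was trimmed
fstrim -av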
@udo Interesting.
These snapshots seem to be related to the ZFS replication feature (have you enabled it on this VM?).
They should only be used in the case of an offline VM migration (without --with-local-disks).
I'm really not sure that QEMU live migration + QEMU storage migration is compatible with the ZFS replication feature.
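If you want to verify, something like this should show whether a replication job is configured for the VM and let you clean up the leftover snapshot (the volume name is taken from your zfs output above; only destroy the snapshot if you are sure no replication or migration job still needs it):
# list the configured storage replication jobs on this node
pvesr list
# remove the leftover migration snapshot (double-check first!)
zfs destroy rpool/data/vm-210-disk-2@__migration__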