When will KVM live / suspend migration on ZFS work?

Please implement this. It would be really nice to have this feature, and it could also be a selling point for Proxmox.
 
"Live" migration without shared storage would be great
Xenserver, in example, support it .....

Hyper-V also supports it:

"You can also perform a live migration of a virtual machine between two non-clustered servers running Hyper-V when you are only using local storage for the virtual machine. (This is sometimes referred to as a “shared nothing” live migration. In this case, the virtual machines storage is mirrored to the destination server over the network, and then the virtual machine is migrated, while it continues to run and provide network services."
https://technet.microsoft.com/en-us/library/hh831435(v=ws.11).aspx

VMWare also supports live migration from local storage:

"VMware vSphere vMotion moves running virtual machines from one ESX/ESXi host to another without stopping the VM's operations."
http://searchservervirtualization.t...s-vMotion-guide-to-VM-live-migration-features

To be clear, I am not asking for full live migration from local storage: a simple, two-phase suspended migration (one that pauses or even restarts the VM) would be more than sufficient for KVM, as it would reduce the downtime to a fraction of what it is now.
 
In Proxmox there is already something similar: QEMU is able to move its qcow2 image between filesystems while the VM is running.

This could also be used for migration: just mount a temporary NFS share between the two servers and start moving the qcow2 image.
Then a simple shutdown on the old node and a power-on on the new one would finish the job.
This should not be too hard to implement.

But since QEMU is already able to live migrate without shared storage, let's implement that instead.
It would be really useful even for users planning a new infrastructure:
they could start with multiple servers but without shared storage and still be able to move VMs between nodes for maintenance. For example, you could upgrade Proxmox with no downtime by manually evacuating each node.

Yes, it is really, really useful and should be implemented. It shouldn't be too hard either: QEMU can already do it natively, we only need the Proxmox side for automation and management.
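For reference, that online qcow2 move is exposed at the QEMU level through the QMP drive-mirror command; here is a minimal sketch of moving a running VM's disk onto a temporary NFS mount (the drive id and paths are assumptions for illustration):

Code:
# on the QEMU monitor (QMP) of the running VM; drive id and target path are assumed
{ "execute": "drive-mirror", "arguments": { "device": "drive-virtio0", "target": "/mnt/tmp-nfs/vm-100-disk-1.qcow2", "format": "qcow2", "sync": "full" } }
# wait for the BLOCK_JOB_READY event, then switch the VM over to the new file
{ "execute": "block-job-complete", "arguments": { "device": "drive-virtio0" } }

After block-job-complete the VM runs from the image on the NFS share, so the old local image can be deleted.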
 
Technically, it's not too difficult to implement if the VM has one disk.


With multiple disks there is no transaction support, so if one disk has already been migrated and we get a crash during the mirroring of the second disk,
you'll need to fix it manually (revert the first disk to the source server, or migrate the second disk to the target server by hand).

See my mail from 2013:
http://pve.proxmox.com/pipermail/pve-devel/2013-January/005655.html

If somebody wants to help, here is the workflow (a rough QMP sketch of phase 2 is appended at the end of the outline):
Code:
phase1
------
target host
-----------
create new volumes if storage != shared
----------------------------------------
add a new qm command ?


phase2
------
1)target host

send qm start to target: we need to start the target vm with the new disk locations.
-----------------------------------------------------------------------------
How to do this ?
Currently the target vm read the vm config file.
Do we need to update the vmconfig file before mirroring ?
or pass drive parameters in qm start command ?

start nbd_server
----------------
nbd_server_start ip:port

add drives to mirror to nbd
---------------------
nbd_server_add drive-virtio0
nbd_server_add drive-virtio1


2) source host

start mirroring of the drives
------------------------------
drive-mirror target = nbd:host:port:exportname=drive-virtioX

when drive-mirror is finished (block-job-complete),
the source vm will continue to access its volume on the remote host through nbd

start vm migration
------------------

end of vm migration
-------------------


phase3
------
1)target host

resume vm
----------

nbd_server_stop
---------------

2) source host

delete source mirrored volumes
-------------------------------
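
# --- rough QMP sketch of phase 2 (hostnames, port and drive ids below are assumptions) ---
# on the target host, with the target vm started paused/in incoming mode on the new empty disks:
{ "execute": "nbd-server-start", "arguments": { "addr": { "type": "inet", "data": { "host": "10.0.0.2", "port": "10809" } } } }
{ "execute": "nbd-server-add", "arguments": { "device": "drive-virtio0", "writable": true } }
{ "execute": "nbd-server-add", "arguments": { "device": "drive-virtio1", "writable": true } }
# on the source host, one mirror job per disk, writing into the pre-created target volumes:
{ "execute": "drive-mirror", "arguments": { "device": "drive-virtio0", "target": "nbd:10.0.0.2:10809:exportname=drive-virtio0", "sync": "full", "mode": "existing", "format": "raw" } }
# wait for the BLOCK_JOB_READY event, then switch the source vm over to the remote (nbd) disk:
{ "execute": "block-job-complete", "arguments": { "device": "drive-virtio0" } }
# finally migrate the running state ("start vm migration" above):
{ "execute": "migrate", "arguments": { "uri": "tcp:10.0.0.2:60000" } }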
 
Hmm, how can a local storage migration start after the node is down?
And who wants to migrate "several hundred gigabyte KVM guests" over a 1 Gigabit network?
Sorry, I don't get it.
 
Obviously you start BEFORE the node goes down, as planned maintenance.

If the VM stays active during the migration, it can migrate for 7 hours with no downtime.

I did this many times before with XenServer, live-migrating 200 GB VMs between hosts with no shared storage.

Migrating 200 GB over a Gigabit connection takes about half an hour (200 GB at roughly 110 MB/s of usable throughput is around 30 minutes).

It's useful.
 
Hmm, how can a local storage migration start after the node is down?

Why would it start when a node is down? No one said it would be automatic, or connected to HA in any way. It would simply be faster than it is now, with much less downtime.

In Proxmox 4.x a KVM guest can only be migrated fully offline from local storage, which is unacceptable and lags behind most of Proxmox's commercial and open source competitors, hence the need for somewhat more advanced migration code.

And who wants to migrate "several hundred gigabyte KVM guests" over a 1 Gigabit network?
Sorry, I don't get it.

People who use local storage for high IO performance, and then decide to move one of those VMs to another node for whatever reason.
 
So it's for maintenance only. For HA you need shared storage, and that is useful for maintenance as well.
Gluster exists. :)
 
Every commercial hypervisor supports this.
QEMU already supports this.
The only missing piece is the support in Proxmox.

Gluster needs dedicated hardware/infrastructure and at least 3 nodes to be safe and reliable.

You can't force small users to bring up a whole shared storage setup on an existing environment just to move a VM between nodes, when this is ALREADY supported natively by KVM. And even if you do bring up shared storage, you still have to move the VM (live) from local storage to the shared storage and then from the shared storage to local storage on the new node.

You can't use only shared storage either, as that would require a dedicated, fully redundant storage network (10 Gb or similar).

No, what you are saying makes no sense for small environments or existing infrastructure.
 
So it's for maintenance only. For HA you need shared storage, and that is useful for maintenance as well.
Gluster exists. :)

This has been talked to death earlier in this thread, so you might want to read first and comment later. Ceph and Gluster are very useful, but only for backups or very low IO guests (at least on our bonded gigabit infrastructure).

Also, this has nothing to do with the fact that block migration code is already included in QEMU/KVM, and, as Alessandro said above, every commercial hypervisor (Hyper-V, VMware, etc.) and even some open source ones (like OpenStack or OpenNebula) support live (or at least two-phase suspended) migration from local storage.
 
I'll try to see if I can put together a basic implementation for next month,
limited to 1 local disk per VM for now. (Is that enough for your needs?)
How do you intend to make it atomic?
I would suggest doing this as a two-step function:
1) Migrate disk
2) Migrate server

That way rollback is a lot easier.
 
How do you intend to make it atomic?
I would suggest doing this as a two-step function:
1) Migrate disk
2) Migrate server

That way rollback is a lot easier.

Yes, exactly.

The process is:

start a target VM with a new disk on the target's local storage
start an nbd server inside the target QEMU (this exposes the disk on a network port)
start drive-mirror from the source VM disk to the nbd target
when the mirror is done, the source VM keeps running but accesses its volume on the remote host through nbd
then start the live migration (so if it crashes, we can simply move the VM config file to the target node)
when done, stop the source VM, delete its disk, and resume the target VM

It's not too difficult, but a lot of checks need to be done (removing snapshots, ...).
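
On the rollback question above: as long as block-job-complete has not been issued, the source disk stays the only active copy, so aborting simply means cancelling the mirror job and throwing away the half-written target volumes. A minimal sketch (drive id assumed):

Code:
# on the source host: abort an in-flight mirror, the VM keeps running on its local disk
{ "execute": "block-job-cancel", "arguments": { "device": "drive-virtio0" } }
# on the target host: stop the NBD export, then delete the partially written volumes and the target VM
{ "execute": "nbd-server-stop" }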
 
As qcow2 supports online resizing, initially supporting just one local disk should be enough (at least for me: if I have to increase the VM disk space, I'll grow the qcow2 instead of adding more disks).
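
For what it's worth, growing that single disk while the VM is running is already a one-liner at the QEMU level; a sketch with an assumed drive id and a target size of 100 GiB:

Code:
# online resize of the qcow2-backed drive to 100 GiB (the size is given in bytes)
{ "execute": "block_resize", "arguments": { "device": "drive-virtio0", "size": 107374182400 } }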

But please add a flag to choose whether or not to delete the old VM. DO NOT delete the old VM automatically. I had a very, very bad experience with XenServer during a live migration: the migration exited with an error while starting up the new VM (I don't know why, but XS is full of bugs) and XS deleted the old virtual machine anyway. Fortunately the new VM had been transferred properly, it just wasn't started on the new host.

So: only delete the source VM automatically if a "delete VM after successful migration" option is set to true (and please make false the default).

Just to stay on the safe side.
 
I don't know whether this is possible to implement, but please ask for a transfer network, like XenServer does. Some servers have multiple networks that could be used for VM transfers; use the one chosen by the user when creating the NBD connection.
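
At the QEMU level this mostly comes down to which address the target's NBD server listens on, so a user-selected transfer network could be honoured by binding the export to that network's IP. A sketch (the address and port are assumptions):

Code:
# on the target host: bind the NBD export to the IP of the chosen transfer network
{ "execute": "nbd-server-start", "arguments": { "addr": { "type": "inet", "data": { "host": "192.168.100.2", "port": "10809" } } } }
# the source's drive-mirror target then becomes nbd:192.168.100.2:10809:exportname=drive-virtioX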
 
