PVE 5.4 live migration issue

ricardoj

Hi,

I'm testing the live migration function in my lab, and it fails whenever the VM is running ("live").

Migration only works if the VM is powered off.

This is a really small VM (Tiny Linux) with a 2 GB disk and 512 MB of RAM.

There is always an error message:

======================================
2019-06-23 22:15:56 starting migration of VM 200 to node 'pve-t02' (192.168.0.209)
2019-06-23 22:15:56 found local disk 'pool01:vm-200-disk-0' (in current VM config)
2019-06-23 22:15:56 can't migrate local disk 'pool01:vm-200-disk-0': can't live migrate attached local disks without with-local-disks option
2019-06-23 22:15:56 ERROR: Failed to sync data - can't migrate VM - check log
2019-06-23 22:15:56 aborting phase 1 - cleanup resources
2019-06-23 22:15:56 ERROR: migration aborted (duration 00:00:00): Failed to sync data - can't migrate VM - check log
TASK ERROR: migration aborted
======================================

Searching the forum, I found some references to this message, "can't live migrate attached local disks without with-local-disks option":

https://forum.proxmox.com/threads/live-migration-not-working-proxmox-5-3.51888/

https://forum.proxmox.com/threads/live-migration-with-local-storage-gives-an-error.37762/

Is this still the same issue?

Regards,

Ricardo Jorge
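
The error message names the missing option directly. A minimal sketch of the command-line form, using the VM ID and node name from the log above. The `run` wrapper and the `DRY_RUN` variable are illustrative assumptions (they print the command instead of executing it by default); the `--online` and `--with-local-disks` flags are the ones `qm migrate` is called with later in this thread:

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Live-migrate a running VM whose disks sit on local storage.
# $1 = VM ID, $2 = target node name.
live_migrate_local() {
    run qm migrate "$1" "$2" --online --with-local-disks
}

live_migrate_local 200 pve-t02
```

Run it with `DRY_RUN=0` on the source node to actually start the migration. This only covers local disks without replication.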
 
Hi,

Well, if the command shown above solves the live migration issue, why is it not included in the GUI?

Another option is to present the user with a dialog box asking whether he or she wants to run the live migration with this option included.

I am confused because live migration is offered in the GUI but does not work as expected.

Maybe remove it from the GUI, or warn the user that it only works in offline mode or with shared storage.

This will help others as well.

Regards,

Ricardo Jorge
 
Hi,

I did some more tests using the command line, and it works well until I replicate the VM to the other nodes.

Now I'm receiving this message:

=======================================
root@pve-t03:~# qm migrate 200 pve-t02 --online --with-local-disks
2019-06-24 14:03:45 starting migration of VM 200 to node 'pve-t02' (192.168.0.209)
2019-06-24 14:03:45 found local disk 'pool01:vm-200-disk-0' (in current VM config)
2019-06-24 14:03:45 copying disk images
2019-06-24 14:03:45 ERROR: Failed to sync data - can't live migrate VM with replicated volumes
2019-06-24 14:03:45 aborting phase 1 - cleanup resources
2019-06-24 14:03:45 ERROR: migration aborted (duration 00:00:00): Failed to sync data - can't live migrate VM with replicated volumes
migration aborted
=======================================

Maybe I'm wrong, but isn't replication the process that eases migration?

I must remove the replication job from the node to be able to migrate the VM again.

Even if I disable the replication job, there is no way to migrate the VM to the node where a replication job exists.

By the way, I'm using the command line to test migration, not the GUI.

Regards,

Ricardo Jorge
 
you need to power off the VM and do an offline migration, or remove the current replication data & job and do a live migration. live-migration needs to copy local disks, and currently cannot re-use the replicated older state.
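
The two options can be sketched roughly as follows. This is an illustration under assumptions, not an official procedure: the `run` wrapper and `DRY_RUN` variable are hypothetical and only print the commands by default, the job ID `200-0` is an example value, and starting the VM via `ssh root@<target>` assumes the target node name resolves and root SSH access works between the nodes. Removal of a replication job may not take effect instantly, so it is worth checking `pvesr list` before retrying the live migration.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Option 1: offline migration (stop, migrate, start).
# $1 = VM ID, $2 = target node name.
offline_migrate() {
    vmid=$1
    target=$2
    run qm shutdown "$vmid"
    run qm migrate "$vmid" "$target"
    # After migration the VM belongs to the target node, so start it there.
    run ssh "root@$target" qm start "$vmid"
}

# Option 2: drop the replication job, then live-migrate with local disks.
# $1 = VM ID, $2 = target node name, $3 = replication job ID (example: 200-0).
live_migrate_without_replication() {
    vmid=$1
    target=$2
    job=$3
    run pvesr delete "$job"
    run qm migrate "$vmid" "$target" --online --with-local-disks
}

offline_migrate 200 pve-t02
live_migrate_without_replication 200 pve-t02 200-0
```

Option 1 keeps the replication job at the cost of downtime; option 2 avoids downtime but the replicated data has to be rebuilt afterwards.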
 
Hi,

Fabian, thank you for your reply, but my point is: if everybody knows what to do, why not put this in the GUI?

It is not intuitive to start a command from the GUI and receive an error message telling you to go to the command line!

Besides that, replication is part of migration.

It is weird to have replication that eases migration and then, when you try to migrate, have to remove it (disabling it is not enough) just to complete the migration.

From my point of view this is a bug, not a procedure that must be documented so that the user can follow it step by step just to migrate a VM from one node to another.

Everything works great if you have shared storage, but with internal storage there are some edges to be polished.

Best regards,

Ricardo Jorge
 
Hi,

Fabian, thank you for your reply, but my point is: if everybody knows what to do, why not put this in the GUI?

It is not intuitive to start a command from the GUI and receive an error message telling you to go to the command line!

new features, especially ones that are a bit complicated/advanced, usually get added to the API/CLI first, and to the GUI later on

Besides that, replication is part of migration.

It is weird to have replication that eases migration and then, when you try to migrate, have to remove it (disabling it is not enough) just to complete the migration.
replication is not part of migration.

From my point of view this is a bug, not a procedure that must be documented so that the user can follow it step by step just to migrate a VM from one node to another.
this is not a bug, but a limitation. it's simply not possible currently to live-migrate and re-use replicated volumes. so either you don't live-migrate (stop, migrate, start), or you don't replicate. this is not a choice PVE can make for you automatically, but something you need to decide yourself.

Everything works great if you have shared storage, but with internal storage there are some edges to be polished.

naturally shared storage migration is simpler - there is nothing to migrate on the storage side ;)
 
there is no such limitation except interface design considerations, i.e., keeping the GUI simple enough to remain usable, and initially keeping more advanced/experimental features to areas where only more advanced users dare to go. there are already patches on pve-devel for making live migration with local disks available on the GUI, but those won't help your use case anyway, since live migration with local, replicated disks is still not possible.
 
Hi,

@Fabian live migration with local disks is possible using the command line. I tested it and it always works.

One just needs to pass the "with-local-disks" parameter.

Please test it and tell me what you get, just to make clear that it is not a problem in my lab environment.

Regards,

Ricardo Jorge
 
you are confusing two issues:
- no local disk live migration on GUI
- no replicated local disk live migration anywhere

one will be fixed soon, the other will take longer as it is more complicated. Please read my full answers instead of asking the same question over and over.
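
The two cases can be told apart from the task log alone. A small sketch that classifies a migration log by the error strings quoted earlier in this thread; the function name and the labels it prints are made up for illustration, and other error texts are not covered:

```shell
#!/bin/sh
# Classify a migration task log read from stdin by its error message.
# The matched strings are copied from the logs posted in this thread.
classify_migration_log() {
    log=$(cat)
    case "$log" in
        *"can't live migrate VM with replicated volumes"*)
            # Local disks with a replication job: not possible online.
            echo "replicated-volumes" ;;
        *"without with-local-disks option"*)
            # Local disks, option not passed (for example from the GUI).
            echo "missing-with-local-disks" ;;
        *"migration aborted"*)
            echo "other-error" ;;
        *)
            echo "no-error-found" ;;
    esac
}

printf '%s\n' "ERROR: Failed to sync data - can't live migrate VM with replicated volumes" \
    | classify_migration_log
```

The first label corresponds to the limitation that needs more work, the second to the one that passing `--with-local-disks` on the command line already solves.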
 
Hi,

@fabian, let's make it clear.

I'm not confusing things. What I'm doing is making clear that there is more than one point to be addressed.

Live migration with a local disk is possible if you pass the correct parameter.

Live migration is possible from the GUI if the GUI passes the correct parameter to the function.

Live migration with a replicated disk may be possible if you delete the target or improve the logic to use the already replicated disk image.

Maybe the last statement is not that simple, but it is possible.

Regards

Ricardo Jorge
 
FYI. Live migration with local disks uses QEMU's tools, which remove all ZFS snapshots. QEMU is not aware of ZFS and just copies the data inside the zvol. ZFS snapshots are used for replication (and by many backup systems outside PM).

@fabian, so you are working on making live migration of replicated VMs on ZVOLs possible? If that is so, it would be the best feature ever. :)
 
FYI. Live migration with local disks uses QEMU's tools, which remove all ZFS snapshots. QEMU is not aware of ZFS and just copies the data inside the zvol. ZFS snapshots are used for replication (and by many backup systems outside PM).

@fabian, so you are working on making live migration of replicated VMs on ZVOLs possible? If that is so, it would be the best feature ever. :)

we have a rough idea how it could work, but no implementation yet. maybe we can take another look once PVE 6.0 is out the door..
 
Hi,

@fabian Thanks for your reply.

IMHO there are two points here:

1) There is no sense in having replication when one is working with local disks (as you cannot use the replicated VM), so maybe you could disable that function when you detect a VM in such a condition. The function is disabled and no one can complain that it is not working!

2) Live migration with a local disk works fine if you pass the correct parameter (and have no replication). The GUI must be fixed to pass the correct parameter and stop displaying the error message: "can't live migrate attached local disks without with-local-disks option"

Maybe we can have a workaround before version 6, or we can test it before version 6.

I understand that Proxmox is a complex system and that not everything a user has in mind can be done, or is easy to do.

Even so, we can talk about it and evaluate possibilities.

Regards,

Ricardo Jorge
 
1) There is no sense in having replication when one is working with local disks (as you cannot use the replicated VM), so maybe you could disable that function when you detect a VM in such a condition. The function is disabled and no one can complain that it is not working!

I think you have a fundamental misunderstanding w.r.t. the purpose of replication. The main goal is to make disaster recovery easy and fast in situations where losing the last X minutes of changes (since the last replication) is less grave than having potentially hours or days of downtime to restore/rebuild. Obviously it serves a different niche than HA with shared, redundant storage. It's also cheaper to implement, which is why lots of people like it.

An added bonus is that replication enables very fast and efficient offline migration, since it is easy to just transfer the delta since the last replication and switch replication direction. Online/Live migration with replicated disks is a lot more complex on a technical level, since we cannot just re-use the replication mechanism like for the offline case - nobody stepped up to actually implement it so far.

There is no replication for non-local disks in PVE, so your suggestion does not make any sense whatsoever.

2) Live migration with a local disk works fine if you pass the correct parameter (and have no replication). The GUI must be fixed to pass the correct parameter and stop displaying the error message: "can't live migrate attached local disks without with-local-disks option"

I gave you the reason for this already multiple times now, including that we are already working on revamping the migrate dialog to include the "live migration with local disks" functionality, so I won't repeat myself. There is nothing I or we must do just because you think so by the way, and such a tone is not likely to help your agenda. This thread is starting to get tiresome, so don't expect further responses from me unless there is something substantial and new to add to the discussion.
 
Hi,

@fabian, thanks again for your reply.

Just think outside the box and you will see that there is no fundamental misunderstanding in my comments.

If there is, what are you going to improve in live migration?

It looks like you are taking this thread personally.

Regards,

Ricardo Jorge
 
It looks like you are taking this thread personally.

Just stop asking again about topics that have already been answered. Please test the Proxmox VE 6.0 beta (available soon) and you will see whether the improvements help in your use case as well.
 
Hi,

Thank you again for your reply.

I'm going to give version 6 a try and get back to you.

By the way, this topic is going to be answered by the new version.

Thanks to you and everyone else for the effort to make the new version a reality.

Regards,

Ricardo Jorge
 
Hi,

@tom

First, I want to say a big thank you for the excellent new Proxmox version 6.

Now, IMHO, we must think about the following:

A) A GUI must not guide a user into an error.

B) A developer must know what is possible and what is not.

C) Currently we know that it is not possible to live-migrate a VM with a local disk and replicated content.

D) So, if a user clicks on migrate for a VM that is up and running on a local disk with replicated content, it would be great to show a message like this:

"Live migration with local disk and replicated content is not possible. If you proceed, the already replicated content will be deleted and a new replication cycle will take place. This will take longer and may impact your network performance. See help for details. Do you want to proceed? Yes or No"

E) This would address all the points I tried to raise before.

Thank you for your time and attention.

Regards,

Ricardo Jorge
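
The kind of pre-check proposed in point D could look roughly like this on the command line. It is only a sketch of the idea, not how the Proxmox GUI works: both function names and the advice strings are invented for illustration, and the job counter assumes that each job line in a listing such as the output of `pvesr list` starts with an ID of the form `<vmid>-<number>`.

```shell
#!/bin/sh
# Count replication jobs for one VM in a job listing read from stdin.
# Assumption: every job line starts with an ID like "200-0".
# $1 = VM ID.
count_jobs_for_vm() {
    grep -c "^$1-[0-9]" || true
}

# Decide what a front end could tell the user before migrating.
# $1 = VM state ("running" or "stopped"), $2 = replication job count.
migration_advice() {
    state=$1
    njobs=$2
    if [ "$state" = "running" ] && [ "$njobs" -gt 0 ]; then
        echo "blocked: live migration is not possible with replicated volumes"
    elif [ "$state" = "running" ]; then
        echo "ok: live migration (local disks need --with-local-disks)"
    else
        echo "ok: offline migration"
    fi
}

# Example with a made-up job listing for VM 200:
njobs=$(printf '200-0 local/pve-t02\n' | count_jobs_for_vm 200)
migration_advice running "$njobs"
```

On a real node the listing would come from `pvesr list` instead of the `printf` line, and the "blocked" branch is where a confirmation dialog like the one quoted above would fit.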
 
