Revert migration failed after hosts rebooted

Manny Vazquez

I have a cluster running 5.3, all ZFS. For some reason (no idea why) one of the nodes rebooted. How can I find out why?

The VMs that were on that node migrated to the nodes assigned in the HA rotation. A few hours later (when I did my daily checkup) I saw some VM errors in the task list:
upload_2019-3-7_8-48-33.png
Upon investigation, the error reads:
upload_2019-3-7_8-49-1.png

Can someone tell me what is not configured properly to allow the auto-revert?
Replication is working exactly as expected; in fact, after the migration it automatically adjusted to replicate back to the node the VM came from and synced properly.

upload_2019-3-7_8-50-22.png
The VMs do not have any CD attached that could be missing on the other host:
upload_2019-3-7_8-51-14.png
I have now changed the HA settings so it stops trying to migrate, but I do not see anything else to configure there:
upload_2019-3-7_8-52-12.png

What am I missing to allow this to happen?

Now I have 1 host that is completely empty and would like to load balance.
 
By the way, I did try qm migrate 3382 pve3 --with-local-disks --online and still got exactly the same errors.

So I emailed the users on that VM for a 15-minute window to test. I shut the VM down, ran qm migrate 302 vm3 --with-local-disks --online, and it worked: it migrated in just a few seconds. Of course it takes time to shut down and bring back up, so that is not the ideal solution, since one expects to migrate live. Or am I wrong?
 
You do have HA and replication enabled? Then this is expected behavior (for now); see the notes about HA and replication in the documentation: https://pve.proxmox.com/wiki/Storage_Replication

I see the notes (here for the benefit of other readers):
upload_2019-3-7_10-3-53.png

How do I fix it? Or is shutting down the only way to move the VM back to the original host?

My VMs do not risk data loss, since the data is NOT on the VMs. I could replicate once a week and still be fine; the only reason I do it more frequently is for the user profiles (these are just RDP servers and I really do not like roaming profiles, since they take extra time to launch the desktop).
 
or is shutting down the only way to move the vm back to original host?
Since live migration with replication does not work and is not supported (some pieces are missing in various parts of the stack), moving them back online is not possible. Some workarounds that should work:

* move them back offline
* stop replication for that VM, live migrate ('--with-local-disks'), and re-enable replication; this should work, but you copy the data twice over the network
* HA relocate; this does the same as if the node the VM runs on went offline, so a hard poweroff and restart, but it may be faster than shutting down and doing an offline migrate
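
For other readers, here is a rough sketch of what the second and third workarounds could look like on the command line. It only prints the commands instead of running them, so you can review them first. The function names are made up for illustration, and the VM ID, target node, and replication job ID are placeholders you have to fill in yourself (pvesr list shows the job IDs). I am assuming that "stop replication" can be done with pvesr disable; if the migration still refuses to start with a disabled job, you may have to remove the job and re-create it afterwards.

```shell
#!/bin/sh
# Sketch only: these helpers PRINT the commands, they do not execute them.
# Function names are hypothetical; all arguments are placeholders.

# Workaround 2: pause replication, live migrate, resume replication.
# Assumption: disabling the job is enough to "stop replication".
plan_live_migrate() {
    vmid="$1"    # numeric VM ID
    target="$2"  # name of the node to migrate to
    job="$3"     # replication job ID as shown by "pvesr list"
    echo "pvesr disable $job"
    echo "qm migrate $vmid $target --online --with-local-disks"
    echo "pvesr enable $job"
}

# Workaround 3: let HA relocate the VM (the VM is stopped and restarted).
plan_ha_relocate() {
    vmid="$1"
    target="$2"
    echo "ha-manager relocate vm:$vmid $target"
}
```

Calling plan_live_migrate with your VM ID, target node, and job ID prints the three commands in the order they would be run; plan_ha_relocate prints the single ha-manager command. Remember that the live-migrate variant copies the disk over the network and the re-enabled replication then has to sync again.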
 
