PVE Online Migration Fails After Update to PVE 5.4-3

HE_Cole

Hello Everyone!

My PVE cluster has been running great for many months, but I just got around to updating all the PVE nodes to the latest version, PVE 5.4-3.

I started by live migrating all my VMs to another node, and that worked great.

I then set node-out on the first node, live migrated all the VMs off it, set each OSD on that node to out, and waited for backfill.

Once backfill was complete, I updated all the nodes and rebooted the node that was empty.

The node came back online fine, quorum is perfect, and the cluster is healthy.

BUT now when I try to live migrate the VMs back to the updated node, I receive an error and live migration fails.

Here is the error:

Code:
2019-04-12 16:21:20 starting migration of VM 104 to node 'he-s08-r01-pve02' (23.136.0-hidden)
2019-04-12 16:21:21 copying disk images
2019-04-12 16:21:21 starting VM 104 on remote node 'he-s08-r01-pve02'
2019-04-12 16:21:22 error with cfs lock 'storage-VM-STOR2-PVE02': rbd create vm-104-cloudinit' error: rbd: create error: (17) File exists
2019-04-12 16:21:22 ERROR: online migrate failure - command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=he-s08-r01-pve02' root@23.136.0.11 qm start 104 --skiplock --migratedfrom he-s07-r01-pve02 --migration_type secure --stateuri unix --machine pc-i440fx-2.12' failed: exit code 255
2019-04-12 16:21:22 aborting phase 2 - cleanup resources
2019-04-12 16:21:22 migrate_cancel
2019-04-12 16:21:23 ERROR: migration finished with problems (duration 00:00:03)
TASK ERROR: migration problems
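
For anyone comparing their own task logs against this one, the lock name, image name, and error code can be pulled out of the failing line with a few lines of Python. This is only a sketch that assumes the exact log wording shown above; `parse_rbd_create_error` is a made-up helper name, not something shipped with PVE.

```python
import re

# Assumed line shape (copied from the task log above):
#   error with cfs lock '<lock>': rbd create <image>' error: rbd: create error: (<errno>) <message>
_PATTERN = re.compile(
    r"error with cfs lock '(?P<lock>[^']+)': "
    r"rbd create (?P<image>[^\s']+)'? error: "
    r"rbd: create error: \((?P<errno>\d+)\) (?P<message>[A-Za-z ]+)"
)


def parse_rbd_create_error(line):
    """Return lock, image, errno and message from one log line, or None if it does not match."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    return {
        "lock": match.group("lock"),
        "image": match.group("image"),
        "errno": int(match.group("errno")),
        "message": match.group("message").strip(),
    }


if __name__ == "__main__":
    sample = (
        "2019-04-12 16:21:22 error with cfs lock 'storage-VM-STOR2-PVE02': "
        "rbd create vm-104-cloudinit' error: rbd: create error: (17) File exists"
    )
    print(parse_rbd_create_error(sample))
```

Lines that are not an rbd create failure return `None`, so the function can be run over a whole task log to pick out only the affected images.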

I have tried live migration with other VMs on the other nodes, and every one fails with the same error.

Any ideas on how to correct this?


I am running the latest PVE version on ALL nodes. Two of the three have NOT been rebooted since the update.

Code:
# pveversion --verbose
proxmox-ve: 5.4-1 (running kernel: 4.15.18-12-pve)
pve-manager: 5.4-3 (running version: 5.4-3/0a6eaa62)
pve-kernel-4.15: 5.3-3
pve-kernel-4.15.18-12-pve: 4.15.18-35
pve-kernel-4.15.18-10-pve: 4.15.18-32
pve-kernel-4.15.18-9-pve: 4.15.18-30
pve-kernel-4.15.17-1-pve: 4.15.17-9
ceph: 12.2.11-pve1
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
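
To double-check that all nodes really carry the same package versions, the `pveversion --verbose` output of each node can be compared mechanically. A rough sketch, assuming the usual one-package-per-line output; `parse_pveversion` and `differing_packages` are hypothetical helper names, and the node names passed in are whatever you choose.

```python
import re

# One package per line, e.g. "pve-manager: 5.4-3 (running version: 5.4-3/0a6eaa62)".
# The optional parenthesised note is matched but not kept.
_LINE = re.compile(r"^(?P<name>[\w.+-]+): (?P<version>\S+)(?: \([^)]*\))?$")


def parse_pveversion(text):
    """Map package name to version string; lines that do not match are skipped."""
    packages = {}
    for raw in text.splitlines():
        match = _LINE.match(raw.strip())
        if match:
            packages[match.group("name")] = match.group("version")
    return packages


def differing_packages(per_node):
    """Given {node: {package: version}}, return only the packages whose versions differ."""
    names = set()
    for packages in per_node.values():
        names.update(packages)
    diff = {}
    for name in sorted(names):
        versions = {node: packages.get(name) for node, packages in per_node.items()}
        if len(set(versions.values())) > 1:
            diff[name] = versions
    return diff
```

Feed it the saved output from each node, for example `differing_packages({"node-a": parse_pveversion(a), "node-b": parse_pveversion(b)})`. An empty result means the installed versions match. It does not tell you whether a node has been rebooted into the new kernel.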
 
Hi, new information on this error.

I found that if I remove the cloud-init drive from the VM in question, I can then live migrate it to any node.

BUT if I re-add the cloud-init drive to the VM after migration, the VM will NOT start and gives this error:

Code:
rbd: create error: (17) File exists2019-04-12 17:36:31.736199 7f696d44f0c0 -1 librbd: rbd image vm-104-cloudinit already exists
TASK ERROR: error with cfs lock 'storage-VM-STOR2-PVE02': rbd create vm-104-cloudinit' error: rbd: create error: (17) File exists2019-04-12 17:36:31.736199 7f696d44f0c0 -1 librbd: rbd image vm-104-cloudinit already exists

This error occurs on any VM and any node in the cluster: when you delete the cloud-init drive, migrate the VM, and then re-add the cloud-init drive, the VM won't start and gives the same error as above.

Seems to be related to the cloud-init drive.

As a note, both the VM's drive and the cloud-init drive are stored on Ceph shared storage.
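
Since librbd reports that `vm-104-cloudinit` already exists, one thing worth checking is which cloud-init images are actually present in the pool. The sketch below only filters an image listing that you supply as text (for example, saved output of `rbd ls` for the pool behind the storage). The function name and the sample image names other than `vm-104-cloudinit` are made up for illustration, and nothing in this thread confirms that removing such an image is the right fix.

```python
import re

# Image naming as seen in the error above: vm-<vmid>-cloudinit
_CLOUDINIT = re.compile(r"^vm-(\d+)-cloudinit$")


def cloudinit_images(rbd_ls_output, vmid=None):
    """Return cloud-init image names from an image listing, optionally for one VM ID only."""
    found = []
    for raw in rbd_ls_output.splitlines():
        name = raw.strip()
        match = _CLOUDINIT.match(name)
        if match is None:
            continue
        if vmid is None or int(match.group(1)) == int(vmid):
            found.append(name)
    return found


if __name__ == "__main__":
    # Hypothetical listing; only vm-104-cloudinit comes from the log above.
    listing = "vm-101-disk-0\nvm-104-cloudinit\nvm-104-disk-0\n"
    print(cloudinit_images(listing, vmid=104))
```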

This error has never happened before. I have done plenty of live migrations, but after the update to PVE 5.4-3 I can no longer migrate my VMs; they error out.

I hope this new info helps.
 
Live migration under Proxmox really is hit-and-miss. Never run a live migration after an update before testing it in a lab environment. Stuff like this happens way too often since PVE 5. Don't get me wrong, I love Proxmox and I love the work they're doing, but live migration has become kinda meh since v5, and it seems like having a subscription doesn't help either.

Hope this will be fixed soon!

Regards
 
