VM live migration fails when Cloud-Init image is attached

uno

New Member
Jun 5, 2018
Hi guys,

I set up a 3-node experimental cluster with LVM on iSCSI (FreeNAS) as storage.

Live migration fails when the CloudInit drive is attached, but it works fine without it. Is this normal?

Live migration with the CloudInit drive attached gives the error below, and /run/pve/cloudinit/103/ doesn't exist:
Code:
root@proxmox3:~# qm migrate 103 proxmox2 --online
2018-06-12 17:26:45 starting migration of VM 103 to node 'proxmox2' (10.246.40.102)
2018-06-12 17:26:45 copying disk images
2018-06-12 17:26:45 starting VM 103 on remote node 'proxmox2'
2018-06-12 17:26:46 command 'set -o pipefail && genisoimage -R -V cidata /run/pve/cloudinit/103/ | qemu-img dd -n -f raw -O raw 'isize=0' 'osize=0' 'of=/dev/security/vm-103-cloudinit'' failed: exit code 1
2018-06-12 17:26:46 ERROR: online migrate failure - command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=proxmox2' root@10.246.40.102 qm start 103 --skiplock --migratedfrom proxmox3 --migration_type secure --stateuri unix --machine pc-i440fx-2.11' failed: exit code 255
2018-06-12 17:26:46 aborting phase 2 - cleanup resources
2018-06-12 17:26:46 migrate_cancel
2018-06-12 17:26:47 ERROR: migration finished with problems (duration 00:00:03)
migration problems

Running the SSH command manually produced the following output; it seems it can't open the CloudInit LVM volume on the target node.
Code:
root@proxmox3:~# /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=proxmox2' root@10.246.40.102 qm start 103 --skiplock --migratedfrom proxmox3 --migration_type secure --stateuri unix --machine pc-i440fx-2.11
qemu-img: Could not open '/dev/security/vm-103-cloudinit': Could not open '/dev/security/vm-103-cloudinit': No such file or directory
Total translation table size: 0
Total rockridge attributes bytes: 417
Total directory bytes: 0
Path table size(bytes): 10
qemu-img: Could not open '/dev/security/vm-103-cloudinit': Could not open '/dev/security/vm-103-cloudinit': No such file or directory
command 'set -o pipefail && genisoimage -R -V cidata /run/pve/cloudinit/103/ | qemu-img dd -n -f raw -O raw 'isize=0' 'osize=0' 'of=/dev/security/vm-103-cloudinit'' failed: exit code 1
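
One way to narrow this down might be to check on the target node whether the cloud-init LV actually exists there and is active (a rough sketch, using the 'security' volume group from the error above):
Code:
# on the target node (proxmox2): is the cloud-init LV present in the 'security' VG?
root@proxmox2:~# lvs security
# if it is listed but inactive, activating it should make /dev/security/vm-103-cloudinit appear
root@proxmox2:~# lvchange -ay security/vm-103-cloudinit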

Live migration without the CloudInit drive is ok.
Code:
root@proxmox3:~# qm migrate 103 proxmox2 --online
2018-06-12 17:44:42 starting migration of VM 103 to node 'proxmox2' (10.246.40.102)
2018-06-12 17:44:47 copying disk images
2018-06-12 17:44:47 starting VM 103 on remote node 'proxmox2'
2018-06-12 17:44:50 start remote tunnel
2018-06-12 17:44:51 ssh tunnel ver 1
2018-06-12 17:44:51 starting online/live migration on unix:/run/qemu-server/103.migrate
2018-06-12 17:44:51 migrate_set_speed: 8589934592
2018-06-12 17:44:51 migrate_set_downtime: 0.1
2018-06-12 17:44:51 set migration_caps
2018-06-12 17:44:51 set cachesize: 536870912
2018-06-12 17:44:51 start migrate command to unix:/run/qemu-server/103.migrate
2018-06-12 17:44:52 migration status: active (transferred 127045208, remaining 2157322240), total 4295761920)
2018-06-12 17:44:52 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
2018-06-12 17:44:53 migration status: active (transferred 297025970, remaining 101130240), total 4295761920)
2018-06-12 17:44:53 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
2018-06-12 17:44:54 migration speed: 1365.33 MB/s - downtime 111 ms
2018-06-12 17:44:54 migration status: completed
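
As a workaround, detaching the CloudInit drive before migrating and re-adding it afterwards might work. A rough sketch (it assumes the drive sits on ide2 and that the storage is named 'security' like the volume group; adjust both to the real config):
Code:
# detach the CloudInit drive (ide2 is an assumption - use whatever slot actually holds it)
root@proxmox3:~# qm set 103 --delete ide2
root@proxmox3:~# qm migrate 103 proxmox2 --online
# re-create the CloudInit drive on the target node ('security' storage name is assumed)
root@proxmox2:~# qm set 103 --ide2 security:cloudinit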
 
I'm having a similar issue. In my case it is local storage, not shared. But the result is the same, although with less detail:

Code:
# qm migrate 223 pm2 --online --with-local-disks
2018-08-04 11:06:33 starting migration of VM 223 to node 'pm2' (192.168.1.230)
2018-08-04 11:06:33 found local disk 'local-lvm:vm-223-cloudinit' (in current VM config)
2018-08-04 11:06:33 found local disk 'local-lvm:vm-223-disk-1' (in current VM config)
2018-08-04 11:06:33 can't migrate local disk 'local-lvm:vm-223-cloudinit': local cdrom image
2018-08-04 11:06:33 ERROR: Failed to sync data - can't migrate VM - check log
2018-08-04 11:06:33 aborting phase 1 - cleanup resources
2018-08-04 11:06:33 ERROR: migration aborted (duration 00:00:01): Failed to sync data - can't migrate VM - check log
migration aborted

Offline migration fails identically.

Where is this log it speaks of?

I am assuming this is directly related to Uno's issue, as online migration works fine without the CD-ROM image.

Indeed, I think the issue is that it is a "CD-ROM". Trying to migrate a VM with an OS install ISO still attached (from when it was created) has always failed, so in a way it makes sense that something similar happens with a cloud-init CD-ROM image. I would imagine that some additional code is required to handle this situation, rather than this being a bug?
 
Further to this, I tried creating the cloud-init image on interface scsi1 rather than ide2, and it was still treated as a CD-ROM image. Obviously this is due to the way cloud-init generates its images; I thought it might be a function of the interface, but I was incorrect.
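
The VM config seems to confirm this: whichever bus the drive sits on, the cloud-init volume is registered with media=cdrom, which is exactly what the migration code refuses to move. A rough check (the exact config line may look slightly different):
Code:
# qm config 223 | grep cloudinit
scsi1: local-lvm:vm-223-cloudinit,media=cdrom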
 
I'm facing the same issue with offline migration. Any solutions yet? I'm using PVE 5.2 and local-lvm.
 
Hi Saumya Kanta Swain,

Proxmox VE is a rolling distribution; there are no fixes for old versions.
You have to update your node.
 
Offline migration should work with qemu-server 5.0-50 or higher. It's not yet available in pve-no-subscription but will be soon enough.
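
To see what a node is currently running, and to pull the newer package once it reaches the repository, something along these lines should do (standard PVE 5.x package names assumed):
Code:
pveversion -v | grep qemu-server    # shows the installed qemu-server version
apt update && apt full-upgrade      # upgrades once the fix is in the configured repository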
 
Please update to the latest version.
How exactly do you offline migrate? CLI or GUI?
 
