I found a bug in the latest PVE 5.0-32. I had created a VM with its disk on local (SSD) storage, then migrated the disk to the Ceph datastore. The configuration reflects the change, but when I try a LIVE migration it still thinks there is a reference to the old SSD disk.
What I noticed on the failing VM is that the original disk image was left behind on local storage. Even though the cluster configuration shows the disk pointing at ceph-rbd, the VM would not migrate until I removed the local disk image, even with the VM powered off. (Deleting the file from data_ssd:100/*.qcow2 resolved the issue.) Seems like a bug.
2017-09-23 17:48:10 starting migration of VM 100 to node 'pve02' (10.241.147.32)
2017-09-23 17:48:10 found local disk 'data_ssd:100/vm-100-disk-2.qcow2' (via storage)
2017-09-23 17:48:10 copying disk images
cannot import format raw+size into a file of format qcow2
send/receive failed, cleaning up snapshot(s)..
2017-09-23 17:48:10 ERROR: Failed to sync data - command 'set -o pipefail && pvesm export data_ssd:100/vm-100-disk-2.qcow2 raw+size - -with-snapshots 0 | /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=pve02' root@10.241.147.32 -- pvesm import data_ssd:100/vm-100-disk-2.qcow2 raw+size - -with-snapshots 0' failed: exit code 255
2017-09-23 17:48:10 aborting phase 1 - cleanup resources
2017-09-23 17:48:11 ERROR: found stale volume copy 'data_ssd:100/vm-100-disk-2.qcow2' on node 'pve02'
2017-09-23 17:48:11 ERROR: migration aborted (duration 00:00:02): Failed to sync data - command 'set -o pipefail && pvesm export data_ssd:100/vm-100-disk-2.qcow2 raw+size - -with-snapshots 0 | /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=pve02' root@10.241.147.32 -- pvesm import data_ssd:100/vm-100-disk-2.qcow2 raw+size - -with-snapshots 0' failed: exit code 255
TASK ERROR: migration aborted
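For anyone hitting the same thing, here is a rough sketch of how I checked for and removed the stale local copy. The VM ID (100), storage name (data_ssd), and volume name are from my setup, so adjust them for yours:

```shell
# Confirm which storage the VM config actually references (should show ceph-rbd)
qm config 100 | grep -E '^(scsi|virtio|ide|sata)'

# List leftover images for this VM still sitting on the local SSD storage
pvesm list data_ssd --vmid 100

# Remove the stale local copy that blocks the migration
pvesm free data_ssd:100/vm-100-disk-2.qcow2
```

After the stale qcow2 was freed, the live migration went through normally.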