Hi
We have some larger virtual servers running on a ZFS mirror of NVMe drives. They run various applications, but mostly Microsoft Windows Server with Microsoft SQL Server on top.
In terms of resources, they have 500-900 GB of disk space on a single virtual hard drive and 10-60 GB of RAM.
We are now in the process of migrating these to run on Ceph, but we are running into some issues.
When we shut down a VM and attempt an offline migration, we get an error saying it cannot migrate from zfspool to rbd:
Code:
2021-11-14 11:04:49 starting migration of VM 314 to node 'ns7771' (172.16.0.16)
2021-11-14 11:04:49 found local disk 'local-zfs:vm-314-disk-0' (in current VM config)
2021-11-14 11:04:50 copying local disk images
2021-11-14 11:04:50 using a bandwidth limit of 104857600 bps for transferring 'local-zfs:vm-314-disk-0'
2021-11-14 11:04:50 ERROR: storage migration for 'local-zfs:vm-314-disk-0' to storage 'tier1' failed - cannot migrate from storage type 'zfspool' to 'rbd'
2021-11-14 11:04:50 aborting phase 1 - cleanup resources
2021-11-14 11:04:50 ERROR: migration aborted (duration 00:00:01): storage migration for 'local-zfs:vm-314-disk-0' to storage 'tier1' failed - cannot migrate from storage type 'zfspool' to 'rbd'
TASK ERROR: migration aborted
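One workaround we are considering is to move the disk onto Ceph first with `qm move-disk` while the VM is stopped, so the subsequent migration has no local disk left to copy. A minimal sketch, assuming Proxmox VE 7's `qm move-disk` syntax, that the disk is scsi0, and using the storage name 'tier1' and target node from our logs:

```shell
#!/bin/bash
# Sketch only: move a stopped VM's disk from local ZFS to the Ceph RBD
# storage 'tier1', then migrate the VM, which now has no local disks.
# 'scsi0', 'tier1' and the node name are assumptions taken from our setup.
migrate_via_move_disk() {
  local vmid="$1" target_node="$2"
  # Copy the disk to Ceph; --delete drops the old zvol after a successful copy.
  qm move-disk "$vmid" scsi0 tier1 --delete
  # With all disks on shared storage, a plain offline migration should work.
  qm migrate "$vmid" "$target_node"
}
# Usage (on the source node, VM stopped): migrate_via_move_disk 314 ns7771
```

Would that be a supported path?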
Then we tried live/online migration, which works for the small virtual servers but fails for the larger ones.
Here are the last few lines of the log from an 8 vCore, 40 GB RAM, 800 GB virtual hard drive system whose live migration failed just before completion:
Code:
2021-11-14 14:47:17 migration active, transferred 42.8 GiB of 40.0 GiB VM-state, 152.4 MiB/s
2021-11-14 14:47:17 xbzrle: send updates to 21567 pages in 4.5 MiB encoded memory, cache-miss 92.97%, overflow 262
query migrate failed: VM 314 qmp command 'query-migrate' failed - client closed connection
2021-11-14 14:47:18 query migrate failed: VM 314 qmp command 'query-migrate' failed - client closed connection
query migrate failed: VM 314 not running
2021-11-14 14:47:20 query migrate failed: VM 314 not running
query migrate failed: VM 314 not running
2021-11-14 14:47:21 query migrate failed: VM 314 not running
query migrate failed: VM 314 not running
2021-11-14 14:47:22 query migrate failed: VM 314 not running
query migrate failed: VM 314 not running
2021-11-14 14:47:23 query migrate failed: VM 314 not running
query migrate failed: VM 314 not running
2021-11-14 14:47:24 query migrate failed: VM 314 not running
2021-11-14 14:47:24 ERROR: online migrate failure - too many query migrate failures - aborting
2021-11-14 14:47:24 aborting phase 2 - cleanup resources
2021-11-14 14:47:24 migrate_cancel
2021-11-14 14:47:24 migrate_cancel error: VM 314 not running
drive-scsi0: Cancelling block job
2021-11-14 14:47:24 ERROR: VM 314 not running
2021-11-14 14:52:40 ERROR: migration finished with problems (duration 03:47:17)
TASK ERROR: migration problems
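Since the transfer apparently keeps re-sending dirtied memory (42.8 GiB transferred for 40.0 GiB of VM state), we also wondered whether loosening the per-VM migration limits would help it converge. A hedged sketch using the `migrate_downtime` and `migrate_speed` VM options; the values below are guesses for our workload, not recommendations:

```shell
#!/bin/bash
# Sketch: loosen per-VM migration limits before retrying a live migration.
# The numeric values are guesses, not recommendations.
tune_migration_limits() {
  local vmid="$1"
  # Allow a longer final cutover pause (seconds) so the last dirty pages converge.
  qm set "$vmid" --migrate_downtime 4
  # Lift the memory-migration bandwidth cap (MB/s); 0 means unlimited.
  qm set "$vmid" --migrate_speed 0
}
# Usage: tune_migration_limits 314
```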
Before migrating, we verified that there was enough space on the Ceph cluster and that the target node had enough free RAM to hold the VM.
Is there a way to migrate a VM of this size with an offline migration when no zfspool storage is available on the target node?
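If neither route is supported, would copying the raw zvol into RBD by hand be viable? A rough sketch, assuming the source node has Ceph client access, that the VM is stopped, and that qemu-img can create the target image; the pool name 'tier1' comes from our logs, while the zvol path and image name are hypothetical for VM 314:

```shell
#!/bin/bash
# Sketch: manual offline copy of a stopped VM's ZFS zvol into Ceph RBD.
# 'tier1' is from our logs; zvol path and image name are assumptions.
copy_zvol_to_rbd() {
  local zvol="/dev/zvol/rpool/data/vm-314-disk-0"
  # Stream the raw zvol contents into the pool; -p shows progress.
  # The image name mirrors Proxmox's own naming convention.
  qemu-img convert -p -f raw -O raw "$zvol" "rbd:tier1/vm-314-disk-0"
}
```

Afterwards the VM config would still need its scsi0 entry pointed at the new storage, which is why we would prefer a built-in migration path if one exists.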