remote-migration fails between different Storage Backends

mpit · Tuesday at 14:27

After being advised to use the remote-migration feature to keep QEMU and LXC templates in sync between our three clusters, I have implemented additional remote-migration tasks in our Ansible playbook, which is already capable of creating and deploying the OS templates within our primary cluster.

Our two main clusters use identical Ceph storage, and remote-migration works well between them.

Mild issue: after the remote-migration the source VM stays locked with state "migration", because I am supplying the "--delete 0" parameter: the template should be copied to the other clusters without being removed.

Big issue: When targeting our third cluster, which does not have Ceph Storage but local ZFS storage in each node, I encounter the following error.

ERROR: error - tunnel command '{"export_formats":"raw+size","migration_snapshot":0,"cmd":"disk-import","allow_rename":"1","format":"raw","storage":"zfs-hdd","volname":"base-200-disk-0","with_snapshots":0}' failed - failed to handle 'disk-import' command - no matching import/export format found for storage 'zfs-hdd'

After some research I found this post from 2023:

[SOLVED] Post in thread 'Migration between two clusters - no matching format found'

Jan 9, 2023

Hi,
is the source VM's storage also ZFS? Unfortunately, offline disk migration to/from ZFS is only implemented from/to ZFS at the moment. As a workaround, you can try to migrate the VM while it is running (assuming the disk is attached to the VM).

> Unfortunately, offline disk migration to/from ZFS is only implemented from/to ZFS at the moment.

Is this still the case? Am I basically stranded here with a half-working concept?

GorgonzolaPrimavera · Tuesday at 17:32

One of the things I hit upon today was using `qm disk move` (or `pct move-volume` for containers) to get around the zfs<-->ceph hatred. Moving the disk first and then migrating the CT/VM worked fine for me from zfs to ceph just now (Jun 16 2026), for both VMs and CTs, once I had done the disk move and deleted the leftover unusedX entry from the CT/VM before migration.
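For anyone scripting this, here is a minimal Python sketch of that workaround. It only builds the `qm disk move` / `pct move-volume` command lines, adding `--delete 1` so that no unusedX entry is left behind in the first place. The VMIDs, the disk keys and the storage name `local-zfs` are placeholders, and whether this is permitted on template (base) volumes is untested here.

```python
import shlex


def disk_move_cmd(vmid, disk, storage, container=False, delete_source=True):
    """Return the argv that moves one guest volume to another storage."""
    if container:
        argv = ["pct", "move-volume", str(vmid), disk, storage]
    else:
        argv = ["qm", "disk", "move", str(vmid), disk, storage]
    if delete_source:
        # Remove the source volume after a successful copy instead of
        # keeping it as an unusedX entry in the guest config.
        argv += ["--delete", "1"]
    return argv


if __name__ == "__main__":
    # Placeholder IDs and storage name; adjust before running on a node.
    print(shlex.join(disk_move_cmd(200, "scsi0", "local-zfs")))
    print(shlex.join(disk_move_cmd(201, "rootfs", "local-zfs", container=True)))
```

The argv lists can be passed to `subprocess.run` on a node, or copied into an Ansible `command` task.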

bbgeek17 · Tuesday at 19:32

mpit said:
Mild issue: after the remote-migration the source VM stays locked with state "migration", because I am supplying the "--delete 0" parameter: the template should be copied to the other clusters without being removed.

I believe this is expected to avoid duplicate VMs on the network. You can probably add a step to your Ansible to unlock the VM.
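A rough sketch of such an unlock step, in Python for illustration. It assumes the lock appears as a `lock:` line in the output of `qm config <vmid>`, and it only builds the `qm unlock` command when a lock is actually present, so the step stays idempotent. The sample config text is made up.

```python
def current_lock(config_text):
    """Return the value of the 'lock:' line in `qm config` output, or None."""
    for line in config_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "lock":
            return value.strip()
    return None


def unlock_cmd(vmid):
    """Return the argv that clears the lock on a VM."""
    return ["qm", "unlock", str(vmid)]


def unlock_if_locked(vmid, config_text):
    """Return the unlock argv if the config carries a lock, else None."""
    return unlock_cmd(vmid) if current_lock(config_text) else None


if __name__ == "__main__":
    # Hypothetical config excerpt, not taken from a real system.
    sample = "name: debian-template\nlock: migrate\ntemplate: 1\n"
    print(unlock_if_locked(200, sample))
```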

mpit said:
Big issue: When targeting our third cluster, which does not have Ceph Storage but local ZFS storage in each node, I encounter the following error.
ERROR: error - tunnel command '{"export_formats":"raw+size","migration_snapshot":0,"cmd":"disk-import","allow_rename":"1","format":"raw","storage":"zfs-hdd","volname":"base-200-disk-0","with_snapshots":0}' failed - failed to handle 'disk-import' command - no matching import/export format found for storage 'zfs-hdd'

The "issue" is that Ceph uses the "raw+size" format for migration of offline VMs (which your template is), whereas ZFS only supports the "zfs" format today.

Things are different for Live migration as the QEMU shim is used instead.
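To make the mismatch concrete, here is a simplified Python model of the negotiation. This is not the actual PVE code. It assumes `export_formats` in the tunnel command is a comma-separated list and that the target picks from the formats its storage plugin can import. With the payload from the error above and a target that only accepts "zfs", nothing matches.

```python
import json

# Tunnel command copied from the error message in this thread.
PAYLOAD = (
    '{"export_formats":"raw+size","migration_snapshot":0,'
    '"cmd":"disk-import","allow_rename":"1","format":"raw",'
    '"storage":"zfs-hdd","volname":"base-200-disk-0","with_snapshots":0}'
)


def negotiate(tunnel_payload, import_formats):
    """Return the offered export formats that the target can import."""
    request = json.loads(tunnel_payload)
    # Assumption: multiple formats would be separated by commas.
    offered = request["export_formats"].split(",")
    return [fmt for fmt in offered if fmt in import_formats]


if __name__ == "__main__":
    matches = negotiate(PAYLOAD, ["zfs"])
    if not matches:
        print("no matching import/export format found")
```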

Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox

alexskysilk · Tuesday at 21:07

bbgeek17 said:
The "issue" is that Ceph uses the "raw+size" format for migration of offline VMs (which your template is), whereas ZFS only supports the "zfs" format today.

Is it a matter of validation alone? It seems to me that the allocated size is available from vmid.conf, and the on-disk size is available using zfs get or blockdev... It's trivial enough to pipe it manually using dd | ssh | rbd import, so I wonder why this would be a showstopper.

Edit: the reverse is just as simple...
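A hedged sketch of what such a manual pipe could look like, written as Python helpers that only assemble the shell pipeline. The pool, image and host names are hypothetical. For the Ceph to ZFS direction the target zvol has to exist already with at least the source size (for example created with `zfs create -V`), and neither pipeline touches the guest configuration.

```python
import shlex


def rbd_to_zvol_pipeline(rbd_image, ssh_target, zvol, block_size="4M"):
    """Ceph -> ZFS: stream an RBD image into an existing zvol on a remote node."""
    remote = f"dd of={shlex.quote('/dev/zvol/' + zvol)} bs={block_size}"
    return (
        f"rbd export {shlex.quote(rbd_image)} - "
        f"| ssh {shlex.quote(ssh_target)} {shlex.quote(remote)}"
    )


def zvol_to_rbd_pipeline(zvol, ssh_target, rbd_image, block_size="4M"):
    """ZFS -> Ceph: stream a local zvol into a new RBD image on a remote node."""
    remote = f"rbd import - {shlex.quote(rbd_image)}"
    return (
        f"dd if={shlex.quote('/dev/zvol/' + zvol)} bs={block_size} "
        f"| ssh {shlex.quote(ssh_target)} {shlex.quote(remote)}"
    )


if __name__ == "__main__":
    # Hypothetical names: RBD pool "rbd", ZFS pool "tank", host "root@node3".
    print(rbd_to_zvol_pipeline("rbd/base-200-disk-0", "root@node3", "tank/base-200-disk-0"))
    print(zvol_to_rbd_pipeline("tank/base-200-disk-0", "root@node1", "rbd/base-200-disk-0"))
```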

bbgeek17 · Tuesday at 21:12

alexskysilk said:
Is it a matter of validation alone? It seems to me that the allocated size is available from vmid.conf, and the on-disk size is available using zfs get or blockdev... It's trivial enough to pipe it manually using dd | ssh | rbd import, so I wonder why this would be a showstopper.

I am guessing the developers would prefer to use efficient methods (i.e. rbd<>rbd and zfs send<>zfs receive) over generic, less efficient ones. But I am not a PVE decision maker!

https://github.com/proxmox/pve-storage/blob/master/src/PVE/Storage/ZFSPoolPlugin.pm#L884

Code:

sub volume_export_formats {
    my ($class, $scfg, $storeid, $volname, $snapshot, $base_snapshot, $with_snapshots) = @_;

    my @formats = ('zfs');
    # TODOs:
    # push @formats, 'fies' if $volname !~ /^(?:basevol|subvol)-/;
    # push @formats, 'raw' if !$base_snapshot && !$with_snapshots;
    return @formats;
}

PS https://github.com/proxmox/pve-storage/blob/master/src/PVE/Storage/RBDPlugin.pm#L1001

alexskysilk · Tuesday at 21:52

I mean, sure, I like efficiency too, but multi-backend support is part of the PVE feature set. Technically, anything that can be rendered out as a bitmap (raw) can be compressed and sent anywhere. It seems like an arbitrary limitation. Anyone from the devs care to comment?

bbgeek17 · Tuesday at 21:59

¯\_(ツ)_/¯
As I said, the live migration of VMs does not suffer from this. OP has a unique requirement of duplicating templates that by their nature are "off".
Remote-migration is one way to do it, but it was certainly not designed with this workflow in mind. Based on the comments in the code, other formats are on the to-do list. It's probably just a matter of limited developer time.

alexskysilk · Tuesday at 22:03

yeah I hear that.

In truth, you already have PBS, which could effectively serve the purpose anyway, and scripting it is trivial if the use case mandates it. It's not really a problem from my POV, more a curiosity.

mpit · Wednesday at 08:22

Thank you for the many replies.

As I explicitly want to keep templates in sync, booting and live migrating is not an option. We have two PBS in place and I think I will resort to solving it by means of backing up & restoring the templates.
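For reference, a minimal sketch of how the backup and restore steps could be assembled, again in Python and only building the command lines. The storage names are placeholders, the archive volume ID is left to the caller, and it assumes the target cluster can reach a PBS datastore that holds the backup.

```python
def backup_cmd(vmid, pbs_storage):
    """Return the argv that backs up one guest to the given backup storage."""
    return ["vzdump", str(vmid), "--storage", pbs_storage]


def restore_cmd(archive, vmid, target_storage, container=False):
    """Return the argv that restores a backup onto the target storage.

    Note the argument order: qmrestore takes the archive first,
    pct restore takes the VMID first.
    """
    if container:
        return ["pct", "restore", str(vmid), archive, "--storage", target_storage]
    return ["qmrestore", archive, str(vmid), "--storage", target_storage]


if __name__ == "__main__":
    # "pbs1" is a placeholder storage name; ARCHIVE_VOLID stands for the
    # backup volume ID as listed on the target cluster.
    print(backup_cmd(200, "pbs1"))
    print(restore_cmd("ARCHIVE_VOLID", 200, "zfs-hdd"))
```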

As mentioned, the locked source template is easily dealt with by issuing an unlock command.
