Moving VMs with snapshots

andreisrr

Member
Feb 2, 2024
I have a small PVE cluster with 2 nodes. Both have NFS storage, one has a fairly small LVM storage, and the other also has additional local directory storage.

I have a VM on node1, on LVM storage; it also has a manual snapshot.

If I want to move the disk to NFS storage (because the other node doesn't have enough LVM storage to accommodate this VM, and I don't get other storage-related options in "migrate") while deleting the source, I get the error "you can't move a disk with snapshots and delete the source (500)".
Since this tells me nothing about the snapshot, what will actually happen if I don't delete the source?
- move the disk to the new storage and forget the snapshot?
- move the disk to the new storage, consolidating the snapshot first?
- move the disk to the new storage, but keep a partial link to the old storage for the snapshot?
- other?

Since that VM is production and the admin requires the existing snapshot, what will actually happen if I move it?

Thank you.
 
Hi,

moving the disk without deleting the source will result in two disks: the copy on the NFS storage and the original, which remains on local-lvm. The new disk will be attached to the VM and the old one marked as unused (the qm CLI documentation mentions this behavior [0]). Rolling back the snapshot will attach the snapshotted disk on LVM to the VM and mark the copy on NFS as unused.
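As a rough CLI sketch of that behavior (VM ID 100 and disk slot scsi0 are placeholders, and the storage name will differ on your cluster):

```shell
# Move the disk without --delete: the copy on NFS becomes the active disk,
# the original LVM volume stays behind as an unused disk in the VM config.
qm disk move 100 scsi0 nfs-storage

# The old volume now shows up as unused0 (or similar) in the config:
qm config 100 | grep unused
```

Rolling back the snapshot afterwards would flip this around, reattaching the LVM disk and marking the NFS copy unused, as described above.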

This will still prevent you from migrating the VM to another node if that is what you wish to do, your best bet in this case is to clone the VM to the NFS storage and migrate it to the other node.

[0]: https://pve.proxmox.com/pve-docs/pve-admin-guide.html#cli_qm_disk_move
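The clone-then-migrate route could look like this (VM IDs, storage name, and node name are placeholders):

```shell
# Full clone of VM 100 onto the shared NFS storage as new VM 200:
qm clone 100 200 --full --storage nfs-storage --name myvm-clone

# The clone has no snapshots, so it can be migrated to the other node:
qm migrate 200 node2
```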
 
No, the snapshot(s) will stay on the original VM. After switching, you can keep the original around (and create snapshots of the new VM) and remove the old one when your data retention policy allows it.
 
So in conclusion, there is no way to move a VM's disk to a different storage together with the snapshots, right?

- copy disk to new storage: old disk & snapshot remain on old storage; if the snapshot is rolled back rather than discarded, the VM reverts to the old disk on the old storage and the new disk remains unused
- clone VM, old keeps snapshot, new doesn't have it

Hopefully this thread will be read by Proxmox developers. I suggest implementing, in the future, a way to take the snapshot with you when you move a disk to a different storage, provided that storage supports snapshots... Thank you.
 
There is the issue of how the VM is stored. If it is on, say, lvm-thin, then the storage itself handles the snapshot and the VM disk is in raw format. If it is on a file storage like NFS, then the image itself must handle it, for example by being in qcow2 format rather than raw, because NFS itself doesn't do snapshots.
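To illustrate the file-storage case: qcow2 images carry their own internal snapshot machinery, which raw images lack. A minimal sketch with qemu-img (file names are placeholders):

```shell
# Create a qcow2 image; internal snapshots are a feature of the format itself:
qemu-img create -f qcow2 disk.qcow2 1G

# Create and list an internal snapshot inside the image file:
qemu-img snapshot -c before-upgrade disk.qcow2
qemu-img snapshot -l disk.qcow2

# The same commands fail on a raw image, because raw has nowhere
# to store snapshot metadata.
```

This is why the snapshot can't simply "come along" when a disk moves between an LVM volume and a qcow2 file: the two snapshot mechanisms live in entirely different places.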
 
> Since that VM is production and the admin requires the existing snapshot, what will actually happen if I move it?

Proxmox is a little weird about snapshots and migrating storage. The best advice is to not rely on snapshots for more than about a week after updating, and to make a backup of the VM in the reverted-to-snapshot state in case it's needed later. Reverting an actively-updated VM after a week can be problematic anyhow; you stand a good chance of losing updated data.

I would recommend taking a Sunday (or other convenient downtime window) and do the following:

o Backup the vm in its current state at the host level to tar, using proxmox builtin backup
o If Windows guest, also backup in-vm to Samba share with e.g. Veeam free agent

o Clone the vm from the old snapshot state (this will create a new copy of the "reverted" vm state without the snapshot); it can stay offline or on a host-only network in case it's needed for something. 99% this version of the vm will not be needed, but you'll have it for the admin.

o Consolidate / delete snapshots on the prod vm and move it to wherever needed.

Alternatively, you could clone the vm from the Current state and create a new copy of the VM w/o snapshot for prod / moving, this would basically leave the original VM as-is with the just-in-case snapshot - but it would need to be powered off to avoid conflicts.
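The downtime-window procedure above could be sketched roughly like this (VM IDs, the snapshot name "pre-change", and storage names are all placeholders for your environment):

```shell
# 1) Host-level backup of the current state with PVE's built-in backup:
vzdump 100 --mode snapshot --compress zstd --storage backup-storage

# 2) Full clone taken from the existing snapshot, preserving that state
#    as a separate (powered-off) VM for the admin:
qm clone 100 201 --full --snapname pre-change --storage nfs-storage

# 3) Once verified, delete the snapshot on the prod VM so its disk
#    can be moved (with source deletion) to the target storage:
qm delsnapshot 100 pre-change
qm disk move 100 scsi0 nfs-storage --delete
```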
 
For some testing I use nested snapshots (testing different versions of an app), and I haven't found a reliable way to migrate or back up those snapshots.
 
It's gonna take more disk space / resources, but long-term you're better off cloning the different snapshot levels to separate VMs for the different app versions - rather than relying on a somewhat-fragile** snapshot stack (which also slows down disk I/O)

**I have seen a corrupted snapshot stack on esxi cause the VM to have to be restored from backup (or worst-case, rebuilt) - without the snapshots.


Depending on how many server instances you have, you could spread the disk usage out a bit between several servers and/or take advantage of ZFS for compression and possibly fast-dedup.
 
Best advice is to not rely on snapshots for more than about a week after updating

The official VMware best practices are even more strict:

"Do not use a single snapshot for more than 72 hours.

The snapshot file continues to grow in size when it is retained for a longer period. This can cause the snapshot storage location to run out of space and impact the system performance."

And more...

Source: https://knowledge.broadcom.com/external/article/318825/best-practices-for-using-vmware-snapshot.html
 
Interesting. Besides the performance and storage impact, they don't give other reasons.
Disk resize operations are forbidden while snapshots exist, but that is not the case here.
 
Moving data (block or file) that has snapshots associated with it is always challenging, regardless of the OS, hypervisor, or application. Moving such data between different storage types is exponentially more complex, as snapshot formats are proprietary to each storage implementation. For example, Dell/EMC NAS snapshots are not compatible with NetApp snapshots. LVM snapshots are not compatible with QCOW snapshots, and so on.


Moving data across heterogeneous storage typically requires a full read/write cycle via an external or client-side application. Such applications rarely, if ever, have visibility into the internal snapshot structure. Even if an application understood the snapshot format on the source, it would not be able to recreate that snapshot structure on the target, because snapshots are storage-native constructs. They are read-only by definition and represent a point-in-time state of the data.

A highly theoretical workflow might look like this:
  1. Identify the oldest source snapshot.
  2. Perform a full data transfer to the target.
  3. Create a native snapshot on the target.
  4. Identify the next source snapshot and calculate block-level differences.
  5. Transfer only the changed blocks.
  6. Create another native snapshot on the target.
As you can imagine, implementing such cross-vendor snapshot translation would be extremely complex and expensive.
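Within a single snapshot-aware storage, this workflow actually exists natively. ZFS incremental replication is essentially steps 1-6 above for the homogeneous case (pool/dataset/host names are placeholders):

```shell
# Full transfer of the oldest snapshot to the target:
zfs send tank/vm-100-disk-0@snap1 | ssh node2 zfs recv backup/vm-100-disk-0

# Incremental: transfer only the blocks changed between snap1 and snap2;
# the target recreates snap2 natively on its side:
zfs send -i @snap1 tank/vm-100-disk-0@snap2 | ssh node2 zfs recv backup/vm-100-disk-0
```

The cross-vendor case is hard precisely because nothing like `zfs send -i` exists between heterogeneous snapshot formats, e.g. between LVM snapshots and qcow2 internal snapshots.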

In the enterprise world, when migrating data between vendors, the common approach is to migrate the active dataset while retaining the original storage (and its snapshots) until those snapshots expire naturally according to retention policies.


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
What exactly is the problem with PVE and snapshots after about a week?

As others explained, the problem isn't with PVE; it is a generic problem with storage-level snapshots. They always come with a storage penalty and (depending on the storage type) a performance penalty. Copy-on-write storages (like ZFS or btrfs) allow "cheap" snapshots, meaning they won't cause a performance impact, but they still cost storage space. LVM-thin and LVM-thick snapshots hurt both performance and storage usage. I don't know the specifics for VMware, but since it's considered best practice there to only keep snapshots for 72 hours (as explained by onslow), I would suspect issues similar to those with LVM.
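For the LVM-thick case specifically, the cost is easy to see on the command line. A classic LVM snapshot reserves copy-on-write space up front and adds a write penalty to the origin volume for as long as it exists (VG/LV names are placeholders):

```shell
# Thick LVM snapshot: every write to the origin first copies the old
# block into the snapshot's reserved CoW area (here, 5G):
lvcreate --snapshot --size 5G --name vm-100-snap /dev/pve/vm-100-disk-0

# The "Data%" column shows how full the snapshot's CoW area is;
# if it fills up, the snapshot becomes invalid:
lvs
```

This double-write behavior is the performance penalty mentioned above, and it is a plausible reason behind VMware-style "72 hours max" recommendations as well.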
 
Understood.
I'll keep in mind the 72h recommendation.
I'll talk to the other admins about the recommended procedures regarding this situation.

Thank you very much for this discussion and recommendations.