Migrate VM and Disks without Snapshot

May 23, 2022
I have a VM with one quite big data disk. I want to move that VM to another Proxmox host in my cluster. The target system has enough space. Unfortunately, the source system no longer has enough space to create a snapshot. I can power down the VM and perform any operation; there is no time restriction. How can I solve that problem?

Migrating the whole VM aborts after the OS disk has been moved, with these messages:

Code:
...
2022-05-23 09:56:29 volume 'zfs:vm-100-disk-0' is 'zfs:vm-100-disk-0' on the target
2022-05-23 09:56:31 ERROR: storage migration for 'zfs:vm-100-disk-1' to storage 'zfs' failed - zfs error: cannot create snapshot 'zfs/vm-100-disk-1@__migration__': out of space
2022-05-23 09:56:31 aborting phase 1 - cleanup resources
2022-05-23 09:56:34 ERROR: migration aborted (duration 00:05:49): storage migration for 'zfs:vm-100-disk-1' to storage 'zfs' failed - zfs error: cannot create snapshot 'zfs/vm-100-disk-1@__migration__': out of space
TASK ERROR: migration aborted


Thanks for your help, and please let me know if there is anything more I can add to improve my request.
 
you can make that volume thin-provisioned, but beware of the other risks associated with that (you can then write too much data to it and fill your pool up, with all sorts of undefined behaviour occurring).
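As a side note on that risk: a simple way to keep an eye on the pool fill level while a thin-provisioned volume is in use is to check the pool-wide usage regularly (the pool name below is a placeholder):

Bash:
# show overall pool usage so a thin zvol can't silently fill the pool unnoticed
zpool list -o name,size,allocated,free,capacity poolname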
 
Thanks for your help @fabian . To get that right: that would only help if my volume had empty space allocated, right? Unfortunately, in my case the disk is filled almost completely, so even a snapshot of a thin-provisioned volume would not have enough space.

In fact, I want to move the whole VM with its disks to a second server, because I need to put more drives into the first server and re-create the ZFS pool.
 
no, a snapshot of a thin-provisioned volume doesn't take any extra space (except a bit for metadata). the downside is that there is no guarantee that the free space can actually be used, or that the currently written data can be overwritten with new data, without running out of space. a snapshot of a regular volume requires extra space precisely because the full size of the volume still needs to be available when creating the snapshot.
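To make that space accounting concrete, one can compare the reservation against what the zvol actually consumes (pool and dataset names below are placeholders):

Bash:
# a thick zvol reserves its full volsize via refreservation;
# a thin one shows refreservation=none, and only written data counts as used
zfs get volsize,refreservation,used,usedbysnapshots POOL/vm-XXX-disk-YYY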
 
Thanks @fabian and also @_gabriel for your help.

If I get this right, I can temporarily switch thin provisioning on, do my migration, and remove the thin provisioning again afterwards, right? I only want to make sure that this is a reversible change.

Thanks all for your input and help.

(attached screenshot: proxmox-thin-provisionned.png — the storage's thin-provisioning setting)
 
no, that only applies to newly allocated volumes. you need to do the same change on your existing volume as well.
 
@fabian can you please help with these two questions?

1) do I need to change the settings of the whole storage before changing the actual disk? Or is it sufficient to only change the affected disk?

2) I guess that's something I need to accomplish via the CLI. As I didn't find any documentation (sorry): can you give me a hint how to do that?
 
you can zfs send on the source, then zfs receive on the destination, over an mbuffer pipe.
Am I right that the VM must be powered off for that? I would do something like this:

- power off VM
- disconnect the huge disk
- migrate the VM with the OS disk
- zfs send the huge disk*
- reconnect the huge disk
- power on VM

Anything I'm missing in that approach?

* I guess that's a valid example to send the disk:
Bash:
zfs send -Rv poolname/vm-100-disk-1 | ssh xxx.xxx.x.xxx zfs recv -Fd poolname
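One caveat, as far as I can tell: zfs send -R operates on a snapshot rather than on the volume itself, so a sketch of the full sequence could look like this (pool names and the masked IP are carried over from above; the snapshot name is made up, and with refreservation set to 0 the snapshot needs essentially no extra space):

Bash:
# take a snapshot of the zvol first (cheap once the reservation is dropped)
zfs snapshot poolname/vm-100-disk-1@migrate
# send the snapshot (with properties, due to -R) and receive it into the target pool
zfs send -Rv poolname/vm-100-disk-1@migrate | ssh xxx.xxx.x.xxx "zfs recv -Fd poolname"
# clean up the snapshot on both sides once the disk is attached again
zfs destroy poolname/vm-100-disk-1@migrate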
 
edited because I read too fast...
use mbuffer instead of ssh (ssh will be slow because of compression and encryption)
search the web for examples of zfs over mbuffer
 
edited because I read too fast...
use mbuffer instead of ssh (ssh will be slow because of compression and encryption)
search the web for examples of zfs over mbuffer
please don't post wrong advice like that - SSH over a local network link is rarely the bottleneck.
 
@fabian can you please help with these two questions?

1) do I need to change the settings of the whole storage before changing the actual disk? Or is it sufficient to only change the affected disk?
both. you need to (at least temporarily) change the target storage settings so that the newly allocated disk on the target node works (it has to receive the snapshot, so the same rules apply there as well). you need to update the disk (zvol), because otherwise creating the snapshot won't work - the storage.cfg change does not do anything for existing volumes.
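For reference, the storage-level switch corresponding to the screenshot above is the sparse flag of the zfspool storage; a sketch of what that might look like in /etc/pve/storage.cfg, using the storage and pool names from the log above (the content line is illustrative):

Code:
zfspool: zfs
        pool zfs
        content images,rootdir
        sparse 1

Toggling it via the GUI checkbox, or presumably with pvesm set zfs --sparse 1, should have the same effect.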
2) I guess that's something I need to accomplish via the CLI. As I didn't find any documentation (sorry): can you give me a hint how to do that?
you need to set the 'refreservation' to 0 (that tells ZFS that it shouldn't reserve the space needed to actually fill the zvol with data):
zfs set refreservation=0 POOL/DATASET/vm-XXX-disk-YYY

(the dataset path needs to be adjusted obviously)
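Since reversibility was asked about above: a sketch of how one could record the current value and restore it afterwards (the dataset path is a placeholder, as above); refreservation=auto re-reserves the full volsize of a zvol, i.e. makes it thick again:

Bash:
# note the current value before changing anything
zfs get -H -o value refreservation POOL/DATASET/vm-XXX-disk-YYY
# drop the reservation (thin) for the migration
zfs set refreservation=0 POOL/DATASET/vm-XXX-disk-YYY
# afterwards, restore the full (thick) reservation
zfs set refreservation=auto POOL/DATASET/vm-XXX-disk-YYY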
 
Okay, I did the step now and started with the approach described by @fabian .

It seems setting the reservation to 0 is not required. Before doing anything at all, I checked the current setting and it was set to "NONE":

Bash:
zfs get reservation zfs/vm-100-disk-1

NAME               PROPERTY     VALUE   SOURCE
zfs/vm-100-disk-1  reservation  none    default

So I changed the settings of the storage pool as shown in the picture a few posts above, and now it is copying. I will let you know the results, including the duration, when it is done.

Additional info: it is a local LAN and the speed is 0.1 to 0.2 GB per second. At the lower rate, the total duration should be around 44 hours for the 15'761 GB to transfer (15'761 GB / 0.1 GB/s ≈ 157,600 s)... that will be some waiting time...
 
reservation != refreservation ;) anyhow, hope you managed to get it working now!
 
reservation != refreservation ;) anyhow, hope you managed to get it working now!
Fuuuu... this is what happens when you are searching for something and don't realize that the search engine "corrected" your spelling...

Anyway: checking the refreservation gives the same result:

Bash:
zfs get refreservation zfs/vm-100-disk-1
NAME               PROPERTY        VALUE      SOURCE
zfs/vm-100-disk-1  refreservation  none       local

Once again, thanks for all your help. I'll post feedback and a clear summary in my initial post after this is done, so that people after me find a clear statement without reading through the whole conversation.
 
And finally all went well. It took more than 2 days but it worked like a charm.

After rebooting the VM I first had a total Proxmox outage on the backup node. It turned out that as soon as I passed through the GPU on that other node, the whole node went down. I still don't know why that happened; I'll look into it later. It was very helpful to read this advice: https://forum.proxmox.com/threads/proxmox-stuck-on-boot-after-passthrough.86261/post-473939. In short: it says to disable CPU virtualization when you realize that a VM goes mad and ruins your server, and you were so smart to turn on VM start at boot.

Once again, to everyone involved: thanks so much for your help, @_gabriel, @fabian and @leesteken!
 
I've just been looking at this because of the pesky snapshots as well, and could not help but notice...

edited because I read too fast...
use mbuffer instead of ssh (ssh will be slow because of compression and encryption)
search the web for examples of zfs over mbuffer

please don't post wrong advice like that - SSH over a local network link is rarely the bottleneck.

@fabian I do not think @_gabriel was entirely wrong, except for the reasoning. SSH encryption will not throttle it down, but the bursty streams are terrible, especially on a fast local connection; that's why mbuffer makes a huge difference. Of course one can still use SSH with it, like zfs send | mbuffer | ssh.
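A minimal sketch of that pipeline, keeping SSH in place (snapshot name and IP are placeholders carried over from the earlier example; the buffer sizes are purely illustrative):

Bash:
# buffer the bursty send stream on both ends of the ssh hop
zfs send -v poolname/vm-100-disk-1@migrate \
  | mbuffer -q -s 128k -m 1G \
  | ssh xxx.xxx.x.xxx "mbuffer -q -s 128k -m 1G | zfs recv -Fd poolname"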

Thanks for the other tips discussed here though!
 
yes, mbuffer can help, depending on what your stream and the network look like. removing ssh almost always only removes a layer of protection and does not help, unless your network and your source and target disks are all very fast.
 
SSH over a local network link is rarely the bottleneck
Really? I ran into it every time on a network with at least 10 GbE. SSH encryption is single-threaded, so it'll be bound by your CPU: the faster the single-core performance of your CPU, the higher the throughput. This limit is nowadays in the 300-500 MB/s range, but depending on the storage used, it'll be the bottleneck for most high-end systems. You can run multiple SSH streams, which will combine the throughput until you run into another bottleneck. On a private network, you can also just run socat or nc to get much higher transfer rates with send/receive. Maybe WireGuard is able to get better throughput if you run over a non-private network, but I haven't tried that yet.
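A hedged sketch of the unencrypted socat variant mentioned above (host, port and pool names are made up; only sensible on a trusted private network, since the stream goes over the wire in the clear):

Bash:
# on the receiving host: listen on a TCP port and pipe the stream into zfs receive
socat -u TCP-LISTEN:9000,reuseaddr STDOUT | zfs recv -Fd poolname

# on the sending host: pipe the snapshot stream to that port
zfs send -v poolname/vm-100-disk-1@migrate | socat -u STDIN TCP:<receiver-ip>:9000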
 
it's pretty much 500MB/s on a reasonably modern system, yes. like I said, unless your disks on both ends and the network in between are very fast, it won't be the bottleneck (500MB/s is already half of what you effectively get out of a 10Gbit line!). note that zfs send/recv itself is also single-threaded, so the stream it generates will not reach your theoretical maximum storage speed either. but sure, if you are on a private network, with all-flash or massively parallel storage, and with 10Gbit or higher, you can benefit from using a more transparent transport layer.
 
