zfs non-linear snapshots workaround ("can't roll back...is not most recent snapshot")

ronejamesdo

Member
Apr 22, 2024
zfs does not allow non-linear snapshot restoration, which is a bummer. qcow2 is slower than zfs and harder on SSDs, which is a bummer.

I am considering working around this by converting any VM that seems likely to need non-linear snapshots to qcow2, doing the work, then backing up the box, immediately stopping it, and restoring the backup with my zfs storage selected as the "Storage" option.

Is there a better idea? Is the performance gained by using zfs volumes for the disks really worth not just using qcow2 all the time?

Usually I would want this under fairly predictable conditions: if I were trying something totally new, if something seems hopelessly broken, or if I'd already used my one zfs snapshot and realized that's not going to be enough (and know what I've done since then, so I don't mind rewinding and starting over). In all cases, VM downtime is not much of an issue.

What I have now is a proxmox-multiuse directory on my zfs pool (called BigDrive here), which I made like this (thanks to some forum help: https://forum.proxmox.com/threads/add-backup-ability-to-main-storage-drive-newbie-question.145598/):

zp=BigDrive
myds=proxmox-multiuse

zfs create \
  -o atime=off -o compression=lz4 -o recordsize=1024k \
  "$zp/$myds" || exit 99

(Then: Datacenter -> Storage -> Add -> Directory)

This allows me to put .qcow2, .raw, or .vmdk files into BigDrive/proxmox-multiuse (BigDrive is otherwise just zfs to me).
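For reference, the CLI equivalent of that "Add -> Directory" step should be something along these lines (the content types are just what I'd pick; adjust as needed):

pvesm add dir proxmox-multiuse --path /BigDrive/proxmox-multiuse --content images,backup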

On determining that a box needed heavy enough lifting to require non-linear snapshot restoration, I would run:

mkdir /BigDrive/proxmox-multiuse/images/100
qemu-img convert -O qcow2 -f raw /dev/zvol/BigDrive/vm-100-disk-0 /BigDrive/proxmox-multiuse/images/100/vm-100-disk-0.qcow2

qm disk rescan --vmid 100

I would probably use the GUI to then stop the VM, detach the original disk, attach the new one, check the boot order, and start the VM again, but I guess this would do it too:

qm config 100 | grep size

(to determine the drive type and number)

qm stop 100
qm set 100 --scsi0 proxmox-multiuse:100/vm-100-disk-0.qcow2
qm start 100

Once I'm done doing my snapshots and rolling back indiscriminately, I would then back up my VM:

The VM -> Backup -> Backup Now (check "Protected") -> Backup

Then stop the VM and "Restore" the backup but with the original BigDrive (not proxmox-multiuse) selected for "Storage" and start the VM.
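The rough CLI equivalent of that round trip (assuming "BigDrive" is also the name of the zfs storage in Proxmox; the dump file name and extension will be whatever vzdump actually produced) would be:

vzdump 100 --storage proxmox-multiuse --mode stop
qm stop 100
qmrestore /BigDrive/proxmox-multiuse/dump/vzdump-qemu-100-<timestamp>.vma.zst 100 --storage BigDrive --force
qm start 100

(Add --protected 1 to the vzdump call, if your PVE version supports it, to match the "Protected" checkbox.)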

It's cumbersome, and it's not perfect since I'm now starting with a box that's been rebooted twice since I started dealing with whatever the problem is (which in certain cases might itself be an issue), but that's the best idea I have so far.

I just wonder if there's a better way?

Thank you.

RJD
 
Not sure what you mean by "non-linear" and what exactly you are trying to achieve, but to me it sounds like you are not familiar with zfs clone?
You can clone a zvol to wherever you like (same pool) and fire up a VM using that clone. Later you can zfs promote that clone and continue to work with that if you like.
 
So let's say I have a VM using zfs. I have taken three snapshots. I want to rewind to the first snapshot but (probably) only to check something before I proceed from the third snapshot. Using qcow2 I can just roll back and forth. With zfs I get this error:

(screenshot of the error: "can't roll back ... is not most recent snapshot")
Could I leverage "zfs clone" to help with this somehow? (Can you point me at specifics or provide an example?)

Thank you.
 
I have no full example, but if the id of that VM is 100, try running zfs list -t snapshot | grep 100. You will recognize your snapshots.

Each one of these can be used as the source to create a new zvol; man zfs-clone has the syntax. Just make sure to keep the naming scheme PVE relies on. The resulting volumes can then be used in a copy of your original "100.conf": run cp /etc/pve/local/qemu-server/100.conf /etc/pve/local/qemu-server/99100.conf to formally create a copy and edit it; I tend to prefix such instances with a leading "99" or similar. You do need to change the MAC address; for some other values I am unsure. At the end you have your unmodified "100" and a manually created "99100" with the state cloned from the snapshot.
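In rough outline (names are purely illustrative: assume the pool/dataset is rpool/data, the VM is 100, and the snapshot is called snap1), the key commands would look like:

zfs clone rpool/data/vm-100-disk-0@snap1 rpool/data/vm-99100-disk-0
cp /etc/pve/local/qemu-server/100.conf /etc/pve/local/qemu-server/99100.conf
# edit 99100.conf: point the disk lines at vm-99100-disk-0 and change the MAC address
zfs promote rpool/data/vm-99100-disk-0
# (only if you later decide the clone should become independent of the snapshot)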

This works, but it is not really elegant and may easily end in conflicts.

Now to my personal approach: snapshots are for short-term usage, and I have only a very, very few of them lying around. (Usually I have one from yesterday, one from today's early morning, and a few hourly ones.) The problem you describe is really unpleasant. But creating a new backup is also really fast nowadays. So I run "Backup now" manually as often as I run "Take Snapshot". It feels wrong, but backups are more flexible than snapshots when it comes to a large number of saved states.
 
Nothing beats cloning a snapshot when it comes to "probably only to check something before I proceed".
Cloning snapshots costs virtually no time, no space and no resources.
You could even attach a clone as another disk to an existing (running) machine and copy over single files from there, no need for backup/restore.
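A minimal sketch of that (purely illustrative names: a snapshot "snap1" of VM 100's disk on rpool/data, storage "local-zfs", and a running VM 200 that gets the extra disk):

zfs clone rpool/data/vm-100-disk-0@snap1 rpool/data/vm-200-disk-5
qm set 200 --scsi5 local-zfs:vm-200-disk-5
# mount the new disk inside the guest, copy the files you need,
# then detach/remove it from VM 200 and: zfs destroy rpool/data/vm-200-disk-5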
 
Interesting point, can you make an example with the key steps? Thank you.

I tested this and I think it's working:
  1. I made a VM snapshot A (earlier) and then another one B (later) on the id702 VM (older, which I need to restore to a state earlier than the latest snapshot)
  2. I cloned the VM in its current state (child of B), creating an id703 VM (newer)
  3. I renamed the id703 VM's volumes via zfs:
    Bash:
    root@epve:~# zfs rename rpool/data/vm-703-disk-0 rpool/data/vm-703-disk-0_newer
    root@epve:~# zfs rename rpool/data/vm-703-disk-1 rpool/data/vm-703-disk-1_newer
    root@epve:~# zfs rename rpool/data/vm-703-disk-2 rpool/data/vm-703-disk-2_newer
  4. I cloned the id702 VM's volumes via zfs to the names the id703 VM expects:
    Bash:
    root@epve:~# zfs clone rpool/data/vm-702-disk-0@apisnap rpool/data/vm-703-disk-0
    root@epve:~# zfs clone rpool/data/vm-702-disk-1@apisnap rpool/data/vm-703-disk-1
    root@epve:~# zfs clone rpool/data/vm-702-disk-2@apisnap rpool/data/vm-703-disk-2
  5. I destroyed the unnecessary vols:
    Code:
    root@epve:~# zfs destroy rpool/data/vm-703-disk-0_newer
    root@epve:~# zfs destroy rpool/data/vm-703-disk-1_newer
    root@epve:~# zfs destroy rpool/data/vm-703-disk-2_newer
Now you can decide whether the cloned volumes are going to be permanent or not; I needed them to be permanent, so I promoted them with:
Bash:
root@epve:~# zfs promote rpool/data/vm-703-disk-0
root@epve:~# zfs promote rpool/data/vm-703-disk-1
root@epve:~# zfs promote rpool/data/vm-703-disk-2
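If you want to double-check the dependencies, listing the origin property should show them (before the promote the vm-703 volumes report the vm-702 snapshot as their origin; afterwards the dependency flips around to the vm-702 side):
Bash:
root@epve:~# zfs list -t volume -o name,origin | grep -E 'vm-70[23]'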
 
