Deleting snapshots extremely slow and VMs freeze

netbela

Member
Jan 18, 2023
Hi there,

I've been using Proxmox for some time now and am quite happy with it; however, I do have a single issue with my VMs and snapshots.
If I delete a snapshot that is even just a day old, it takes quite a long time (up to 2 minutes). In the meantime the VM is 'frozen' and completely unresponsive.
The VMs are all stored on ZFS storage served from TrueNAS over NFS.

Now, I've read somewhere that this can be the result of using slow storage, among other causes, but I don't think that my storage solution is that slow. Storage specs:
Data VDEVs
2 x RAIDZ1 | 4 wide | 3.49 TiB --- (All: Samsung PM1633a MZILS3T8HMLH0D4 3.84TB SAS 12Gb/s or equivalent)
Log VDEVs
1 x DISK | 1 wide | 260.83 GiB --- (Intel 900P Optane (SSDPED1D280GA))


The NAS has a 40Gbit uplink, the hypervisors each have a 25Gbit link, and MTU is set to 9000. The storage speed is quite OK; here is a `yabs` result:
Code:
fio Disk Speed Tests (Mixed R/W 50/50) (Partition 10.0.75.10:/mnt/zfs-01/px-nfs):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 122.97 MB/s  (30.7k) | 441.55 MB/s   (6.8k)
Write      | 123.29 MB/s  (30.8k) | 443.87 MB/s   (6.9k)
Total      | 246.27 MB/s  (61.5k) | 885.43 MB/s  (13.8k)
           |                      |
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 493.56 MB/s    (964) | 523.89 MB/s    (511)
Write      | 519.79 MB/s   (1.0k) | 558.78 MB/s    (545)
Total      | 1.01 GB/s     (1.9k) | 1.08 GB/s     (1.0k)


So, the storage is quite performant, I would say.
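
For anyone wanting to reproduce this, a direct fio run against the mount should give comparable numbers. The parameters below are my approximation of yabs' mixed 50/50 test, not its exact invocation:
Code:
fio --name=nfs-test --directory=/mnt/pve/px-nfs --rw=randrw --rwmixread=50 \
    --bs=4k --size=2G --iodepth=64 --numjobs=2 --runtime=30 --time_based \
    --ioengine=libaio --direct=1 --group_reporting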

Here is the config of an example VM that is having issues:
Code:
root@am-prm-01:/mnt/pve/px-nfs# qm config 138
agent: 1
balloon: 4096
boot: order=scsi0;ide2;net0
cores: 24
cpu: EPYC
hotplug: disk,network,usb
ide2: none,media=cdrom
memory: 49152
meta: creation-qemu=9.0.2,ctime=1751440419
name: xxx
net0: virtio=BC:24:11:33:5A:8A,bridge=vmbr50,firewall=1
numa: 0
ostype: l26
parent: NTB-pre-update-20260504T061508685003
scsi0: px-nfs:138/vm-138-disk-0.qcow2,discard=on,iothread=1,size=200G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=d474f662-80ec-48a1-8898-ec8aa0933921
sockets: 1
tags: ntb-mgmt
vmgenid: 7e9b01ee-5ccb-4045-ae19-3c5af511b596

Is the combination of qcow2 + NFS my issue here?

I have some automated jobs in place that (on a weekly basis) snapshot all VMs, update them, reboot them, and two days later delete the snapshots. However, the downtime I now have because of the snapshot removal is killing me.
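
For context, the jobs boil down to something like this (VMID and snapshot name are illustrative, not my actual scripts):
Code:
# Weekly: snapshot, then update and reboot the guest via the QEMU agent
qm snapshot 138 pre-update-$(date +%Y%m%d)
qm guest exec 138 -- apt-get update
qm guest exec 138 -- apt-get -y dist-upgrade
qm reboot 138
# Two days later: delete the snapshot -- this is the step that freezes the VM
qm delsnapshot 138 pre-update-20260504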
 

Attachments

  • Screenshot 2026-05-09 at 18.56.28.png (122.3 KB)
Is the combination of qcow2 + NFS my issue here?
Probably: yes.

If I remember correctly, removing a snapshot means integrating the blocks modified since the point in time when the snapshot was taken back into one single file. That is plainly a slow operation. Problems like this are behind the recommendation to use block devices, not files, for VM images.
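
With the VM shut down, you can inspect those internal snapshots directly in the image file (path assumed from your config above):
Code:
# List the internal snapshots embedded in the qcow2 file
qemu-img snapshot -l /mnt/pve/px-nfs/images/138/vm-138-disk-0.qcow2
# Deleting one rewrites data inside that same file -- the slow part
qemu-img snapshot -d <snapname> /mnt/pve/px-nfs/images/138/vm-138-disk-0.qcow2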

Disclaimer: I do not use qcow2 and my memory may be unreliable...
 
Do the snapshots in your TrueNAS ZFS, and use raw files instead of qcow2 for your images over NFS.
 
Do the snapshots in your TrueNAS ZFS, and use raw files instead of qcow2 for your images over NFS.
I already create snapshots from the TrueNAS side, but those are not (easily) restorable from the Proxmox side.

Using raw is not a viable solution, since I would lose thin provisioning and VM snapshots.
 
Thin provisioning is a form of resource oversubscription: specifically, simulating storage that one does not actually possess. Running out of space will give you far more problems than the pros are worth, and it even costs performance. Restoring a VM snapshot is simply a copy operation from the dataset's `.zfs` directory.
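
For example (dataset, snapshot name, and paths are illustrative; stop the VM before copying):
Code:
# On TrueNAS: make the hidden snapshot directory visible, then copy the image back
zfs set snapdir=visible zfs-01/px-nfs
cp /mnt/zfs-01/px-nfs/.zfs/snapshot/auto-2026-05-04/images/138/vm-138-disk-0.qcow2 \
   /mnt/zfs-01/px-nfs/images/138/vm-138-disk-0.qcow2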
 
but those are not (easily) restorable from the Proxmox side
By using ZFS over iSCSI, you can create and restore snapshots via the API without directly accessing TrueNAS, and thin provisioning is also available.

I recommend looking into it and testing to see if it behaves as you expect.
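
A ZFS over iSCSI entry in /etc/pve/storage.cfg looks roughly like this (all values are placeholders; the iscsiprovider depends on the iSCSI target software on the NAS, and sparse enables the thin provisioning):
Code:
zfs: truenas-iscsi
        portal 10.0.75.10
        target iqn.2005-10.org.freenas.ctl:proxmox
        pool zfs-01/px-iscsi
        iscsiprovider LIO
        sparse 1
        content images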
 
If you are using PVE >= 9.0 you can try the new "snapshot-as-volume-chain" feature ..... it should perform better on snapshot creation/deletion than internal qcow2 snapshots .... caveat: still technology preview .... qcow2 on ZFS (CoW on CoW) is also not an ideal setup
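
It's a per-storage flag in storage.cfg if I remember the option name right, something like this (server/export values taken from the benchmark output above):
Code:
nfs: px-nfs
        server 10.0.75.10
        export /mnt/zfs-01/px-nfs
        path /mnt/pve/px-nfs
        content images
        snapshot-as-volume-chain 1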
 
I've tried installing the TrueNAS Proxmox plugin (https://github.com/truenas/truenas-proxmox-plugin) and was able to configure it with a different TrueNAS system for now (I still need to update my primary one). It seems to work quite well, and fast too. I'll update my primary node later this week and see if switching from NFS to ZFS over iSCSI solves my issues.
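
A quick way to verify the new storage (storage ID and VMID are placeholders):
Code:
# Confirm the new storage is online, then move a test disk onto it
pvesm status
qm move-disk <vmid> scsi0 <storage-id>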
 
Alright, so I've updated my TrueNAS system to 25.10 and tested ZFS-over-iSCSI using the plugin described above. I assumed I would get more performance, but the results do not lie.

ZFS-over-ISCSI results:
Code:
fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/sda1):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 75.75 MB/s   (18.9k) | 417.57 MB/s   (6.5k)
Write      | 75.95 MB/s   (18.9k) | 419.77 MB/s   (6.5k)
Total      | 151.71 MB/s  (37.9k) | 837.35 MB/s  (13.0k)
           |                      |
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 537.45 MB/s   (1.0k) | 57.99 MB/s      (56)
Write      | 566.01 MB/s   (1.1k) | 62.62 MB/s      (61)
Total      | 1.10 GB/s     (2.1k) | 120.62 MB/s    (117)

NFS Results (after updating TrueNAS):
Code:
fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/sda1):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 159.02 MB/s  (39.7k) | 1.22 GB/s    (19.1k)
Write      | 159.44 MB/s  (39.8k) | 1.23 GB/s    (19.2k)
Total      | 318.47 MB/s  (79.6k) | 2.45 GB/s    (38.3k)
           |                      |
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 1.80 GB/s     (3.5k) | 1.74 GB/s     (1.7k)
Write      | 1.90 GB/s     (3.7k) | 1.86 GB/s     (1.8k)
Total      | 3.70 GB/s     (7.2k) | 3.60 GB/s     (3.5k)

I do not see any reason to use ZFS-over-iSCSI in this case. I'll see if the issues I had with snapshot deletion are solved, since the NFS-mounted storage seems to be MUCH faster than it was before.
 
Hi,
Is the combination of qcow2 + NFS my issue here?
as others already pointed out, yes (from the docs):
2: On file based storages, snapshots are possible with the qcow2 format, either using the internal snapshot function, or snapshots as volume chains. Creating and deleting internal qcow2 snapshots will block a running VM and is not an efficient operation. The performance is particularly bad with network storages like NFS. On some setups and for large disks (multiple hundred GiB or TiB sized), these operations may take several minutes, or in extreme cases, even hours. If your setup is affected, create and remove snapshots while the VM is shut down, expecting a long task duration.

ZFS over iSCSI is recommended in such a case, rather than adding two more layers with NFS and qcow2.
 