Space reclamation on thin provisioning after removing a file from Debian 13 with a qcow2 disk

piotrpierzchala

New Member
Jun 12, 2025
Hello!

I'm trying to figure out exactly what triggers space reclamation on thin-provisioned storage, and when, after removing a file from a Debian 13 guest with a qcow2 disk.

Setup:
- test system: Debian 13 (system installed on ext4)
- additional disk inside Debian: 100 GB (.qcow2, partition inside the guest formatted with ext4)
- VM's SCSI Controller: VirtIO SCSI single
- datastore: NFS 4.1 and NFS 4.2 (created from TrueNAS)
- QEMU Guest Agent: Disabled
- qcow2 option: discard=on

Test 1:

1. Create a file with dd:
Code:
dd if=/dev/urandom of=thick_test_random.img bs=1M count=10000 status=progress

2. Check the size of the .qcow2 file from the Proxmox node side:
Code:
qemu-img info vm-102-disk-1.qcow2

3. Check space on the datastores from the Proxmox node side:
Code:
pvesm status

4. Delete the test file:
Code:
rm thick_test_random.img

5. Perform a filesystem trim (triggered multiple times, just to be sure):
Code:
fstrim -av

(at that point the system reports it managed to trim ~10 GiB)

6. Check the .qcow2 file again from the Proxmox node side:
Code:
qemu-img info vm-102-disk-1.qcow2

7. Check space on the datastores again from the Proxmox node side:
Code:
pvesm status

Result:
No change in the size of the .qcow2 file or from the datastore's point of view. The allocated size of that virtual disk should go down to just a few MiBs, which is NOT happening.
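For context, a sketch of what step 5 is supposed to do on the host side: when a guest discard actually reaches the image file, QEMU punches holes in it, which frees allocated blocks but never changes the file length. A local demo with a throwaway file (assumes a filesystem that supports hole punching; this is not the qcow2 from the test):

```shell
# A hole punch frees blocks but leaves the file length untouched,
# so "size" alone cannot tell you whether a trim reached the image.
dd if=/dev/urandom of=punch-demo.bin bs=1M count=8 status=none
echo "length: $(stat -c %s punch-demo.bin)  blocks: $(stat -c %b punch-demo.bin)"
fallocate --punch-hole --keep-size --offset 0 --length 4MiB punch-demo.bin
echo "length: $(stat -c %s punch-demo.bin)  blocks: $(stat -c %b punch-demo.bin)"
rm -f punch-demo.bin
```

After the punch, the length is identical but the block count drops by roughly half, which is the change you would want to see on the .qcow2 file after fstrim.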


... as additional:

8. Migrate the test virtual disk to a different datastore:
Code:
qm move-disk 102 scsi1 <TARGET_DATASTORE> --format qcow2 --delete

Result:
Again nothing...


The above test has been performed on NFS versions 3, 4.1 and 4.2, and the virtual disks have been moved around several times.

... as additional:

9. Enable QEMU Guest Agent in VM's "Options"

10. Perform fstrim -av
11. Migrate the test virtual disk to a different datastore

Result:
Finally, the size of the virtual disk went down to a few MiBs.


The best part starts now: after recreating the file with dd inside Debian, removing it again, performing fstrim and migrating the VM's disk to a different datastore, this time space reclamation did NOT kick in, and I've repeated that test multiple times.

I was also NOT able to reclaim that space after deleting the test file WITHOUT moving the virtual disk to a different datastore, which from an operational point of view is NOT the best scenario: we cannot ask users to inform us every time they delete something from inside the system so that we can move the VM around to reclaim space on the datastore.

Sometimes space reclamation works, but sometimes it doesn't.

... as additional:

12. After enabling the QEMU Guest Agent and deleting the test file from inside the Debian guest, I've tried to run "qm guest cmd 102 fstrim", but that doesn't help either.

In the end, I've managed to reclaim that space from the datastore's point of view, but there is NO predictable way to do it.

Q:
Do I have to move the VM's disk every time to reclaim that space?

Q:
Is it possible to reclaim that space without moving the VM's disk?
 
Last edited:
Hi,

We are running into exactly the same issues, and asking exactly the same questions, with the only difference in setup being that we use NetApp for NFS. I've come to the same conclusion as you: space reclamation works, but only sometimes, not always.

My test setup:
- test system: Ubuntu 24.04 (default system disk layout, lvm +ext4)
- additional disk inside ubuntu: 100 GB (.qcow2, lvm with ext4)
- VM's SCSI Controller: VirtIO SCSI single
- datastore: Tested on NFS 4.1 and NFS 4.2 , both with nconnect=8 (NetApp with all flash) - preallocation=off
- QEMU Guest Agent: Enabled
- qcow2 option: discard=on, SSD emulation=on, IO thread=on, Async IO=Default

Trying to dig into this to see if there's any explanation...
 
Are you really checking the allocated blocks or just the logical size?

After you trim, please compare the following:

  • ls -l image.qcow2
  • ls -ls image.qcow2
  • stat image.qcow2

Could you post this information after you've run fstrim but before you moved the VM?
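du makes the same comparison in one step: apparent (logical) size versus blocks actually allocated. A quick local demo with a sparse file (illustrative file name, works on any filesystem that supports sparse files):

```shell
# --apparent-size reports the logical length (what ls -l shows); the
# default du output reports what is actually allocated (ls -ls / stat).
truncate -s 1G sparse-demo.img           # 1 GiB logical, ~0 allocated
du -B1 --apparent-size sparse-demo.img   # logical size: 1073741824
du -B1 sparse-demo.img                   # allocated: close to 0
rm -f sparse-demo.img
```

A successfully trimmed thin image should look like this too: large apparent size, much smaller allocation.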
 
Last edited:
Steps done:
  • Write random data to disk using fio up to 40GB
  • Delete the data

Before trimming or moving the vm-disk
Inside the VM
Code:
root@ubuntu2404:~# df -h /dev/mapper/data_vg-data_lv
Filesystem                   Size  Used Avail Use% Mounted on
/dev/mapper/data_vg-data_lv   98G   24K   93G   1% /mnt/data
PVE host:
Code:
root@HK-PVE:/mnt/pve/NFS/images/1337# ls -l vm-1337-disk-1.qcow2
-rw-r----- 1 nobody nogroup 38528811008 Feb 13 11:30 vm-1337-disk-1.qcow2

root@HK-PVE:/mnt/pve/PVE/images/1337# ls -ls vm-1337-disk-1.qcow2
37773872 -rw-r----- 1 nobody nogroup 38528811008 Feb 13 11:30 vm-1337-disk-1.qcow2

root@HK-PVE:/mnt/pve/PVE/images/1337# stat vm-1337-disk-1.qcow2
  File: vm-1337-disk-1.qcow2
  Size: 38528811008     Blocks: 75547744   IO Block: 65536  regular file
Device: 0,53    Inode: 99          Links: 1
Access: (0640/-rw-r-----)  Uid: (65534/  nobody)   Gid: (65534/ nogroup)
Access: 2026-02-13 11:30:11.711233000 +0100
Modify: 2026-02-13 11:30:11.721267000 +0100
Change: 2026-02-13 11:30:11.721267000 +0100
 Birth: 2026-02-13 08:10:46.026584000 +0100

After trimming, before moving the vm-disk
Code:
root@ubuntu2404:~# fstrim -av
/mnt/data: 80.1 GiB (85978292224 bytes) trimmed on /dev/mapper/data_vg-data_lv
/boot/efi: 1 GiB (1118560256 bytes) trimmed on /dev/sda1
/boot: 0 B (0 bytes) trimmed on /dev/sda2
/: 1.4 GiB (1525108736 bytes) trimmed on /dev/mapper/ubuntu--vg-ubuntu--lv

root@HK-PVE:/mnt/pve/NFS/images/1337# ls -l vm-1337-disk-1.qcow2
-rw-r----- 1 nobody nogroup 38528811008 Feb 13 11:30 vm-1337-disk-1.qcow2

root@HK-PVE:/mnt/pve/NFS/images/1337# ls -ls vm-1337-disk-1.qcow2
37773872 -rw-r----- 1 nobody nogroup 38528811008 Feb 13 11:30 vm-1337-disk-1.qcow2

root@HK-PVE:/mnt/pve/NFS/images/1337# stat vm-1337-disk-1.qcow2
  File: vm-1337-disk-1.qcow2
  Size: 38528811008     Blocks: 75547744   IO Block: 65536  regular file
Device: 0,53    Inode: 99          Links: 1
Access: (0640/-rw-r-----)  Uid: (65534/  nobody)   Gid: (65534/ nogroup)
Access: 2026-02-13 11:30:11.711233000 +0100
Modify: 2026-02-13 11:30:11.721267000 +0100
Change: 2026-02-13 11:30:11.721267000 +0100
 Birth: 2026-02-13 08:10:46.026584000 +0100

root@HK-PVE:/mnt/pve/NFS/images/1337# qemu-img info vm-1337-disk-1.qcow2
image: vm-1337-disk-1.qcow2
file format: qcow2
virtual size: 100 GiB (107374182400 bytes)
disk size: 36 GiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false
Child node '/file':
    filename: vm-1337-disk-1.qcow2
    protocol type: file
    file length: 35.9 GiB (38528811008 bytes)
    disk size: 36 GiB
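stat's "Blocks" field counts 512-byte units, so the allocation above can be cross-checked against qemu-img's "disk size" with plain arithmetic:

```shell
# Cross-check: stat Blocks (512-byte units) vs qemu-img "disk size".
blocks=75547744                             # identical before and after fstrim
echo "$((blocks * 512)) bytes"              # 38680444928 bytes
echo "$((blocks * 512 / 1073741824)) GiB"   # 36 GiB -> matches qemu-img
```

Since the block count is byte-for-byte identical before and after the trim, nothing was actually deallocated on the export.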
 
These are the results after trimming, before moving the VM's disk

Code:
root@:/mnt/pve/truenas01-41-test2-01/images/102# ls -l vm-102-disk-1.qcow2
-rw-r----- 1 root nogroup 16108814336 Feb 13 12:54 vm-102-disk-1.qcow2

Code:
root@:/mnt/pve/truenas01-41-test2-01/images/102# ls -ls vm-102-disk-1.qcow2
5125421 -rw-r----- 1 root nogroup 16108814336 Feb 13 12:54 vm-102-disk-1.qcow2

Code:
root@:/mnt/pve/truenas01-41-test2-01/images/102# stat vm-102-disk-1.qcow2
  File: vm-102-disk-1.qcow2
  Size: 16108814336     Blocks: 10250841   IO Block: 1048576 regular file
Device: 0,54    Inode: 259         Links: 1
Access: (0640/-rw-r-----)  Uid: (    0/    root)   Gid: (65534/ nogroup)
Access: 2026-02-12 15:07:20.649823513 +0000
Modify: 2026-02-13 12:54:27.414603007 +0000
Change: 2026-02-13 12:54:27.414603007 +0000
 Birth: 2026-02-12 15:07:20.649823513 +0000

Code:
root@:/mnt/pve/truenas01-41-test2-01/images/102# qemu-img info vm-102-disk-1.qcow2
image: vm-102-disk-1.qcow2
file format: qcow2
virtual size: 15 GiB (16106127360 bytes)
disk size: 4.89 GiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false
Child node '/file':
    filename: vm-102-disk-1.qcow2
    protocol type: file
    file length: 15 GiB (16108814336 bytes)
    disk size: 4.89 GiB

'pvesm status' also doesn't show any changes.

... just for clarity: in a scenario where the space finally is reclaimed, both 'pvesm status' and 'qemu-img info' show the change.
 
There is one more thing I have to write down... while "trim" and space reclamation do eventually work at some point (we just cannot predict when, or what really triggers it), so far I have NOT been able to reclaim that space WITHOUT the QEMU Guest Agent.

Enabling the QEMU Guest Agent on the VM seems to be some kind of prerequisite. I don't think this is written anywhere in the documentation.
 
Last edited:
@AntonJ
I think I've managed to reclaim space without moving VM's disk.

Create a file with dd, delete it, run fstrim, repeat all this a few times, and it works.

Would be nice if you could confirm from your end.

It appears to matter which NFS version (4.1 or 4.2) you place the VM's disk on in the first place, before you power the virtual machine ON (with the Guest Agent enabled). It actually works even without the Guest Agent enabled (from time to time), but for some reason when the Guest Agent is enabled it works every time.

So do it like this...

note: below performed with SSD Emulation "disabled".

1. power OFF virtual machine
2. disable Guest Agent
3. power ON virtual machine with all disks on NFS 4.1
4. power it back OFF

--- at this point the real thing begins

5. move VM's disk (.qcow2 file) onto NFS 4.2
6. power ON virtual machine
7. create file with 'dd if=/dev/urandom of=thick_test_random.img bs=1M count=5000 status=progress'
8. check .qcow file size with qemu-img info <FILE>

--- at this point the .qcow file should grow by ~5 GB

9. remove thick_test_random.img
10. do 'fstrim -av'

--- at this point the .qcow file should shrink by ~5 GB

If the above does not happen:

11. power OFF virtual machine
12. power ON virtual machine
13. do 'fstrim -av'
14. check .qcow file size with qemu-img info <FILE> (few times)

Now...

If the above does not work:

15. power OFF virtual machine
16. enable Guest Agent
17. power ON virtual machine
18. create file with 'dd if=/dev/urandom of=thick_test_random.img bs=1M count=5000 status=progress'
19. remove thick_test_random.img
20. do 'fstrim -av'
21. check .qcow file size with qemu-img info <FILE> (few times)

-- at this point, after enabling the Guest Agent, 'fstrim -av' works every time, even after power cycling the VM.

note: once you move the disk from NFS 4.2 to NFS 4.1 online, the above stops working.

Code:
root@:/mnt/pve/truenas01-42-test3-01/images/103# qemu-img info vm-103-disk-0.qcow2
image: vm-103-disk-0.qcow2
file format: qcow2
virtual size: 10 GiB (10737418240 bytes)
disk size: 2.8 GiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false
Child node '/file':
    filename: vm-103-disk-0.qcow2
    protocol type: file
    file length: 10 GiB (10739318784 bytes)
    disk size: 2.8 GiB

root@:/mnt/pve/truenas01-42-test3-01/images/103# qemu-img info vm-103-disk-0.qcow2
image: vm-103-disk-0.qcow2
file format: qcow2
virtual size: 10 GiB (10737418240 bytes)
disk size: 1.07 MiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false
Child node '/file':
    filename: vm-103-disk-0.qcow2
    protocol type: file
    file length: 10 GiB (10739318784 bytes)
    disk size: 1.07 MiB
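The repeated "check .qcow file size" steps can be scripted: qemu-img info also emits JSON, where the allocation shows up as "actual-size". A sketch of the extraction, using an abridged sample of that JSON (on the PVE host you would feed it the real image instead):

```shell
# "actual-size" in the JSON output is the allocated size on disk,
# "virtual-size" the guest-visible capacity. Sample JSON is abridged.
sample='{"virtual-size": 10737418240, "actual-size": 1122304, "format": "qcow2"}'
echo "$sample" | grep -o '"actual-size": *[0-9]*'
# on the PVE host:
#   qemu-img info --output=json vm-103-disk-0.qcow2 | grep -o '"actual-size": *[0-9]*'
```

Comparing that one number before and after fstrim is enough to see whether a trim landed.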
 
Last edited:
I replicated it using the same steps and set it up with SSD emulation=off; however, I did not get the same result.

  • Steps 1–8: .qcow grows +5GB
  • Steps 9–10: .qcow does not shrink
  • Steps 11–14: .qcow does not shrink
  • Steps 15–18: .qcow grows by an additional +5GB
  • Steps 19–21: .qcow does not shrink
Power cycling after this and then issuing fstrim -av doesn't work either.

However, after running all these steps, I migrated the .qcow to another NFS volume, also mounted with 4.2. This resulted in the disk shrinking as it's supposed to.

I also repeated all the steps a second time but skipped moving the disk to NFS mounted with 4.2 and ran everything on NFS mounted with 4.1. This yielded the exact same result: the .qcow disk did not shrink when issuing fstrim -av, but then actually shrank when it was moved to another volume (moved to another NFS 4.1).

I then did everything once again, this time with SSD emulation=on, and got the exact same result as above. I will try to do some more tests tomorrow and might try on NFS3 as well.
 
Last edited:
I've done some more testing, and reading, but still haven't gotten it to work.

I ran everything once again, but this time tested NFSv4.1 with preallocation=metadata and preallocation=off, and NFSv4.2 with preallocation=metadata and preallocation=off. All yield the same result as above. In no case whatsoever am I able to get the vmdisk.qcow2 to shrink after it has grown. The only thing that works is a storage migration to another volume, which shrinks it back to what is actually being used.

Reading up on what discard actually does and piecing together a bunch of stuff, I think the issue might be the NFS server, but I don't yet know how to confirm this. I will verify with our vendor what should actually happen.

This part of the documentation, even if it's in the backup section, mentions it only working on 4.2: pve-docs.
There are also these discussions mentioning the same: discard-over-nfs-not-working and TRIM over iSCSI/NFS. And the RFC part in question that needs to be supported: rfc7862

Will keep testing to see if I can figure anything out, although having the workaround be a disk migration once in a while ain't too bad. But this "should just work™" =)
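One way to probe the NFS-server theory directly (my own sketch, not from any vendor doc): try punching a hole in a scratch file on the export. The Linux NFS client maps FALLOC_FL_PUNCH_HOLE to an NFSv4.2 DEALLOCATE, so a refusal here points at the mount or the server. The path handling is an assumption; pass the mount as the first argument (defaults to /tmp for a local dry run):

```shell
# Hole-punch probe: if fallocate fails on a file on the export, the
# mount/server cannot DEALLOCATE and guest discards have nowhere to go.
# Pass the NFS mount path as $1; /tmp is only a local fallback.
cd "${1:-/tmp}" || exit 1
dd if=/dev/zero of=punch-probe.bin bs=1M count=8 status=none
echo "blocks before: $(stat -c %b punch-probe.bin)"
if fallocate --punch-hole --keep-size --offset 0 --length 4MiB punch-probe.bin; then
    echo "blocks after:  $(stat -c %b punch-probe.bin)"
else
    echo "punch hole rejected: no DEALLOCATE support on this mount?"
fi
rm -f punch-probe.bin
```

If the block count doesn't drop (or the call is rejected), qcow2 discards over that mount can't shrink the file either, regardless of guest settings.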
 
For my part this explains everything. It's not implemented on the NetApp side yet. And NFS 4.2 is needed, afaik.

https://kb.netapp.com/on-prem/ontap/da/NAS/NAS-Issues/CONTAP-178914
 
Well... if you still have some spare time, just try this: power OFF the VM, migrate all disks to an NFS 4.2 datastore, power it back ON. Create a file, delete it, run 'fstrim -av' and check qemu-img info.

Space reclamation works every time without migrating the .qcow2 files to a different datastore.

Actually, it works until you migrate the disks to a different datastore. It doesn't matter whether that is NFS 4.1 or NFS 4.2; after migration, it stops working (until you power the VM OFF and back ON).

Not sure if there is a way to explain this; perhaps the Proxmox devs could explain why, after migration to a different datastore, trim effectively stops working and the size shown by qemu-img info doesn't change.
 
Last edited:
Right...

Before moving the disk (with 'qm move-disk'),

Code:
qm monitor <VMID>
('info block')

shows that specific disk uses '(...) "driver": "qcow2" (...)'

after running 'qm move-disk' it shows: '(...) "driver": "zeroinit" (...)'...

...which I believe is a "problem".

Powering the VM OFF and back ON clears the situation and the driver goes back to "qcow2".

Is this expected?

It feels like it should switch back to "qcow2" after 'qm move-disk' completes?