PVE 6.3-3: discard not working

SaymonDzen

VMs with Debian 10 or Ubuntu 20.04 are configured according to the instructions at https://pve.proxmox.com/wiki/Shrink_Qcow2_Disk_Files#Linux_Guest_Configuration
fstrim -av reports that it successfully released about 10 GiB.
But the qcow2 image size does not change (the VM filesystem uses 7.7 G, while the qcow2 file is 20 G).
After a poweroff or reset of the VM, fstrim again reports that it successfully released about 10 GiB.
I tried migrating to other storages, because the original storage is NTFS. But after the migration it only gets worse: the image grows to its maximum size and discard still doesn't work. The other storages are ext4 over NFS and the default local-lvm.
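(For reference: whether the virtual disk advertises discard support at all can be checked from inside the guest, for example like this; the device name /dev/sda is just an assumption and may differ.)

Code:
# non-zero discard values here mean the virtual disk advertises discard support
lsblk --discard /dev/sda
cat /sys/block/sda/queue/discard_granularity /sys/block/sda/queue/discard_max_bytes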
 
Hi,
how exactly did you check the size of the qcow2 image? ls will show the full size even if it's sparse. You need to use du or qemu-img info to get the actually used size.
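For example, with a hypothetical image path:

Code:
# file length as seen by the filesystem (sparse regions are counted)
ls -lh /path/to/vm-100-disk-0.qcow2
# blocks actually allocated on the host filesystem
du -h /path/to/vm-100-disk-0.qcow2
# "disk size" is the allocated size, "virtual size" the provisioned size
qemu-img info /path/to/vm-100-disk-0.qcow2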
 
Hi!
I use du and ncdu. If du does not report it correctly, then why isn't the image immediately at its maximum size, as it is after moving it?
Will monitoring systems like Zabbix correctly determine the available disk space?
 
fstrim -va reports the potential discard amount, even if that space is already free on the host.
From the man page:
Verbose execution. With this option fstrim will output the number of bytes passed from the filesystem down the block stack to the device for potential discard. This number is a maximum discard amount from the storage device's perspective, because FITRIM ioctl called repeated will keep sending the same sectors for discard repeatedly.

fstrim will report the same potential discard bytes each time, but only sectors which had been written to between the discards would actually be discarded by the storage device. Further, the kernel block layer reserves the right to adjust the discard ranges to fit raid stripe geometry, non-trim capable devices in a LVM setup, etc. These reductions would not be reflected in fstrim_range.len (the --length option).
The value doesn't mean that it successfully released that much space.

Just making sure: do you meet the requirements mentioned in the wiki? I.e., discard enabled on the virtual disk, scsi disk, etc.

What does qemu-img info /path/to/vm-ID-disk-N.qcow2 show before and after running the fstrim command?
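A rough way to check that (VM ID 100 and the volume name are just placeholders):

Code:
# show the disk lines of the VM config
qm config 100 | grep -E 'scsi|virtio|sata|ide'
# the relevant disk entry should contain ",discard=on", e.g. (hypothetical volume):
#   scsi0: ssd:100/vm-100-disk-0.qcow2,discard=on,size=32G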
 
VM config: [attachment conf.PNG]
Before:
du -h /mnt/windows/virt/images/100/vm-100-disk-0.qcow2
6,1G /mnt/windows/virt/images/100/vm-100-disk-0.qcow2

qemu-img info /mnt/windows/virt/images/100/vm-100-disk-0.qcow2

image: /mnt/windows/virt/images/100/vm-100-disk-0.qcow2
file format: qcow2
virtual size: 32 GiB (34359738368 bytes)
disk size: 6.05 GiB
cluster_size: 65536
Format specific information:
compat: 1.1
compression type: zlib
lazy refcounts: false
refcount bits: 16
corrupt: false

Added 2 big files:
root@elemetary:~# df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 32G 11G 20G 34% /

root@pve0:~# du -h /mnt/windows/virt/images/100/vm-100-disk-0.qcow2
9,3G /mnt/windows/virt/images/100/vm-100-disk-0.qcow2

root@pve0:~# qemu-img info /mnt/windows/virt/images/100/vm-100-disk-0.qcow2
image: /mnt/windows/virt/images/100/vm-100-disk-0.qcow2
file format: qcow2
virtual size: 32 GiB (34359738368 bytes)
disk size: 9.25 GiB
cluster_size: 65536
Format specific information:
compat: 1.1
compression type: zlib
lazy refcounts: false
refcount bits: 16
corrupt: false

Removed the files:
root@elemetary:~# df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 32G 6,5G 24G 22% /

root@elemetary:~# fstrim -av
/: 4,9 GiB (5193756672 bytes) trimmed

After trim:
root@pve0:~# du -h /mnt/windows/virt/images/100/vm-100-disk-0.qcow2
9,3G /mnt/windows/virt/images/100/vm-100-disk-0.qcow2
root@pve0:~# qemu-img info /mnt/windows/virt/images/100/vm-100-disk-0.qcow2
image: /mnt/windows/virt/images/100/vm-100-disk-0.qcow2
file format: qcow2
virtual size: 32 GiB (34359738368 bytes)
disk size: 9.25 GiB
cluster_size: 65536
Format specific information:
compat: 1.1
compression type: zlib
lazy refcounts: false
refcount bits: 16
corrupt: false

The image size did not change after the trim!
 
I'm not able to reproduce this here. Could you share the output of pveversion -v? What kernel is running in the VM? Did discard work in the past with the same configuration, and if so, with what version?
 
root@pve0:~# pveversion -v
proxmox-ve: 6.3-1 (running kernel: 5.4.78-2-pve)
pve-manager: 6.3-3 (running version: 6.3-3/eee5f901)
pve-kernel-5.4: 6.3-3
pve-kernel-helper: 6.3-3
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.65-1-pve: 5.4.65-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.7
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-2
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-3
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.6-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-2
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-7
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-2
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1

VM guest: Debian 10, kernel 4.19.0-13-amd64 #1 SMP Debian 4.19.160-2 (2020-11-28) x86_64

The problem reproduced on these storages:
ntfs disk (mounted in fstab: UUID=2C46EEE846EEB1AE /mnt/windows ntfs-3g rw,uid=1000,gid=1000,dmask=0002,fmask=0003 0 0)
ext4 over NFS 4.0, mounted via the PVE web GUI
The problem did not reproduce on:
ext4 (/var/lib/vz)
lvmthin (the default storage); qemu-img info does not show the actual allocation there, but lvs showed Data% decreasing after fstrim (see the command sketch below)
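For the lvmthin case, the allocation can be watched on the host with lvs, for example (volume group pve and thin pool data as in the default installation):

Code:
# thin pool usage; run before and after fstrim in the guest
lvs -o lv_name,data_percent pve/data
# per-volume allocation of all thin volumes in the pool
lvs -o lv_name,pool_lv,data_percent pve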

OFF-TOPIC
Why does the PVE web GUI mount NFS with vers=3 when 4 is specified in the settings?
mount shows: storage.poklad.xyz:/nfs/proxmox on /mnt/pve/storage type nfs (rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.0.102,mountvers=3,mountport=892,mountproto=udp,local_lock=none,addr=192.168.0.102)

[attachment nfs.PNG]
 
The problem reproduced on these storages:
ntfs disk (mounted in fstab: UUID=2C46EEE846EEB1AE /mnt/windows ntfs-3g rw,uid=1000,gid=1000,dmask=0002,fmask=0003 0 0)
ext4 over NFS 4.0, mounted via the PVE web GUI
The problem did not reproduce on:
ext4 (/var/lib/vz)
lvmthin (the default storage); qemu-img info does not show the actual allocation there, but lvs showed Data% decreasing after fstrim
Ok, it turns out that NFS supports the needed "hole-punching" for sparse files (in this case the qcow2 file) only since version 4.2. And maybe the NTFS driver does not support it yet (I quickly tried and couldn't get it to work there either).
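Whether a given mount supports hole-punching can be tested directly, roughly like this (the file name punchtest is arbitrary; run it on the mount in question, e.g. /mnt/pve/storage or /mnt/windows):

Code:
cd /mnt/pve/storage
dd if=/dev/urandom of=punchtest bs=1M count=64
du -h punchtest      # ~64M allocated
fallocate --punch-hole --offset 0 --length 32M punchtest
du -h punchtest      # should drop to ~32M if the filesystem supports hole-punching
rm punchtest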

OFF-TOPIC
Why does the PVE web GUI mount NFS with vers=3 when 4 is specified in the settings?
mount shows: storage.poklad.xyz:/nfs/proxmox on /mnt/pve/storage type nfs (rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.0.102,mountvers=3,mountport=892,mountproto=udp,local_lock=none,addr=192.168.0.102)

So this is actually not off-topic at all, and you would need to use 4.2 instead of 4 ;)
Did you unmount the storage since changing the version? The change only takes effect on the next mount (PVE should re-mount it automatically after a few seconds).
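One way to do that (storage name taken from your mount output; vers=4.2 only works if the server offers it):

Code:
# set the NFS version in the storage definition
pvesm set storage --options vers=4.2
# force a re-mount; PVE mounts it again automatically after a few seconds
umount /mnt/pve/storage
# verify the negotiated version
mount | grep /mnt/pve/storage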
 
To my regret, updating the NFS server to version 4.2 is not a trivial task, because it is a SOHO NAS whose software support has expired, and only version 4 is available.
I recreated the connection and rebooted Proxmox, but it still connects with vers=3. When mounted via fstab with the defaults option, it connects with version 4.
 
Could you post the configuration for the NFS storage from /etc/pve/storage.cfg and the /etc/fstab entry?
 
Could you post the configuration for the NFS storage from /etc/pve/storage.cfg and the /etc/fstab entry?
cat /etc/pve/storage.cfg
dir: local
        path /var/lib/vz
        content backup,vztmpl,iso,images
        shared 0

lvmthin: local-lvm
        thinpool data
        vgname pve
        content images,rootdir

dir: ssd
        path /mnt/windows/virt
        content images
        shared 0

nfs: storage
        export /nfs/proxmox
        path /mnt/pve/storage
        server storage.poklad.xyz
        content images
        options vers=4
        prune-backups keep-all=1

cat /etc/fstab
# <file system> <mount point> <type> <options> <dump> <pass>
/dev/pve/root / ext4 lazytime,errors=remount-ro 0 1
/dev/pve/swap none swap sw 0 0
proc /proc proc defaults 0 0
UUID=2C46EEE846EEB1AE /mnt/windows ntfs-3g rw,uid=1000,gid=1000,dmask=0002,fmask=0003 0 0
storage.local:/Share /mnt/share nfs defaults 0 0

mount | grep storage
storage.local:/Share on /mnt/share type nfs4 (rw,relatime,vers=4.0,rsize=32768,wsize=32768,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.106,local_lock=none,addr=192.168.0.102)
storage.poklad.xyz:/nfs/proxmox on /mnt/pve/storage type nfs (rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.0.102,mountvers=3,mountport=892,mountproto=udp,local_lock=none,addr=192.168.0.102)

storage.poklad.xyz == storage.local
 
Could you try manually executing
Code:
mount -t nfs <server>:/nfs/proxmox /some/other/mountpoint -o vers=4
mount -t nfs <server>:/nfs/proxmox /some/other/other/mountpoint
and check the mount options afterwards?
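To inspect the effective options of all NFS mounts afterwards (findmnt is part of util-linux):

Code:
# list all NFS mounts with their negotiated options
findmnt -t nfs,nfs4 -o TARGET,SOURCE,OPTIONS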
 
Could you try manually executing
Code:
mount -t nfs <server>:/nfs/proxmox /some/other/mountpoint -o vers=4
mount -t nfs <server>:/nfs/proxmox /some/other/other/mountpoint
and check the mount options afterwards?
o_O It looks like this is a quirk of the NFS implementation on the SOHO NAS:

Mounting <server>:/nfs/proxmox results in vers=3, regardless of the options.
Mounting <server>:/proxmox results in vers=4.
 
Sorry, I don't know why it behaves like that. The vers=4 mount might support the feature, but you'd have to test it to be sure. The feature you need is "hole-punching" for sparse files, because that is what reclaims the space from the qcow2 file.
 
Hello guys,

I can confirm the problem with .qcow2 on NFS v3, v4, and v4.1 between Proxmox v6.2 and Synology NAS v6.4.
Space discarded by the guest is not released from the .qcow2 file on the NFS storage (confirmed with the qemu-img info command).

Let's hope Synology implements NFS v4.2 soon!
 
Hi,
FreeBSD 13 came out with NFS 4.2, but discard still doesn't work. I tested yesterday on an OviOS system and fstrim works there (NFS 4.2), while on FreeBSD with NFS 4.2 it does not. Proxmox mounts it correctly with version 4.2.
I was hoping that, among other things, support for handling discard would have made it in.

Or maybe I'm missing something; is there a magic switch in FreeBSD that enables it?
On OviOS it works out of the box.

Anyone have experience with NFS 4.2 and discard on FreeBSD 13?
 
On FreeBSD with version 4.2 it still does not work, unfortunately.
I tested once on OviOS and the functionality works there.

link: ovios
 
