Proxmox file-based shared storage questions

A.C.

New Member
May 30, 2025
Hello community,

Just starting to play around with Proxmox VE with various configurations, including file-based shared storage (CIFS and NFS), and I've got some problems with this kind of setup.

My lab is built as follows:

a three-node Proxmox VE 8.4.1 hyperconverged cluster, where every node is connected to a single 10 GbE switch with a dual 10 GbE NIC channel bond

a file-based storage system (NetApp AFF A200) connected to the same switch via a 10 GbE channel bond, with CIFS & NFS shares exposed to the Proxmox cluster

I did some testing on this setup and got the following results:

  • read/write I/O performance is good on both CIFS and NFS, but better on CIFS, with less load on the storage CPU (see the fio sketch after this list)
  • write I/O performance while a VM is under snapshot is bad on NFS and good on CIFS, with IOPS dropping sharply on NFS (but the NetApp storage still shows high I/O activity)
  • snapshot removal takes a long time with NFS, with the VM frozen and inaccessible from both console & network, while everything is much smoother and faster with CIFS
  • cloning a Ceph-backed or locally hosted VM to the NFS storage works well, while the same VM cloned to the CIFS share ends up as a corrupted QCOW2 file (I/O error accessing the file even from the console)
  • restoring a VM backup to both CIFS and NFS works as expected
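
To make that comparison reproducible, here is a minimal fio sketch run directly against test files on the mounts (paths and sizes are placeholders to adjust; --direct=1 on the CIFS mount may require it to be mounted with cache=none):

fio --name=seqwrite --rw=write --bs=1M --size=4G --ioengine=libaio --direct=1 --filename=/mnt/pve/NFS_vol01/fio-test.bin
fio --name=randwrite --rw=randwrite --bs=4k --size=1G --ioengine=libaio --direct=1 --filename=/mnt/pve/NFS_vol01/fio-rand.bin
rm /mnt/pve/NFS_vol01/fio-test.bin /mnt/pve/NFS_vol01/fio-rand.bin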

I've tried to change the preallocation policy for qcow2 files, but both "full" options are unusable, as QCOW2 creation ends with errors on both CIFS and NFS.
For NFS I tried both the V3 and V4.x protocol versions with the same results.
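
For reference, a sketch of dialing the policy back per storage instead of per disk, assuming the installed PVE version exposes the preallocation storage option (NFS_vol01 is the storage name that appears in the log further down):

pvesm set NFS_vol01 --preallocation metadata   # accepted values: off, metadata, falloc, full

The same option can also be set in /etc/pve/storage.cfg for the CIFS storage.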

Are these behaviours expected on Proxmox VE with this kind of setup? They seem a bit strange to me, as NFS should be a better option on UNIX systems than CIFS.

Kind regards

Alberto
 
> Read/write I/O performance is good on both CIFS and NFS, but better on CIFS, with less load on the storage CPU.
On the PVE host (NFS client): echo 8192 > /sys/class/bdi/$(mountpoint -d /mnt/<netapp_mount>)/read_ahead_kb   # replace with your mount path!

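A quick check that the value stuck, with the same path substitution as above; note the bdi setting does not survive a remount or reboot, so it has to be reapplied:
cat /sys/class/bdi/$(mountpoint -d /mnt/<netapp_mount>)/read_ahead_kb
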
> Write I/O performance while a VM is under snapshot is bad on NFS and good on CIFS, with IOPS dropping sharply on NFS (but the NetApp storage still shows high I/O activity).
NetApp snapshots (which take about two seconds and would also allow raw files), or qcow2 snapshots?

> Snapshot removal takes a long time with NFS, with the VM frozen and inaccessible from both console & network, while everything is much smoother and faster with CIFS.
Assuming qcow2, since NetApp snapshot removal cannot be triggered this way from PVE and is normally handled automatically by a configured schedule.

> I've tried to change the preallocation policy for qcow2 files, but both "full" options are unusable, as QCOW2 creation ends with errors on both CIFS and NFS. For NFS I tried both the V3 and V4.x protocol versions with the same results.
Does creating any new VM disk (OS or additional disk) end with errors?

> Are these behaviours expected on Proxmox VE with this kind of setup? They seem a bit strange to me, as NFS should be a better option on UNIX systems than CIFS.
Did you use the "unix-style" or "nt-style" security style? NFS is definitely faster, but it needs the read_ahead setting on the NFS client (= PVE).
 
And do the NFS mount with the option nconnect=2 or 4 for the bonded 2x 10 Gbit network!
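
A sketch of how that could look in /etc/pve/storage.cfg (the NFS_vol01 name and path match the log further down; the server address and export are placeholders, and the share has to be remounted for new mount options to take effect):

nfs: NFS_vol01
        server 192.0.2.10
        export /vol01
        path /mnt/pve/NFS_vol01
        content images
        options vers=4.2,nconnect=4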
 
some further info:

The security style is unix for the NFS volume and Windows for the CIFS volume.

I always refer to qcow2 snapshots, not ONTAP snapshots.

The slowness related to qcow2 snapshots seems tied to Iometer usage inside the VM: creating a snapshot during high write activity seems to screw up something inside the qcow2 file, and subsequent snapshots take a long time to create/remove even with no I/O inside the VM. The same behaviour doesn't occur with normal VM disk activity like copying files or disk benchmark tools like CrystalDiskMark (I created and then removed a few snapshots during write activity).
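
One way to see whether the qcow2 itself ended up in a bad state would be to inspect it with the VM shut down (a sketch; the path mirrors the clone log below, substitute the affected VMID and disk):

qemu-img info /mnt/pve/NFS_vol01/images/<vmid>/vm-<vmid>-disk-0.qcow2
qemu-img snapshot -l /mnt/pve/NFS_vol01/images/<vmid>/vm-<vmid>-disk-0.qcow2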

With the Ceph backend the problem doesn't occur, as snapshot creation/removal is quick even under high I/O.

Tuning the NFS parameters as indicated improved performance a bit, but not significantly.

Regarding the errors when cloning VMs within the NFS datastore with full preallocation, this is what I get:

Formatting '/mnt/pve/NFS_vol01/images/123/vm-123-disk-0.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off preallocation=full compression_type=zlib size=42949672960 lazy_refcounts=off refcount_bits=16
TASK ERROR: clone failed: unable to create image: 'storage-NFS_vol01'-locked command timed out - aborting
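
That lock timeout may simply mean the full preallocation itself takes longer than PVE is willing to wait on this share; as a sketch, it can be timed outside of PVE with the same parameters (hypothetical test file, removed afterwards):

time qemu-img create -f qcow2 -o preallocation=full /mnt/pve/NFS_vol01/images/123/prealloc-test.qcow2 40G
rm /mnt/pve/NFS_vol01/images/123/prealloc-test.qcow2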

For CIFS the clone process completes with no error, but the qcow2 is corrupted and cannot be accessed (I/O error).
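
Running qemu-img check on the corrupted image might narrow that down (a sketch; the CIFS storage name is a placeholder, substitute the real mount):

qemu-img check /mnt/pve/<cifs_storage>/images/123/vm-123-disk-0.qcow2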
 
We do PVE NFS image reflinks directly on a Rocky 9 hardware-RAID6 XFS box, with VM/LXC freeze/unfreeze, independent of qcow2, raw or TPM disks, and the same for clones, so everything finishes within a handful of seconds and we have never had issues. A NetApp is not a normal Linux distro with a shell ...
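
For illustration, a reflink copy on an XFS filesystem created with reflink support boils down to a single command (hypothetical file names):

cp --reflink=always vm-100-disk-0.qcow2 vm-100-disk-0-clone.qcow2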
 