Low Performance on VMs stored over NFS

keroex

Member
Dec 1, 2021
Hi guys, I have a NAS connected to my Proxmox 7.3 cluster, which I use to store the VMs' disks. If I test the read/write speed on the nodes I get around 500-800MB/s, but if I test the speed inside the VMs it's at 50-100MB/s.
The tests are carried out on both Windows and Linux VMs with the same results.
I have seen this topic in several posts but without an apparent solution.
Does anyone know of a possible solution?

Thanks a lot
Regards
Federico
 
Don't forget that you are running a copy-on-write image format on top of your NFS share, as PVE uses qcow2 files to store virtual disks on it. Copy-on-write always adds significant overhead.

It would be useful if you could share the config file of the VM you are testing with.
Also, which tool are you using for the benchmark, and with what settings?

iSCSI might be faster, but I guess you need that NFS share for HA?
 
I'm benchmarking with CrystalDiskMark on Windows and with dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync on Linux.

I don't have iSCSI available right now, but I'm trying to get storage over FC to test with.

Windows VM config file:

agent: 1
bios: ovmf
boot: order=ide0
cores: 18
efidisk0: pve-nfs-boot:101/vm-101-disk-0.qcow2,efitype=4m,pre-enrolled-keys=1,size=528K
ide0: pve-nfs-boot:101/vm-101-disk-1.qcow2,size=100G
ide2: none,media=cdrom
machine: pc-i440fx-7.1
memory: 14096
meta: creation-qemu=7.1.0,ctime=1670850435
name: w2k22-secureos-prod-01
net0: e1000=6E:B1:7E:49:56:02,bridge=vmbr0,firewall=1,tag=770
numa: 0
ostype: win11
smbios1: uuid=877ac36d-1bb6-4024-89ec-9af69d86987d
sockets: 1
tpmstate0: pve-nfs-boot:101/vm-101-disk-0.raw,size=17K,version=v2.0
vmgenid: 1a99adb7-0279-4aab-a81d-ac217f998e6e

Linux VM config file:
agent: 1
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 16
efidisk0: pve-nfs-boot:104/vm-104-disk-0.qcow2,efitype=4m,pre-enrolled-keys=1,size=528K
ide2: none,media=cdrom
memory: 20480
meta: creation-qemu=7.1.0,ctime=1670592441
name: PruebaMint
net0: virtio=2E:B4:6C:EE:99:10,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: pve-nfs-boot:104/vm-104-disk-1.qcow2,iothread=1,size=32G
scsihw: virtio-scsi-single
smbios1: uuid=8f1f6ce2-c033-4e7e-a2c5-7574173dfd89
sockets: 1
tpmstate0: pve-nfs-boot:104/vm-104-disk-0.raw,size=4M,version=v2.0
vga: virtio
vmgenid: da13d7be-4b78-4b89-ab24-c9ef0fcc6189
 
First, have a look at the Windows best practices: https://pve.proxmox.com/wiki/Windows_10_guest_best_practices
You should install the QEMU guest agent and the virtio drivers in your Windows VMs.
This would allow you to set up a Windows VM with the much faster virtio SCSI virtual disk controller (instead of IDE) and virtio NIC (instead of E1000).

Also, you can't compare benchmarks done with different software. CrystalDiskMark in particular is basically useless here, as it mostly benchmarks your RAM because of caching. Try using fio to benchmark both inside the Windows VM and on the PVE host, with the same settings, so the results are comparable.
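
To make "the same settings" concrete, here is a rough sketch (not a recommended job, just one way to keep the two runs aligned) that builds both fio command lines from one shared set of options, so only the I/O engine and the target file differ. The option values are a typical 4k random-write job; --direct=1 and the file name fio-test.bin are assumptions added here so that the page cache is bypassed on both sides and both runs hit a known file.

```python
import shlex

# Options shared by the host run and the guest run.
# "direct": "1" is an assumption: it asks fio for unbuffered I/O so that
# neither side mostly measures RAM.
COMMON = {
    "name": "random-write",
    "rw": "randwrite",
    "bs": "4k",
    "numjobs": "1",
    "size": "4g",
    "iodepth": "1",
    "runtime": "60",
    "end_fsync": "1",
    "direct": "1",
}


def fio_command(ioengine, filename):
    """Return the fio argument list for one side of the comparison."""
    args = ["fio", f"--ioengine={ioengine}", f"--filename={filename}", "--time_based"]
    args += [f"--{key}={value}" for key, value in COMMON.items()]
    return args


# Hypothetical test file names; the host path is the NFS mount point of the storage.
host = fio_command("posixaio", "/mnt/pve/pve-nfs-boot/fio-test.bin")
guest = fio_command("windowsaio", "fio-test.bin")

print(shlex.join(host))   # run this on the PVE node
print(shlex.join(guest))  # run this inside the Windows VM
```

Remember to delete the test file afterwards, and keep in mind that an unbuffered run on the host will probably report much lower numbers than a buffered one.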
 
OK, thanks a lot. I installed a new VM following those best practices.
These are the fio results on the Proxmox host:

root@pve-gvip02:~# cd /mnt/pve/pve-nfs-boot/
root@pve-gvip02:/mnt/pve/pve-nfs-boot# fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k --numjobs=1 --size=4g --iodepth=1 --runtime=60 --time_based --end_fsync=1
random-write: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
fio-3.25
Starting 1 process
random-write: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [F(1)][100.0%][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=141625: Tue Dec 13 16:06:24 2022
write: IOPS=27.1k, BW=106MiB/s (111MB/s)(12.0GiB/116199msec); 0 zone resets
slat (nsec): min=449, max=303908, avg=1027.25, stdev=608.02
clat (nsec): min=165, max=65185M, avg=31101.72, stdev=36908712.34
lat (usec): min=5, max=65185k, avg=32.13, stdev=36908.74
clat percentiles (nsec):
| 1.00th=[ 5408], 5.00th=[ 5664], 10.00th=[ 5856], 20.00th=[ 6048],
| 30.00th=[ 6176], 40.00th=[ 6368], 50.00th=[ 6496], 60.00th=[ 6624],
| 70.00th=[ 6752], 80.00th=[ 7072], 90.00th=[ 7520], 95.00th=[ 7968],
| 99.00th=[13504], 99.50th=[19584], 99.90th=[23424], 99.95th=[24192],
| 99.99th=[27008]
bw ( KiB/s): min= 8, max=548848, per=100.00%, avg=426538.98, stdev=163550.33, samples=59
iops : min= 2, max=137212, avg=106634.85, stdev=40887.62, samples=59
lat (nsec) : 250=0.01%
lat (usec) : 2=0.01%, 10=98.60%, 20=0.93%, 50=0.47%, 100=0.01%
lat (usec) : 250=0.01%
lat (msec) : 2=0.01%, 4=0.01%, 20=0.01%, 100=0.01%, 250=0.01%
lat (msec) : 500=0.01%, 2000=0.01%, >=2000=0.01%
cpu : usr=3.36%, sys=11.47%, ctx=3145819, majf=0, minf=435
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,3145729,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
WRITE: bw=106MiB/s (111MB/s), 106MiB/s-106MiB/s (111MB/s-111MB/s), io=12.0GiB (12.9GB), run=116199-116199msec


And these are the fio results in the new Windows VM:

fio --name=random-write --ioengine=windowsaio --rw=randwrite --bs=4k --numjobs=1 --size=4g --iodepth=1 --runtime=60 --time_based --end_fsync=1
fio: this platform does not support process shared mutexes, forcing use of threads. Use the 'thread' option to get rid of this warning.
random-write: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=windowsaio, iodepth=1
fio-3.33
Starting 1 thread
random-write: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [F(1)][100.0%][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=7896: Tue Dec 13 18:59:03 2022
write: IOPS=3958, BW=15.5MiB/s (16.2MB/s)(3015MiB/194943msec); 0 zone resets
slat (usec): min=10, max=2127, avg=23.18, stdev= 8.56
clat (usec): min=3, max=227897, avg=38.62, stdev=271.44
lat (usec): min=13, max=228167, avg=61.80, stdev=271.99
clat percentiles (usec):
| 1.00th=[ 7], 5.00th=[ 19], 10.00th=[ 22], 20.00th=[ 25],
| 30.00th=[ 27], 40.00th=[ 29], 50.00th=[ 30], 60.00th=[ 34],
| 70.00th=[ 51], 80.00th=[ 58], 90.00th=[ 63], 95.00th=[ 71],
| 99.00th=[ 90], 99.50th=[ 98], 99.90th=[ 117], 99.95th=[ 129],
| 99.99th=[ 277]
bw ( KiB/s): min=16912, max=79778, per=100.00%, avg=51515.45, stdev=9674.82, samples=119
iops : min= 4228, max=19944, avg=12878.57, stdev=2418.68, samples=119
lat (usec) : 4=0.01%, 10=2.04%, 20=4.52%, 50=63.09%, 100=29.93%
lat (usec) : 250=0.41%, 500=0.01%, 750=0.01%, 1000=0.01%
lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
lat (msec) : 250=0.01%
cpu : usr=1.54%, sys=21.03%, ctx=0, majf=0, minf=0
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,771776,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
WRITE: bw=15.5MiB/s (16.2MB/s), 15.5MiB/s-15.5MiB/s (16.2MB/s-16.2MB/s), io=3015MiB (3161MB), run=194943-194943msec

Any thoughts or ideas?
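
A note on reading the two summaries above: both runs lasted well beyond the requested 60 seconds (about 116 s on the host and about 195 s in the VM), so the headline bandwidth is averaged over the whole run, presumably including stalls and the final fsync. The small sketch below only re-derives the numbers from figures already printed in the outputs (issued writes, run time, and the average on the `bw ( KiB/s)` line); it does not add any new measurement.

```python
# Figures copied from the two fio outputs above.
BLOCK_SIZE = 4096  # bytes, from --bs=4k

runs = {
    "PVE host (NFS mount)": {"writes": 3145729, "run_ms": 116199, "bw_avg_kib_s": 426538.98},
    "Windows VM": {"writes": 771776, "run_ms": 194943, "bw_avg_kib_s": 51515.45},
}


def overall_mib_s(writes, run_ms):
    """Total data written divided by the total wall-clock run time."""
    total_mib = writes * BLOCK_SIZE / 2**20
    return total_mib / (run_ms / 1000)


for name, r in runs.items():
    overall = overall_mib_s(r["writes"], r["run_ms"])
    sampled = r["bw_avg_kib_s"] / 1024  # average of fio's periodic bw samples
    print(f"{name}: overall {overall:.1f} MiB/s over {r['run_ms'] / 1000:.0f} s, "
          f"sampled average {sampled:.1f} MiB/s")
```

The overall values match the `WRITE:` lines, while the sampled averages are several times higher on both sides. Since neither run used unbuffered I/O, the host figure in particular may reflect the page cache more than the NFS share itself.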
 
