Poor performance over NFS

skandar77

Jun 10, 2022
Hello all,

We have the following Proxmox setup:

One storage server: HP DL380p G8, CPU Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz, 32GB RAM
Storage pool of 8x Samsung SSD 870 QVO 1TB SSDs, arranged in ZFS as four mirrors of 2 drives each, with the mirrors striped together

OS: OpenMediaVault 6.0.28-3 (Shaitan), Debian-based
Kernel: Linux 5.16.0-0.bpo.4-amd64
NFS: nfs-kernel-server 1:1.3.4-6 amd64, version 3

Local storage write speed on this server, measured via dd, is about 2GB/s.
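A minimal sketch of that kind of local dd test (the target path /tank/ddtest is an assumption, adjust to your pool):

Code:
# sequential write straight to the ZFS pool; path is hypothetical
# note: with ZFS compression enabled, /dev/zero will overstate throughput
dd if=/dev/zero of=/tank/ddtest bs=1M count=10000 conv=fdatasync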

This storage is connected over NFS to three Proxmox virtualization nodes, via Intel X520-DA1 (1x SFP+) 10GBit network cards through a Mikrotik CRS317-1G-16S+ switch. Each virtualization node has the same network card as the storage server.

Proxmox versions tested: 6.4-4, 7.2-4
pve-manager/6.4-4/337d6701 (running kernel: 5.4.106-1-pve)

Network connection speed measured via iperf from a virtualization node to the storage server is 7-10GBit/s, confirming the network itself is OK.
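For reference, a sketch of such an iperf check (the storage IP matches the mount shown later in the thread):

Code:
# on the storage server
iperf -s
# on a virtualization node, 4 parallel streams
iperf -c 12.12.12.89 -P 4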

Issue: Slow write performance when measuring via dd, on a virtualization node, over the NFS mount from the storage server.

We expected up to 1-1.2GB/s from dd on the node; instead we see around 250-350MB/s, which is roughly four times slower than expected.

We tried different storage OSes, such as plain Ubuntu, TrueNAS, and OpenMediaVault, and they all give the same slow results.
We have also reset/replaced the Mikrotik switch in the middle, but this did not solve the issue.

What we do observe, though, is a delay between issuing the dd command and network activity on the switch. It looks like the client tries to pre-allocate the blocks needed for the write, which drags down the overall result.

At this point we are lost and hoping for any help or advice from the community, considering that the same setup runs at the full 1.2GB/s on an identical configuration of machines, network cards, and switch, where the virtualization servers use the old kernel 4.15.18-28-pve.

Thank you
 
Hi,

did you try a benchmark with fio? I would be worried that the cache of the SSD drives might fill up. I saw some threads about slow performance [3], and the reviews mostly say that once the cache fills up, these drives are very slow at writing. For example, AnandTech tested the drive, and the last 16 GB in their benchmark (after the cache filled) were written at about 80 MByte/s [1].

When you did your tests, did the drives have enough time to clear their caches?
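If it helps, a sketch of a sustained sequential write that should outrun the SLC cache (the target path is an assumption):

Code:
# 64G is well beyond the 870 QVO's SLC cache; /tank/fio-test is hypothetical
fio --name=seqwrite --filename=/tank/fio-test --rw=write --bs=1M \
    --size=64G --ioengine=libaio --iodepth=16 --end_fsync=1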


[1] https://www.anandtech.com/show/15887/the-samsung-870-qvo-1tb-4tb-ssd-review-qlc-refreshed/2
[2] https://www.tomshardware.com/reviews/samsung-870-qvo-sata-ssd
[3] https://forum.proxmox.com/threads/samsung-870-qvo-1tb-terrible-write-performance.82026/
 
Hello,

Thank you for your reply.
This is not related to the SSDs.

Here are the fio results from the storage server:
Code:
WRITE: bw=1376MiB/s (1443MB/s), 1376MiB/s-1376MiB/s (1443MB/s-1443MB/s), io=9216MiB (9664MB), run=6697-6697msec

and this is from a node to the storage over the NFS mount:
Code:
WRITE: bw=130MiB/s (136MB/s), 130MiB/s-130MiB/s (136MB/s-136MB/s), io=7788MiB (8166MB), run=60007-60007msec
 
ok :),

then how does your /etc/pve/storage.cfg look?
Code:
dir: local
        path /var/lib/vz
        content iso,vztmpl,backup

lvmthin: local-lvm
        thinpool data
        vgname pve
        content images,rootdir

nfs: actinium
        export /export/actinium
        path /mnt/pve/actinium
        server 12.12.12.89
        content rootdir,images,vztmpl,iso
        options rw,tcp,rsize=32768,wsize=32768,hard,intr,noatime,vers=3,async
        prune-backups keep-all=1


We have also tried rsize/wsize of 4096 (the default).
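If you want to compare against a larger transfer size, a manual test mount with 1M rsize/wsize (the usual NFSv3-over-TCP maximum) might be worth a try; the mount point here is hypothetical:

Code:
mkdir -p /mnt/nfstest
mount -t nfs -o rw,tcp,vers=3,rsize=1048576,wsize=1048576,hard,noatime \
    12.12.12.89:/export/actinium /mnt/nfstest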
 
I'm not sure what is going on, but maybe check which rsize/wsize they are actually using in /proc/mounts.

Do other clients besides Proxmox have such performance issues?
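For example:

Code:
# negotiated options for the NFS mount on a node
grep actinium /proc/mounts
# or show all NFS mounts with their options
nfsstat -m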
 
Code:
12.12.12.89:/export/actinium /mnt/pve/actinium nfs rw,noatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=12.12.12.89,mountvers=3,mountport=42721,mountproto=tcp,local_lock=none,addr=12.12.12.89 0 0

We will try installing a fresh Ubuntu today to test NFS performance.
 

Hello, sorry for the late response!

We have tried Ubuntu 22.04... same results over NFS.
We have also tried Proxmox 5.4; network speed didn't rise above 2Gbps.

Maybe you have some suggestions? Kernel version, etc.?
 
This may or may not be related, but try installing iperf3 on your OpenMediaVault VM, on the Proxmox host, and on one of your client machines.
Run iperf3 -s on your VM, then iperf3 -c vm.ip from the client, and see if you are getting high Retr (retransmission) counts. Then run iperf3 -s on your Proxmox host and see if you also have high Retr there.
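For example (replace vm.ip with the actual address):

Code:
# on the VM / OpenMediaVault box
iperf3 -s
# on the client; watch the Retr column in the output
iperf3 -c vm.ip -t 30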

It may not be related, but I am running OpenMediaVault on bare metal and I have high retries when I am sending from a VM. Maybe you are having a variation of the issue I am experiencing?

With SSDs like those, I wouldn't expect it to be a sync issue when writing over NFS, but maybe try setting sync=disabled briefly to see how the write performance changes, if at all? (See the sketch below.)
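A minimal sketch, assuming the exported dataset is called tank/export (adjust to yours):

Code:
# temporarily disable synchronous writes on the exported dataset
zfs set sync=disabled tank/export
# ...rerun the NFS write test, then restore the default...
zfs set sync=standard tank/export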

Also, if you have both a 10G and a 1G NIC, make sure the VM is using the correct one. I accidentally had mine set up wrong and was wondering why I was topping out at 1G on a VM until I realized I had that wrong.
 
I have the same issue. iperf got me about 22Gbit/s, so I could exclude CPU and disk issues.

fio gets me 2200MB/s sequential write on the disk itself and only 150MB/s over NFS.

The NFS server is running on Proxmox and the client is a VM on it.

The weird thing here is that my PC, which has to go over the physical network, achieves 300MB/s with the same mount parameters.

I do use VirtIO as the NIC.

Using SSHFS instead of NFS gives the same results, but IOPS and random performance are worse.
 
