Slow fsync values with NVMe

luison

After some struggle getting our setup running over two NVMe PCIe x4 units (https://forum.proxmox.com/threads/i-o-errors-with-nvm-drives.64974/#post-294108), and although we are reasonably happy with the system, I've just discovered that the FSYNCS/SECOND value returned by pveperf is sometimes relatively bad:

Code:
pveperf
CPU BOGOMIPS:      114986.72
REGEX/SECOND:      4106212
HD SIZE:           41.22 GB (/dev/mapper/pve-root)
BUFFERED READS:    2663.07 MB/sec
AVERAGE SEEK TIME: 0.06 ms
FSYNCS/SECOND:     210.66
DNS EXT:           546.76 ms
DNS INT:           614.38 ms (alsur.es)

The value can fluctuate from 180 to 500 from time to time but tends to stay at the lower end.
root is running over a Linux RAID0 on top of the two NVMe drives, which is the PV of the PVE VG where root sits.

I am assuming the old recommendation of "not using ext4" is obsolete, considering, as I understand it, the quicker degradation of SSD units with ext3 vs. ext4, so our mount is pretty straightforward, just as the original installer left it:

/dev/pve/root / ext4 errors=remount-ro 0 1


The values for a RAID0 over the same two NVMe units:

Code:
pveperf /tmp/
CPU BOGOMIPS:      114986.72
REGEX/SECOND:      3972371
HD SIZE:           14.70 GB (/dev/mapper/data-tmp)
BUFFERED READS:    2878.57 MB/sec
AVERAGE SEEK TIME: 0.02 ms
FSYNCS/SECOND:     11633.78
DNS EXT:           618.83 ms
DNS INT:           657.43 ms (alsur.es)

which could also be related to the different (more relaxed) mount options:
/dev/mapper/data-tmp /tmp ext4 nofail,noatime,data=writeback,barrier=0,errors=continue 0 0
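For what it's worth, to compare the options each filesystem is actually mounted with (rather than only the fstab entries), something like this should do (findmnt is part of util-linux on Debian/PVE; -n drops the header, -o selects the columns):

Code:
findmnt -no TARGET,FSTYPE,OPTIONS /
findmnt -no TARGET,FSTYPE,OPTIONS /tmp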


But the most extraordinary thing is that the same test over a plain standard SSD RAID1 (built through LVM instead of mdadm + LVM) gives:
Code:
pveperf /home2/serverxxxx/VIDEO/
CPU BOGOMIPS:      114986.72
REGEX/SECOND:      4086728
HD SIZE:           195.86 GB (/dev/mapper/data-video)
BUFFERED READS:    535.05 MB/sec
AVERAGE SEEK TIME: 0.19 ms
FSYNCS/SECOND:     1077.81
DNS EXT:           551.59 ms
DNS INT:           663.95 ms (alsur.es)

I was hoping to test the behaviour inside one of the containers (LVM-thin) but I'm not sure if that can be done.
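If it can, something along these lines is what I have in mind (a rough sketch only, assuming a hypothetical container ID 101 and that fio is installed inside the container, since pveperf is not normally available there):

Code:
# run a quick fsync-heavy write test from inside container 101 (placeholder ID)
pct exec 101 -- fio --name=ct-fsync --filename=/root/fio-testfile --size=512M \
    --rw=write --bs=4k --ioengine=psync --fsync=1 --runtime=30 --time_based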

Not sure what I am missing or what is misbehaving in my root setup, as I was expecting much better results than that on the PVE root.


Also, for the record, on a remote server with 2 standard (x3) NVMe units and the same mdadm+LVM PVE VG, the results are:
Code:
pveperf
CPU BOGOMIPS:      60672.00
REGEX/SECOND:      3943638
HD SIZE:           33.52 GB (/dev/md2)
BUFFERED READS:    407.30 MB/sec
AVERAGE SEEK TIME: 0.09 ms
FSYNCS/SECOND:     10260.04
DNS EXT:           36.70 ms
So yes, I am assuming something is not working as expected.
 
Not necessarily, the tests were made in different environments (setups). And it makes a big difference if there is already some load on the storage.
 
Thanks Alwin. Can you clarify that? I mean, the root partition access should be the same for all mount points. All containers are thin provisioned there. A second one with bad results (same raid+LVM behind) has no containers running on it.
I could test with all containers stopped, or from a rescue disk if I can execute pveperf directly. Would that give more "real" performance?
 
No. The fsync test through pveperf opens a file and writes data into it, so you will always have at least a filesystem in between. With mdraid (not supported by Proxmox) and/or LVM, another layer is introduced. If you don't know how fast the NVMe itself can go, the results from the other layers will not tell you much.

For benchmarks always start at the bottom and add layer by layer. Best use FIO or some other tool designed for storage benchmarks.
https://pve.proxmox.com/wiki/Benchmarking_Storage
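As a starting point, a fio run along these lines could be used (paths, sizes and runtimes are placeholders to adapt; note that writing to the raw device destroys its data, so only do that on a disk that carries nothing important):

Code:
# fsync-heavy test on a file inside the filesystem (similar in spirit to pveperf's FSYNCS/SECOND)
fio --name=fsync-test --filename=/tmp/fio-testfile --size=1G \
    --rw=write --bs=4k --ioengine=psync --fsync=1 \
    --runtime=60 --time_based --group_reporting

# same idea against the raw device to see what the NVMe itself can do
# WARNING: this overwrites data on /dev/nvme0n1 (placeholder device name)
fio --name=fsync-raw --filename=/dev/nvme0n1 \
    --rw=write --bs=4k --ioengine=psync --fsync=1 --direct=1 \
    --runtime=60 --time_based --group_reporting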
 
Thanks. I did use FIO at the time, before deciding on storage formats, and the results were better, or at least never 5x slower, especially compared to a similar mdadm + LVM over 2 SSDs.

I'll try to repeat the tests with FIO and maybe try to adjust one of the two raids that show low figures on pveperf. They are actually constructed as a 2-disk RAID10, something which is possible with mdadm and which we have used before without issues. I'll try to redo one of them as a standard RAID1 and/or as a raid within LVM and see the differences, roughly as sketched below.
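Something like this is what I mean by the two variants (device, VG and LV names here are only placeholders for our actual layout):

Code:
# plain mdadm RAID1 from two partitions (placeholder device and array names)
mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/nvme0n1p4 /dev/nvme1n1p4

# or a RAID1 LV created directly inside an existing VG, without the mdadm layer
lvcreate --type raid1 --mirrors 1 -L 100G -n testraid1 data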

By the way, understanding that PVE has never actively supported Linux RAID, I suppose this will remain the case now that they can be constructed as a simple layer on top of LVM.

thanks.
 
By the way, understanding that PVE has never actively supported Linux RAID, I suppose this will remain the case now that they can be constructed as a simple layer on top of LVM.
Yes.
 
