[TUTORIAL] Proxmox VE 7.2 Benchmark: aio native, io_uring, and iothreads

Hey everyone, a common question in the forum and to us is which settings are best for storage performance. We took a comprehensive look at performance on PVE 7.2 (kernel=5.15.53-1-pve) with aio=native, aio=io_uring, and iothreads over several weeks of benchmarking on an AMD EPYC system with 100G networking running in a datacenter environment with moderate to heavy load.

Here's an overview of the findings:
  • iothreads significantly improve performance for most workloads.
  • aio=native and aio=io_uring offer similar performance.
  • aio=native has a slight latency advantage for QD1 workloads.
  • aio=io_uring performance degrades in extreme load conditions.

Here's a link to the full analysis, with lots of graphs and data: https://kb.blockbridge.com/technote/proxmox-aio-vs-iouring/

tldr: The test data shows a clear and significant performance improvement that supports the use of IOThreads. Performance differences between aio=native and aio=io_uring were less significant. Except for unusual behavior reported in our results for QD=2, aio=native offers slightly better performance (when deployed with an IOThread) and gets our vote for the top pick.
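
If you want to sanity-check the low-queue-depth results on your own hardware, here is a representative fio job. This is illustrative only: the exact job files, devices, and runtimes we used are in the linked technote. The idea is to run it inside the guest against the virtual disk while varying aio= and iothread in the VM configuration.

Code:
# Illustrative QD1 4k random-read test (read-only, but use a non-production disk).
# /dev/sdX is a placeholder for the virtual disk as seen inside the guest.
fio --name=qd1-randread --filename=/dev/sdX \
    --ioengine=io_uring --direct=1 \
    --rw=randread --bs=4k --iodepth=1 --numjobs=1 \
    --time_based --runtime=60 --group_reporting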

attention: Our recommendation for aio=native applies to unbuffered, O_DIRECT, raw block storage only; the disk cache policy must be set to none. Raw block storage types include iSCSI, NVMe, and CEPH/RBD. For thin-LVM, anything stacked on top of software RAID, and file-based solutions (including NFS and ZFS), aio=io_uring (plus an IOThread) is preferred because aio=native can block in these configurations.
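
To make that concrete, here is what the recommended combination looks like in a VM config. This is a minimal sketch: the VM ID, storage name, and volume name are placeholders for your own setup, and per-disk IOThreads need the VirtIO SCSI single controller (or a virtio-blk disk).

Code:
# /etc/pve/qemu-server/<vmid>.conf excerpt (placeholder VM ID 100 and volume names)
scsihw: virtio-scsi-single
scsi0: local-lvm:vm-100-disk-0,aio=native,cache=none,iothread=1

# Equivalent via the CLI:
# qm set 100 --scsihw virtio-scsi-single
# qm set 100 --scsi0 local-lvm:vm-100-disk-0,aio=native,cache=none,iothread=1

Swapping aio=native for aio=io_uring or aio=threads on that line is all it takes to change the async IO mode for a single disk.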

If you find this helpful, please let me know. I’ve got a bit more that I can share in the performance and tuning space. Questions, comments, and corrections are welcome.


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
Good morning, I hope everything is fine.

I have a question, please: I have disks in passthrough, 2 NVMe, 5 HDD, 2 SSD, all passed through.

Can these disks be set to use Async IO = native?

And I have one more partition as local (LVM, the Proxmox installation) which is on NVMe. Can I also set that to native?

Thanks
 
Interesting addition here: the OP recommends raw combined with an iothread; however, I get this on raw storage.

Code:
Block format 'raw' does not support the option 'iothread'
 
>>>
attention: Our recommendation for aio=native applies to unbuffered, O_DIRECT, raw block storage only; the disk cache policy must be set to none. Raw block storage types include iSCSI, NVMe, and CEPH/RBD. For thin-LVM, anything stacked on top of software RAID, and file-based solutions (including NFS and ZFS), aio=io_uring (plus an IOThread) is preferred because aio=native can block in these configurations.
 
For the benefit of others: if you add iothread to a raw device normally in Proxmox, the VM won't fail to start. I got the error because I set it manually via -args in the config file; when done through Proxmox, it silently omits the iothread flag on raw devices to prevent the error.

Thanks @_gabriel. With the combination of his graphs and that description, it wasn't entirely clear to me that he meant to only use iothread on file-based storage.

I have also kept aio=threads on some I/O-heavy HDD storage, as that benefits from it (I guess due to the higher queue depth); it performs considerably worse with both native and io_uring, and the host has plenty of spare CPU cycles for it. I will use these recommendations on flash storage and on devices with only low I/O.
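
One more note on the error quoted earlier: the exact -args aren't shown, so this is only my assumption, but that message is typically what QEMU reports when iothread ends up as a -drive/format option. In plain QEMU an IOThread is its own object and is attached on the -device line, roughly like this (placeholder paths and IDs):

Code:
# Hand-rolled QEMU args (sketch, not the Proxmox-generated command line):
-object iothread,id=iothread0 \
-drive file=/dev/sdX,if=none,id=drive0,format=raw,cache=none,aio=native \
-device virtio-blk-pci,drive=drive0,iothread=iothread0

Letting Proxmox generate the disk line (as in the config example above) avoids having to get this wiring right by hand.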
 