[SOLVED] Proxmox and NVMe/TCP

JesperAP

New Member
Jun 18, 2024
Hello,

We recently got a NetApp AFF-A250 and want to test NVMe over TCP with Proxmox.
We already have NVMe/TCP working on VMware, and in a Windows environment it gives us 33k IOPS with NVMe/TCP.

We got NVMe/TCP working following this tutorial:
https://linbit.com/blog/configuring-highly-available-nvme-of-attached-storage-in-proxmox-ve/
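For reference, the host-side part of the setup roughly follows these steps; this is only a minimal sketch, and the IP address, port, and subsystem NQN below are placeholders rather than our real values (the tutorial covers the full HA configuration):

[CODE]
# Install the NVMe CLI tools and load the TCP transport module
apt install nvme-cli
modprobe nvme-tcp

# Discover and connect to the subsystem (replace address, port, and NQN with your own)
nvme discover -t tcp -a 192.168.10.10 -s 4420
nvme connect -t tcp -a 192.168.10.10 -s 4420 \
    -n nqn.1992-08.com.netapp:sn.example:subsystem.proxmox

# Verify the namespace shows up on the host
nvme list
[/CODE]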

But when testing IOPS we only get around 16k. The network speed is 20 Gbps, so that should not be the issue (tested with iperf3).

Are there settings we need to fine-tune in Proxmox for this to work better? How can I troubleshoot this?
 
I did some more testing: I've installed the NVMe storage on multiple nodes (one with SSD, one with HDD), but the result stays the same (14-15k IOPS).

I am testing the IOPS with IOmeter, using the same test settings as on VMware.
 
Hello @JesperAP,

The level of performance is disappointing; even 30K IOPS is poor.

I advise debugging performance using `fio` on the bare metal host. Use the raw device paths to ensure your networking and storage work as expected. Double-check that you don't have any MTU issues.
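For example, a raw-device read test and an MTU sanity check might look something like this; the namespace path, target IP, and payload sizes below are placeholders, so adjust them to your environment (and stick to read-only tests if the namespace holds data):

[CODE]
# 4k random reads directly against the NVMe namespace, bypassing the page cache
fio --name=rawtest --filename=/dev/nvme1n1 --ioengine=libaio --direct=1 \
    --rw=randread --bs=4k --iodepth=32 --numjobs=1 --runtime=60 --time_based

# Check that the configured MTU actually passes end-to-end
# (1472 = 1500 - 28 bytes of IP/ICMP headers; use 8972 for 9000-byte jumbo frames)
ping -M do -s 1472 192.168.10.10

# Confirm the MTU on the relevant interfaces
ip link show | grep mtu
[/CODE]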

For hypervisor-level tuning I invite you to look at https://kb.blockbridge.com/technote/proxmox-tuning-low-latency-storage/index.html#tuning-procedure

If everything works as expected, this technote/series on optimizing Windows for Proxmox should be helpful: https://kb.blockbridge.com/technote/proxmox-optimizing-windows-server/part-1.html


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Hello @bbgeek17

I did a test using `fio`.

This is the result:
[attached screenshot: fio results]

with this job file:
[attached screenshot: fio job file]

I know the NetApp Storage VM is capped at 50k IOPS.

Should I be looking at the Proxmox settings, or should I fine-tune Windows?
 
Your fio benchmark uses numjobs=16. This equates to 16 threads of execution, each submitting I/Os and processing completions. This is not how QEMU/VirtIO/AIO works in the real world, so it's not an accurate baseline for the performance of a single VM.

A few questions:
- What do you get with numjobs=1 and iodepth=256 (i.e., an equivalent logical queue depth)? See the example invocation below.
- Could you provide the specific AIO settings for your VM, such as io_uring, native, iothread, or any other relevant settings? The VM config would be helpful (please post it as text inside CODE tags).
- What virtual storage controller are you using?
- Are you using the VirtIO drivers in Windows?
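For the first question, something along these lines would give a comparable single-submitter data point, and the VM config can be dumped as text with qm config. The device path and VM ID below are placeholders:

[CODE]
# Single job with a deep queue: closer to how one VM's virtio queue is serviced
fio --name=qd256 --filename=/dev/nvme1n1 --ioengine=libaio --direct=1 \
    --rw=randread --bs=4k --numjobs=1 --iodepth=256 --runtime=60 --time_based

# Dump the VM configuration as text (replace <vmid> with your VM's ID)
qm config <vmid>
[/CODE]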

I know the NetApp Storage VM is capped at 50k IOPS.
Is this a hardware or administrative limit?


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
I did the test again with the settings you suggested; same results.

I solved the problem myself: I had the Windows disks on SATA instead of SCSI... Now they get around 45k IOPS using IOmeter.
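For anyone running into the same thing, the change boils down to attaching the disk through a VirtIO SCSI controller instead of SATA. A rough sketch; the VM ID (100), storage name (netapp-nvme), and volume name are placeholders, and the VirtIO drivers must already be installed in the Windows guest before switching:

[CODE]
# Use the single VirtIO SCSI controller so the disk can get its own I/O thread
qm set 100 --scsihw virtio-scsi-single

# Detach the old SATA disk (the volume becomes "unused", nothing is deleted)
qm set 100 --delete sata0

# Re-attach the same volume as a SCSI disk with an I/O thread
qm set 100 --scsi0 netapp-nvme:vm-100-disk-0,iothread=1

# Fix the boot order to point at the new disk
qm set 100 --boot order=scsi0
[/CODE]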


It is an administrative limit; I can change it myself:
[attached screenshot]

Thanks for the help.
 
Our storage does not need LVM, filesystems, or other layers on top, so the caching modes don't apply to our solution (i.e., we pass through raw NVMe devices). In general, we advise avoiding QEMU caching modes in a shared-storage situation.
We also advise against them in all cases where your data is valuable.


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Like the previous poster, I always recommend running without a cache.
With all-flash storage, please also enable discard and SSD emulation.
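For example, something like this on the CLI; VM ID, storage, and volume names are placeholders, and the same options can also be set in the GUI disk dialog:

[CODE]
# Keep the default cache mode (no cache), enable discard and SSD emulation
qm set 100 --scsi0 netapp-nvme:vm-100-disk-0,iothread=1,discard=on,ssd=1

# Confirm the resulting disk line
qm config 100 | grep scsi0
[/CODE]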
 