Proxmox ZFS Performance Question - General Structure, PCI Passthrough and NVMe SSDs

drdenis

New Member
May 20, 2024
I am currently planning to migrate to Proxmox, but would like to find out some important things before I actually take the step. Before I ask my questions, I think it is best to describe my setup/requirements:

I don't run many VMs, but a few with high IOPS and consistency requirements, as I run a Dockerized PostgreSQL database alongside some web servers, etc. For consistency (and easy backup via ZFS replication), and because I already know a bit about ZFS, I would like to go with ZFS as my file system. However, ZFS is not known for the best NVMe SSD performance, which could be a problem for my latency-sensitive, IOPS-hungry PostgreSQL database.

1) How good is the latest Proxmox in terms of out-of-the-box ZFS NVMe performance?

2) Is there a significant performance difference between using the Proxmox local root pool (ZFS mirror) vs. passing two NVMe SSDs directly to a VM and creating a ZFS mirror there? Of course, if other VMs use the Proxmox installation pool at the same time as my PostgreSQL database VM, the performance has to be shared. However, since I only have a few VMs running and only one with a high IOPS requirement, it seems to me that the performance should end up similar either way. So my question is about the overall performance difference between the Proxmox ZFS root pool and passed-through NVMe SSDs for a given VM.
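To make the pass-through variant concrete, this is roughly what I have in mind (a sketch only; IOMMU would need to be enabled, and the PCI addresses and VM ID below are placeholders for my two NVMe controllers):

```bash
# Find the PCI addresses of the NVMe controllers (addresses are placeholders)
lspci -nn | grep -i nvme

# Pass both NVMe controllers through to VM 100 (assumed VM ID)
qm set 100 -hostpci0 0000:01:00.0 -hostpci1 0000:02:00.0
```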

3) Is it possible to easily scale the ZFS root pool for capacity and performance? I know a little about ZFS, and adding additional mirrored vdevs to an existing ZFS pool should work fine. However, I am a Proxmox newbie, so I would like to know if such a setup is generally possible with Proxmox. My plan is to configure the Proxmox ZFS root pool as a ZFS mirror and use it for my PostgreSQL VM storage. Later, I would like to double the write performance by adding a second mirrored vdev. My understanding is that a ZFS mirror has single-disk write speed and double-disk read speed. Adding a second mirrored vdev should double the write performance because now two vdevs can share the write load. Is this correct and does it work with Proxmox?

4) Are there any other ZFS performance tweaks I should make to tune Proxmox for my workload, i.e. a Dockerized environment with PostgreSQL running on NVMe SSDs under ZFS?

So far these are my ideas (a rough command sketch follows the two lists):

# ZFS Settings
- Mirrored VDEVs
- Set ashift=12 (are SSDs better with ashift=13?)
- Set recordsize and volblocksize to 16k
- Set compression=lz4
- Set atime=off

# VM Settings
- SCSI Controller: VirtIO SCSI Single
- Enable iothread
- Enable Discard for VM and within VM
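As a rough sketch, this is how I imagine the settings above translating into commands (assuming the default rpool created by the installer, a zfspool storage named local-zfs, and VM ID 100 with disk vm-100-disk-0; all names are placeholders):

```bash
# ZFS dataset settings (ashift itself is fixed at pool creation time,
# e.g. via the advanced options of the Proxmox installer)
zfs set compression=lz4 rpool
zfs set atime=off rpool

# VM settings: VirtIO SCSI single controller, iothread and discard on the disk
qm set 100 --scsihw virtio-scsi-single
qm set 100 --scsi0 local-zfs:vm-100-disk-0,iothread=1,discard=on
```

Inside the guest I would then run fstrim periodically (or mount with the discard option) so the discards actually reach ZFS.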


Regarding recordsize and volblocksize I am not sure. I know that PostgreSQL works best with a 16k size. However, that advice is for plain PostgreSQL, not Dockerized PostgreSQL. Also, since VM disks on ZFS are zvols, the relevant property there seems to be volblocksize rather than recordsize. I would appreciate further explanation and thoughts on this topic.
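For volblocksize specifically, my understanding is that Proxmox takes it from the zfspool storage definition when it creates a VM disk, so a sketch would be (storage name is a placeholder):

```bash
# New zvols created on this storage get a 16k volblocksize;
# existing disks keep the volblocksize they were created with
pvesm set local-zfs --blocksize 16k
```

Since volblocksize is fixed at creation time, this would have to be set before the VM disks are created.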

Are there any other settings I should be aware of?

5) What volume type should I use? qcow2 does not seem ideal, as it adds a second copy-on-write layer on top of ZFS, which is already CoW.

6) What performance can I expect with typical PCIe 4.0 x4 NVMe SSDs in a mirrored-vdev pool?

7) What about backup and high availability? I understand that these are two different things. However, as I understand it, it should be possible to replicate the data to a second node in my datacenter and restart the VM there if node 1 fails. This would mean losing any data written since the last ZFS snapshot/replication run, though. Is this correct? Is there a better solution? If the benefit were really worth it, I would even consider abandoning ZFS for another alternative.
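To make this concrete, what I have in mind is Proxmox's built-in storage replication, roughly like this (VM ID, node name and schedule are placeholders):

```bash
# Replicate VM 100 to node pve2 every 5 minutes
pvesr create-local-job 100-0 pve2 --schedule "*/5"

# Check when the last replication ran
pvesr status
```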

Thanks in advance!
 
> For consistency (and easy backup via ZFS replication), and because I already know a bit about ZFS, I would like to go with ZFS as my file system. However, ZFS is not known for the best NVMe SSD performance, which could be a problem for my latency-sensitive, IOPS-hungry PostgreSQL database

Pretty sure there are guides out there on tuning ZFS for postgres. If speed is not sufficient with zfs, you might want to put it on lvm-thin.

> Is it possible to easily scale the ZFS root pool for capacity and performance? I know a little about ZFS, and adding additional mirrored vdevs to an existing ZFS pool should work fine

Yes, you can start with a 2-disk mirror pool and expand it later with +2 disks of the same size (or larger) pretty easily. You'll probably have to do it at the command line, though; the Proxmox GUI is lacking in some ways.
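Roughly like this (disk names are placeholders; use the /dev/disk/by-id/ paths of your actual drives):

```bash
# Check the current layout first
zpool status rpool

# Add a second mirror vdev to the existing pool; double-check the device
# names, as adding a vdev is not easily reversible
zpool add rpool mirror /dev/disk/by-id/nvme-DISK_C /dev/disk/by-id/nvme-DISK_D
```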

> a ZFS mirror has single-disk write speed and double-disk read speed. Adding a second mirrored vdev should double the write performance because now two vdevs can share the write load. Is this correct and does it work with Proxmox?

More or less. Testing, and probably some tuning, will be required. Proxmox is basically built on top of Debian, so it should be doable.

If this is your first time trying this kind of setup, and it's not just for homelab, you may want to consider hiring a consultant to help with design and implementation - and buy a support subscription.
 
