Could replacing ZFS with hardware RAID fix my IO delay issue?

DaGerba

Hello,

in our small office I run a standalone Proxmox VE on an older machine from 2020: ASUS P11C-M/4L mainboard, Intel Xeon E-2288G 8-core 3.7 GHz, 64GB RAM, ZFS RAIDZ1 with 3x 4TB WD Red SA500 SSDs on SATA.
The machine runs 3 Linux servers and 3-4 Windows guests for home office use and did that very well for some time. But since we changed Win10 to Win11, we have constant IO delay in a range of 1-10% with peaks even higher, and the Windows guests often get slow.
I understand that the NAS SSDs on SATA with ZFS are the bottleneck, but the budget is low and better SSDs are quite expensive these days. So I thought of getting a 6 or 8 port hardware RAID controller, 2 additional smaller SSDs in RAID 1 for the system, and the 3 4TB SSDs in RAID 5 for the VMs.

Do you think this change could possibly solve my IO delay issue?

As I am not much into hardware, I did not find a suitable RAID controller with at least 6 SATA ports. Any recommendations here?

I know that my setup is not great for our use case, but it did perfectly well with Win10 guests, so I am hoping to keep it running for a while, until hardware prices hopefully get back to normal.

Thanks for your thoughts!

Greetings, Michael
 
A hardware RAID controller with a Battery Backup Unit (BBU) can cache sync writes, and that might indeed help. (Or replace the SSDs with ones that have PLP.)
ZFS RAIDZ1 is a poor choice for VMs, as the IOPS are much lower than with a ZFS stripe of mirrors. Maybe you should redo your ZFS pool for better performance?
Combining ZFS with hardware RAID is warned against. Maybe not using ZFS also fixes your performance issue?
 
Last edited:
Ah, that was a quick one! Thank you.

Redoing the ZFS pool sounds worth a shot. So I would still get 2 additional smaller SSDs for the system on a ZFS RAID 1 and then use the 3 4TB SSDs for a "ZFS stripe of mirrors"? Could you please clarify the "ZFS stripe of mirrors"? I am not sure how to do that the best way. Sorry if this is a stupid question; I am a part-time admin against my will and not experienced with ZFS.
 

No, you wouldn't buy the two small SSDs; instead you will need at least one more 4TB SSD, since a striped mirror needs at least four drives.
It's the ZFS equivalent of RAID10: basically you build two mirrors, then stripe them together. This gives great performance at the cost of capacity. See https://www.45drives.com/community/articles/RAID-and-RAIDZ/ for a further explanation and
https://www.45drives.com/zfs/zfs-calculator/ for playing around with different values.
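
In zpool terms it could look like this (just a sketch with a hypothetical pool name and placeholder device names - better use stable /dev/disk/by-id/... paths in practice):

zpool create tank mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd   # two mirror vdevs; ZFS stripes writes across them automatically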

A triple mirror wouldn't need another SSD and would also have better performance than RAIDZ1, but it would only have the capacity of one SSD. The striped mirror on the other hand will have the capacity of two SSDs.
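
For completeness, the triple mirror would be a single three-way mirror vdev (again placeholder device names):

zpool create tank mirror /dev/sda /dev/sdb /dev/sdc   # one vdev, every block mirrored on all three drives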

UdoB explains this and other things in his writeup on the topic.
 
you will need at least one more 4TB SSD, since a striped mirror needs at least four drives.
It's the ZFS equivalent of RAID10: basically you build two mirrors, then stripe them together. This gives great performance at the cost of capacity. See https://www.45drives.com/community/articles/RAID-and-RAIDZ/ for a further explanation and
https://www.45drives.com/zfs/zfs-calculator/ for playing around with different values.
Thank you for the clarification! I think I got it now.

One last follow-up question to
No, you wouldn't buy the two small SSDs;
So you say that the 2x2 striped mirror, like in UdoB's example, will be performant enough that I can do without splitting system and VMs?
 
So you say that the 2x2 striped mirror, like in UdoB's example, will be performant enough that I can do without splitting system and VMs?

Whether you do a split or not is basically a matter of preference and constraints. For example, my homelab has two mini-PCs which only have two storage slots each (one NVMe, one SATA). To split system and VMs I would have to sacrifice redundancy. I didn't want to do that, so I decided to go with one pool (made of the NVMe and the SATA SSD) for system and VMs. Another person might have preferred to separate them and rely solely on backups. (I have backups too, but they are for the case the redundancy wasn't enough. RAID is NOT backup and backup is NOT RAID; basically, having backups is more important.)

On the other hand, if you have enough slots (and budget), separating OS and VMs/LXCs is a good idea, since then you can reinstall the OS without needing to restore the VM/LXC data (as long as you saved the VM/LXC configuration before the reinstall).
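
For example, a sketch assuming the standard PVE config paths and a hypothetical target directory:

cp -a /etc/pve/qemu-server /mnt/usbdisk/pve-configs/   # VM configs
cp -a /etc/pve/lxc /mnt/usbdisk/pve-configs/           # LXC configs
cp -a /etc/pve/storage.cfg /mnt/usbdisk/pve-configs/   # storage definitions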

Regarding performance: I don't know whether the separation would help or not; this is probably something where a benchmark might make sense.
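
Something along these lines with fio could be a starting point (test file path, size, and runtime are just placeholders; remove the test file afterwards):

fio --name=randwrite-test --filename=/rpool/data/fio-test.bin --rw=randwrite --bs=4k --size=4G --iodepth=32 --ioengine=libaio --runtime=60 --time_based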

I'm sure, however, that any mirrored setup (be it a striped mirror or a regular mirror/triple mirror) will perform better than RAIDZ1.

My suggestion to buy a 4TB SSD instead of the storage controller plus small SSDs was mostly because I got the impression that you are operating on a tight budget, so another 4TB SSD might be the better investment.
But if you would still prefer to separate OS and VMs/LXCs, you will of course need the small SSDs. With ZFS you could still get away without the storage controller.
 
Last edited:
OK, decision made: one more 4TB WD Red SA500 ordered. I paid about the same amount for it as I did in May '25 for three of them :mad:.

I checked the installation process on a nested test system. It offers to build a ZFS RAID10 to install the system on. The result looks quite similar to the proposal of @UdoB to me:
[screenshot of the installer's ZFS RAID10 disk selection]
So there is no need to build two mirrors and stripe them together by hand, right?

Finally, I have one rather rhetorical question left: I guess there is no good way to change the RAIDZ1 to RAID10 on a running system, like converting it to a 2-disk mirror and adding a second mirror? I am only thinking of that to save downtime while backing up the VMs to another disk and restoring them after installation. I imagine that some ZFS voodoo, if it exists, could be much quicker and save my weekend. Currently only about 1.5 TB of the RAIDZ1 with 3x 4TB is used.
 
Last edited:
So there is no need to build two mirrors and stripe them together by hand, right?
No - the result you've shown is exactly this: two mirrors, striped :-)

change the RAIDZ1 to RAID10
No, that's not possible. You would have to create a new pool with a single mirror first. (Well, technically a single drive is enough - you can add the second drive to create the first mirror later on. This approach is not recommended, of course.)
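
A sketch with placeholder device names (again, /dev/disk/by-id/... paths are preferable in practice):

zpool create newpool mirror /dev/sdc /dev/sdd   # first mirror vdev
zpool add newpool mirror /dev/sde /dev/sdf      # second mirror vdev -> striped mirror

Or, starting from a single drive:

zpool create newpool /dev/sdc
zpool attach newpool /dev/sdc /dev/sdd   # turns the single drive into a mirror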

Incomplete steps, just an idea of the sequence: you can "zfs send" the data from the old RAIDZ1 pool to the new pool - in the background, during normal operation. When that has finished, you stop all consumers, "zfs send" a last snapshot, and export (or destroy) the old pool. Now you can export the new pool and import it with the name of the old one.
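
Roughly, with hypothetical pool and snapshot names:

zfs snapshot -r rpool@move1
zfs send -R rpool@move1 | zfs recv -F newpool   # initial copy, guests keep running
# stop all consumers, then transfer the delta:
zfs snapshot -r rpool@move2
zfs send -R -i @move1 rpool@move2 | zfs recv -F newpool
zpool export rpool    # or destroy it
zpool export newpool
zpool import newpool rpool   # import the new pool under the old name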

There are pitfalls in this process - especially if we are talking about the "rpool", which is used to boot from! E.g. the "export old pool" step won't be possible if it is in use by the operating system. And to make the new pool bootable, some extra steps are required too.
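
On a current PVE installation the boot part would presumably involve proxmox-boot-tool on the ESP partitions of the new drives (device and partition names hypothetical):

proxmox-boot-tool format /dev/sdc2
proxmox-boot-tool init /dev/sdc2
proxmox-boot-tool status   # verify the boot drives are registered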

Personally, I would test the whole procedure first! I have a virtual(!) cluster for dangerous things like that in my homelab. In my $dayjob I have some EOL'ed servers for this.

Oh... and please do backup first, before destroying things ;-)
 
Thank you @UdoB for the reply. That is what I expected. As I am booting from the rpool, I will go the more secure "long-night way" with a fresh installation on the new ZFS RAID10 and then restore the VMs from my backups.
Oh... and please do backup first, before destroying things ;-)
Good advice. As I am working on quite old and not ideally suited hardware, backups are a topic I always have an eye on.

I will post a little feedback on how it went. Thank you all so far!
 
On Saturday the fourth 4TB SSD arrived and I made a fresh install on a ZFS RAID10. Although the effect is not as good as I was hoping for, the system is running quite OK and working on the Win11 VMs is acceptable. IO delay is now below 1% most of the time, with some peaks up to 5%. Seems OK for now.

What keeps my head spinning is restoring VMs from backups. I keep the ZSTD-compressed backups on a 4TB spinning disk with ext4, mounted as a directory. When restoring a guest backup to the rpool, the IO delay increases to 15-20% as expected, but all machines running on the rpool become so slow that they are completely unusable. Even the PVE console's VNC connection sometimes stops working. Luckily I don't need that regularly, but it still makes me worried.
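
If I read the docs correctly, one thing I could try next time is throttling the restore with a bandwidth limit, e.g. via qmrestore (VMID and archive path just as an example):

qmrestore /mnt/backup/dump/vzdump-qemu-101.vma.zst 101 --bwlimit 51200

That should cap the restore at about 50 MiB/s (the value is in KiB/s) and hopefully leave the running guests some IO headroom.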
 