Improving performance of RaidZ1

Yuri_AR

May 27, 2026
Hi! One of the nodes I'm running Proxmox on is experiencing high IOwait from time to time, and I'm very confident this is because of a physical disk bandwidth limitation.

The affected storage is a 3-disk raidZ1 pool, all of them 2.5" SSDs (a mix of Crucial and Samsung drives), and all of them in a single pool. This storage is used by around 15 containers and VMs, some of them with occasional bursts of disk use. During these bursts, and during backups, restores, disk migrations, etc., the IOwait can reach 10-20%, and all the machines using that storage slow down to a crawl.
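To put a number on the IOwait described above, a minimal Python sketch like the following could be used. It assumes a Linux host where the aggregate `cpu` line of `/proc/stat` lists user, nice, system, idle, iowait, irq, softirq and steal in that order; the function names are made up for this example and are not part of any Proxmox tooling.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of ints."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    return [int(value) for value in fields[1:]]


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two 'cpu' line samples."""
    # Only the first 8 counters are summed (user .. steal); guest time is
    # already included in user/nice, so it is left out to avoid double counting.
    first = parse_cpu_line(before)[:8]
    second = parse_cpu_line(after)[:8]
    deltas = [b - a for a, b in zip(first, second)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (the aggregate 'cpu' line)."""
    with open(path) as handle:
        return handle.readline()


# Two hand-written samples, purely illustrative numbers:
sample_before = "cpu  1000 0 500 8000 500 0 0 0 0 0"
sample_after = "cpu  1100 0 550 8700 650 0 0 0 0 0"
print(f"iowait: {iowait_percent(sample_before, sample_after):.1f}%")
```

On a real node one would call `read_cpu_line()` twice, a few seconds apart, during a backup or migration and pass both lines to `iowait_percent`.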

We were thinking of just buying more disks and expanding the pool, but I'm not convinced this would address the issue. Write capacity would remain the same as far as I know, and I'm not sure there would be an improvement in read speeds, given that the data would be read from the same number of shards.

I have 3 free bays on the server, so I could add 3 more disks. What configuration would be optimal for this situation? A separate ZFS pool would defeat the purpose, since I would need to have different PCTs/VMs on different storage, and I'd like to have them all on the same pool. Just adding the disks to the pool would probably not be useful, as I said, and creating another 3-disk raidZ1 and somehow mashing both together sounds fairly inefficient.

What are my options, and which is the best one? I've been using Proxmox for a while, but I don't have a firm grasp on the nuances of ZFS and storage management. Until now I've managed with plain ZFS pools containing all the disks.
Thanks!
 
What you want for an OS/VM pool is random Read/Write IOPS. Striping your drives will always be faster than mirrors/RAIDZ (speed vs. reliability/parity..) but having drives with good random IOPS makes a much bigger difference.

Practically, this means you want SSDs with DRAM cache, which acts as a high-speed map for locating data on the slower NAND flash and so significantly improves random read/write performance and overall drive responsiveness. PLP (power-loss protection) is not primarily a performance feature; it serves as a failsafe that ensures buffered writes reach the NAND flash before the drive shuts down.

Enterprise SSDs generally excel in random IOPS, but there are also consumer SSDs out there with DRAM cache that can run modest PVE OS/VM pools perfectly fine.

What disks are you using exactly?
 
raidZ1 is known to not be good for IOPS: https://forum.proxmox.com/threads/fabu-can-i-use-zfs-raidz-for-my-vms.159923/ . And your drives might be consumer drives without PLP. Lots of threads about disappointing (write) performance on this forum. I don't know an easy solution besides buying better and/or additional drives and using a stripe of mirrors (like RAID10) instead.
Yes, they are definitely consumer grade drives
What disks are you using exactly?
2x Crucial, 1x Samsung Pro. We realised it wasn't ideal going in (the Samsung Pro in the system mirror is already at 50% wearout after a year), but our budget was too limited and we had to cut corners somewhere.

Having 3x free bays, what is the optimal move here? I have some ideas but I don't know how to validate them.

- 3x mirrored enterprise grade drives in a new zfs pool, and leave the rest for general storage? (4TB total storage for machines if we use big drives, which may be a bit tight. We'd definitely use over 50% of this capacity).
- 3x enterprise drives in a separate raidZ1 pool. We'd have more storage, but less performance. I don't know how that performance would compare to our current raidZ1 pool with normal drives.
- 3x consumer drives in a different configuration? Will it make a difference if we mirror our existing raidZ1 pool, or are we bound to hit another brick wall?

We are OK on capacity, I'm more concerned about performance at the moment.

I'm looking into a ceph cluster for shared storage and a new fiber optic backend for it, but that's a separate topic and we are not getting that proposal approved anytime soon.
 
Which Samsung Pro / Crucial drives exactly and in which capacity?
And what is your current configuration? 2 disk mirror for OS and 3 disk Z1 for VM storage?
 
1x Samsung EVO 870, 4TB
2x Crucial CT4000BX500SSD1, 4TB

All of them 2.5" SSD drives with SATA interface.
 
- 3x mirrored enterprise grade drives in a new zfs pool, and leave the rest for general storage? (4TB total storage for machines if we use big drives, which may be a bit tight. We'd definitely use over 50% of this capacity).
Is this 6 drives (a stripe of mirrors)? That could give you twice the write performance (with the performance increase of PLP on top of that) and up to 4 times the read performance. Or just one 3-way mirror? That gives you better write IOPS only because of the enterprise drives, but up to three times the read IOPS.
- 3x enterprise drives in a separate raidZ1 pool. We'd have more storage, but less performance. I don't know how that performance would be compared to our current raidZ1 pool with normal drives.
Don't use raidZ1 for active virtual disks (as mentioned before): https://forum.proxmox.com/threads/fabu-can-i-use-zfs-raidz-for-my-vms.159923/
- 3x consumer drives in a different configuration? Will it make a difference if we mirror our existing raidZ1 pool, or are we bound to hit another brick wall?
A stripe of two raidZ1 could improve performance but it will be very unbalanced between the consumer and the enterprise drives. And did I mention yet that raidZ1 is bad for VMs? ;-)
We are OK on capacity, I'm more concerned about performance at the moment.
Get a mirror of two (large enough) fast enterprise drives and use them for active virtual disks (this can also survive one broken drive, like raidZ1). Use the consumer drives for slow storage, which can even be raidZ1.
... and we are not getting that proposal approved anytime soon.
If this is for a company, then why use consumer drives in the first place? Or is this just for trying out Proxmox before using it in production? Maybe search the forum for raidZ1 and PLP to learn from other people's experience with slow storage and how they improved it (with better drives, without raidZ1). Maybe you should aim for redundancy across multiple Proxmox nodes in a cluster and less for redundancy of drives within a single node.
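The layout comparisons in this thread can be sketched with the common rule of thumb for ZFS random I/O: a raidZ vdev delivers roughly the IOPS of one disk, a mirror vdev writes like one disk but can read from every member, and a pool stripes across its vdevs. The Python below is only a rough estimate under that assumption. It ignores PLP, DRAM cache, record size and mixed drive models, and the names `Vdev` and `estimate` are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vdev:
    kind: str   # "mirror" or "raidz1"
    disks: int  # number of drives in this vdev


def estimate(vdevs, disk_tb=4.0):
    """Return (usable_tb, write_x, read_x); the _x values are relative to one disk."""
    usable_tb = 0.0
    write_x = 0.0
    read_x = 0.0
    for vdev in vdevs:
        if vdev.kind == "mirror":
            usable_tb += disk_tb            # every member holds the same data
            write_x += 1.0                  # each write goes to all members
            read_x += float(vdev.disks)     # reads can be served by any member
        elif vdev.kind == "raidz1":
            usable_tb += disk_tb * (vdev.disks - 1)  # one disk's worth of parity
            write_x += 1.0                  # rule of thumb: about one disk of IOPS
            read_x += 1.0
        else:
            raise ValueError(f"unknown vdev kind: {vdev.kind}")
    return usable_tb, write_x, read_x


# Layouts mentioned in the thread: (vdevs, size of each disk in TB)
layouts = {
    "current 3-disk raidZ1 (4TB)": ([Vdev("raidz1", 3)], 4.0),
    "2-way mirror (4TB)": ([Vdev("mirror", 2)], 4.0),
    "3-way mirror (4TB)": ([Vdev("mirror", 3)], 4.0),
    "stripe of two 2-way mirrors (2TB)": ([Vdev("mirror", 2), Vdev("mirror", 2)], 2.0),
}

for name, (vdevs, disk_tb) in layouts.items():
    usable, write_x, read_x = estimate(vdevs, disk_tb)
    print(f"{name}: {usable:.0f} TB usable, ~{write_x:.0f}x write, up to ~{read_x:.0f}x read")
```

Under these assumptions, the stripe of two mirrors is the layout that comes out at roughly twice the write and up to four times the read IOPS of a single disk, while the 3-way mirror only improves reads.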
 
You can "expand" your new pool after creating it.
1) Start with a new ZFS pool consisting of one mirror vdev (vdev0) of 2x SSDs with PLP.
2) Then transfer your data from the old ZFS pool to the new one.
3) Then destroy the old ZFS pool, which frees up 3 bays.
4) Add 2 more mirror vdevs (vdev1 and vdev2), each of 2x SSDs with PLP, to the new ZFS pool.

Read the OpenZFS manual.
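The steps above could be written down as a command plan along these lines. This Python sketch only builds the `zpool` command lines and prints them; it does not execute anything. The pool names and device names are placeholders invented for this example (on a real system you would use your own `/dev/disk/by-id/...` paths), and the data migration step is left as a manual action because it depends on how the guests are set up.

```python
def mirror_pool_plan(new_pool, old_pool, first_pair, later_pairs):
    """Build an ordered list of (description, argv-or-None) steps."""
    pairs = [tuple(first_pair)] + [tuple(pair) for pair in later_pairs]
    for pair in pairs:
        if len(pair) != 2:
            raise ValueError("each mirror vdev in this sketch needs exactly 2 devices")

    steps = [
        ("create the new pool from a single mirror vdev",
         ["zpool", "create", new_pool, "mirror", *pairs[0]]),
        # Manual step: move guest disks / datasets to the new pool and verify
        # them before going any further.
        ("move all data from the old pool to the new pool (manual)", None),
        ("destroy the old pool to free its bays",
         ["zpool", "destroy", old_pool]),
    ]
    for pair in pairs[1:]:
        steps.append(("add another mirror vdev to the new pool",
                      ["zpool", "add", new_pool, "mirror", *pair]))
    return steps


# Placeholder names, purely hypothetical:
plan = mirror_pool_plan(
    new_pool="fastpool",
    old_pool="oldpool",
    first_pair=("ssd-plp-a", "ssd-plp-b"),
    later_pairs=[("ssd-plp-c", "ssd-plp-d"), ("ssd-plp-e", "ssd-plp-f")],
)
for number, (description, argv) in enumerate(plan, start=1):
    command = " ".join(argv) if argv else "(no command, do this by hand)"
    print(f"{number}) {description}: {command}")
```

Keeping the plan as data instead of running it directly makes it easy to review before touching a pool, which matters here because `zpool destroy` is not reversible.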
 
Fast enterprise drives are obviously preferred and they usually also have a much higher TBW so they wear out much slower. But they come at a price.
It's a trade-off. Invest in enterprise grade hardware which is fast and lasts longer or spend less but more often on cheaper disks. Just make sure that you get DRAM cached SSDs for this purpose.

Regarding your current setup, the Crucial disks are holding you back, as they lack DRAM cache and have poor random IOPS. The DRAM-cached Samsung is probably waiting for them in your Z1 pool. If 4TB of space is enough, the cheapest solution is another 870 EVO in a mirror with your current one. Or go for two new DRAM-cached SSDs, which is easier as you can transfer your current pool to them without first destroying it. Random IOPS, TBW and pricing are your main comparison points.

Also consider that you are operating within SATA boundaries, not PCIe/NVMe. Getting SATA enterprise disks is probably not the best bang for the buck.
 
Is this 6 drives (a stripe of mirrors)? That could give you twice the write performance (with the performance increase of PLP on top of that) and up to 4 times the read performance. Or just one 3-way mirror? That gives you better write IOPS only because of the enterprise drives, but up to three times the read IOPS.
This would be 3x enterprise drives, in some sort of performance-oriented configuration, while the existing 3 are left for slow/bulk storage.

Don't use raidZ1 for active virtual disks (as mentioned before): https://forum.proxmox.com/threads/fabu-can-i-use-zfs-raidz-for-my-vms.159923/
Yeah, it's not ideal, but we needed *some* sort of redundancy. Also, I didn't initially plan to use this storage for active VMs so the setup was initially fine, but the situation has changed recently.

Get two fast enterprise drives and use them for active virtual disks (this can also survive one broken drive, like raidZ1). Use the consumer drives for slow storage, which can even be raidZ1.
This may be an option. Is 'don't fill them more than 50%' still a valid recommendation for enterprise SSDs? If I were able to use ~75% of the capacity, maybe we could get away with two 4TB drives and use the remaining slot to enlarge the slow storage pool.

If this is for a company then why use the consumer drives in the first place?
Money. We had a ***very*** limited budget for this node, and we chose to prioritize having more RAM and CPU over the quality of the storage. In hindsight it wasn't a bad call given the current circumstances: we were OK with addressing the consumer disk situation in the future (vs. having to pay 9000 bucks to Dell for a 64GB stick of RAM).

Or is this just for trying out Proxmox before using it in production? Maybe search the forum for raidZ1 and PLP to learn from other people's experience with slow storage and how they improved it (with better drives, without raidZ1). Maybe you should aim for redundancy across multiple Proxmox nodes in a cluster and less for redundancy of drives within a single node.
We definitely use Proxmox in production! It works great and I really like it. The thing is, all the common gotchas Proxmox has have definitely gotten us at some point, because we never had someone with deep enough knowledge of all the ways it can bite you in the ass. I do my best to research things before choosing and I dedicate the time I can to learning about it, but sometimes we have to learn from our mistakes. :)

Regarding your current setup, the Crucial disks are holding you back, as they lack DRAM cache and have poor random IOPS. The DRAM-cached Samsung is probably waiting for them in your Z1 pool. If 4TB of space is enough, the cheapest solution is another 870 EVO in a mirror with your current one. Or go for two new DRAM-cached SSDs, which is easier as you can transfer your current pool to them without first destroying it. Random IOPS, TBW and pricing are your main comparison points.
Thank you! Yeah, replacing the Crucial drives was definitely on my wishlist.

the Crucial MX500 4TB Drive can have DRAM Cache
These drives don't have DRAM. We scraped the bottom of the barrel as much as possible without going to no-brand drives. I'd be surprised if they can reach half their rated transfer speed at all; Crucial drives are kinda atrocious to use.
 
It's a cost calculation. You can either move critical vdisks to a mirror of 2x4TB enterprise drives or to a striped mirror of 4x2TB enterprise drives (which gives you the best performance). Using 4 additional drives will result in space problems in your rack/case (if I read correctly), so you would have to rearrange your data. For example: back up everything from the RAIDZ1, destroy the pool and set up a new mirror from the 2 Crucial drives (cold storage, etc.).
 