Best ZFS Configuration for 15 Drives

LunarMagic

Member
Mar 14, 2024
So I'm doing a new Proxmox install and I'm trying to figure out how I should lay it out. I have fifteen 12 TB hard drives, a 64-core EPYC CPU, and 1 TB of RAM.

I've seen some people say that for this I should run a single raidz3; that way three drives could fail and I'd still be fine, while getting the performance and storage of all of them together.

Others were saying to separate them into two different raidz2 vdevs.

I'm not sure what is best, but all of the storage would be used for virtual machines. At least 200 of them.
 
I'm not sure what is best, but all of the storage would be used for virtual machines. At least 200 of them.
VMs typically need a lot of random small read/write IOPS, which a single raidz does not provide. Maybe a stripe of three 5-disk raidz vdevs? Or a stripe of five 3-way mirrors? Or sell them for enterprise SSDs? Maybe test the various configurations before installing Proxmox or taking the VMs into production?
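
For reference, a minimal sketch of what those two layouts would look like at pool-creation time; the sdb..sdp device names are placeholders (in practice, use /dev/disk/by-id paths):

    # stripe of three 5-disk raidz1 vdevs
    zpool create tank raidz sdb sdc sdd sde sdf \
                      raidz sdg sdh sdi sdj sdk \
                      raidz sdl sdm sdn sdo sdp

    # stripe of five 3-way mirrors
    zpool create tank mirror sdb sdc sdd mirror sde sdf sdg \
                      mirror sdh sdi sdj mirror sdk sdl sdm \
                      mirror sdn sdo sdp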
 
So you are thinking 3 sets of RAIDz1?

Also to note, I will be running this as a 3-node HA cluster using Ceph.

I definitely agree that more IOPS is better, but SSD pricing is just too high compared to the HDDs I already have.
 
So you are thinking 3 sets of RAIDz1?
I'm not sure; I don't have the personal experience, but you might need several (or many) raidz(1 or 2 or 3) vdevs combined to get good IOPS. I think you can find some similar threads on the forum for first-hand experience.
Also to note, I will be running this as a 3-node HA cluster using Ceph.
Ceph needs at least 3 nodes, so 3 nodes does not give you any margin for losing one. Do you have 3x15 HDDs? Or only 5 HDDs per node?
I definitely agree that more IOPS is better, but SSD pricing is just too high compared to the HDDs I already have.
I fear that HDDs will not make you(r VM users) happy. But maybe test some configurations to (prove me wrong and) be sure?

EDIT: A physical machine with 1 HDD feels slow. 200 VMs at the same time on one big redundant drive (a single raidz, which has roughly the small-write IOPS of one disk) will be much worse. 200 VMs on 15 drives still means about 13 VMs sharing each drive's IOPS. But then again, I don't have the experience or performance numbers to back up my fear.
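
If you want numbers instead of fear, a quick fio run per candidate layout would show the random-IOPS ceiling before any VMs exist. A minimal sketch, assuming the pool is mounted at /tank:

    # 4k random 70/30 read/write mix, roughly what a fleet of VMs generates
    fio --name=vmtest --directory=/tank --size=10G \
        --runtime=60 --time_based --ioengine=libaio \
        --rw=randrw --rwmixread=70 --bs=4k \
        --iodepth=32 --numjobs=4 --group_reporting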
 
So it's 3 separate machines I have; I'm talking about just one of them. Each of my 3 machines has 10+ drives, so I was going to apply what you suggested for one of them to the rest.

I'll look up the difference in IOPS between Z1 and Z2.

On the Ceph question: each node in the cluster will have around the same total storage.

On the last part: I currently have a single machine running ESXi (I'm moving to Proxmox to get away from it) with fifteen 10 TB SAS hard drives in a RAID 60 configuration. I'm not saying it runs them great, but they are all Ubuntu VMs and the load from the 200 I'm currently running hasn't been too intense.
 
I'll look up the difference in IOPS between Z1 and Z2.
I don't think there is one. Read/write/IOPS/bandwidth differences exist between stripes, mirrors, and raidz. Make sure to read up on how much performance and usable space you get (for various blocksizes) with raidz (it's also covered somewhere on this forum).
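
For a rough feel with fifteen 12 TB drives (rule-of-thumb numbers before padding and raidz allocation overhead, not measurements):

    1 x raidz3 (15 disks):  (15-3) x 12 TB = 144 TB, small-write IOPS of ~1 disk
    2 x raidz2 (8+7):       (6+5)  x 12 TB = 132 TB, IOPS of ~2 disks
    3 x raidz1 (5 each):    3 x 4  x 12 TB = 144 TB, IOPS of ~3 disks
    5 x 3-way mirror:       5      x 12 TB =  60 TB, IOPS of ~5 disks (reads higher)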
On the Ceph question: each node in the cluster will have around the same total storage.
Please note that if one node goes down, Ceph panics and might not be usable. Finding the right Ceph configuration (with which I also have no experience) is probably also something you'll need someone else's experience and suggestions for.
On the last part: I currently have a single machine running ESXi (I'm moving to Proxmox to get away from it) with fifteen 10 TB SAS hard drives in a RAID 60 configuration. I'm not saying it runs them great, but they are all Ubuntu VMs and the load from the 200 I'm currently running hasn't been too intense.
RAID60 sounds like it would be closest to a stripe of two raidz2 vdevs.
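
A sketch of that ZFS analogue, again with placeholder device names; one of the 15 drives is left as a hot spare since two raidz2 vdevs split evenly at 7+7:

    zpool create tank raidz2 sdb sdc sdd sde sdf sdg sdh \
                      raidz2 sdi sdj sdk sdl sdm sdn sdo \
                      spare sdp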

EDIT: If you're going for Ceph, then the raidz question might be irrelevant: https://pve.proxmox.com/wiki/Deploy_Hyper-Converged_Ceph_Cluster
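
In that case each node hands its raw disks straight to Ceph, no raidz involved. The wiki boils down to something like this per node (the 10.10.10.0/24 subnet is a placeholder):

    pveceph install                        # install the Ceph packages
    pveceph init --network 10.10.10.0/24   # once, on the first node
    pveceph mon create                     # a monitor on at least 3 nodes
    pveceph osd create /dev/sdb            # one OSD per raw disk, repeat per drive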
 
Oh interesting, I didn't realize Ceph just manages the disks itself and I don't have to put them into any kind of RAID. That's pretty cool! So based on this, it looks like the only thing I still need is 10 Gb/s switches exclusively for the nodes to communicate on. Is that correct?
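
On the network question: Ceph distinguishes a public network (client/VM traffic) from an optional cluster network (OSD replication), and putting the latter on its own fast switch is the usual advice. A sketch of the relevant ceph.conf section, with assumed subnets:

    [global]
        public_network = 10.10.10.0/24     # clients and monitors
        cluster_network = 10.10.20.0/24    # OSD replication, dedicated switch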
 
