Hello everyone,
First, I want to say I know this question gets asked a lot, and I apologize for asking a similar one again. I've spent some time reading through the posts I could find, but honestly I don't have the time to translate other people's specific setups to mine, and I'm failing to find any generalized documentation to help me decide which route to go. To be frank, I don't have time to learn all the intricacies of storage configuration, and I'm hoping someone can just give me straightforward, simple answers.
I'm configuring a new hypervisor server for mostly homelab use, but it's an elevated homelab that will be running some personal-use servers and possibly a few servers for friends and family, so it's not strictly a playground. Performance is an important factor for me on some things. Up until now I've been testing Proxmox vs. ESXi to decide which way to go. I'm leaning toward Proxmox because of its built-in ZFS support, and because the server is an old R730 whose CPUs ESXi has deprecated, so later upgrades might become problematic. I'd rather commit to Proxmox now for hopefully better and longer hardware support than get invested in ESXi and find in a year or two that I can't upgrade because the CPU has been fully dropped.
So, all that said, my hardware: an R730 with 768 GB of RAM and 2 x E5-2690 Xeon processors. I also have a Tesla P40 in the server for vGPU, plus a 4060 Ti for single passthrough that will be used for some gaming and possibly some AI training. I might additionally put in a Tesla P4 or an RTX A2000 if I need more passthrough capacity. I may also get a couple of dual-NVMe PCIe cards if they would make a big enough difference in performance. Currently there are 13 x 4 TB 2.5-inch SSDs in the server, and I may get 3 more to fill it out to 16. The boot device is a single 500 GB SSD.
In my testing I've definitely been hitting some kind of IOPS bottleneck when running everything on the single SSD. With multiple VMs doing things simultaneously I get hangs and poor performance even when overall CPU usage is under 2%. I haven't figured out how to verify that it's an IOPS problem, but it's the only thing that makes sense, since with just one VM running (or only one actively doing anything) everything runs beautifully.
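To try to confirm the bottleneck, below is a minimal sketch of what I was planning to run on the host to sample IOPS on the boot drive. It just reads /proc/diskstats over a short interval; the sda device name is my assumption for the 500 GB boot SSD, so adjust as needed.

```python
#!/usr/bin/env python3
"""Rough IOPS sampler using /proc/diskstats (Linux only).

Prints reads/sec and writes/sec for one device over a short interval,
so I can see whether the single boot SSD is saturating while VMs hang.
"""
import time

INTERVAL = 5      # seconds to sample
DEVICE = "sda"    # assumption: the 500 GB boot SSD shows up as sda

def read_counters(dev):
    """Return (reads completed, writes completed) for the given device."""
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            if fields[2] == dev:
                # field 4 = reads completed, field 8 = writes completed
                return int(fields[3]), int(fields[7])
    raise ValueError(f"device {dev} not found in /proc/diskstats")

r1, w1 = read_counters(DEVICE)
time.sleep(INTERVAL)
r2, w2 = read_counters(DEVICE)

print(f"{DEVICE}: {(r2 - r1) / INTERVAL:.0f} read IOPS, "
      f"{(w2 - w1) / INTERVAL:.0f} write IOPS over {INTERVAL}s")
```

If those numbers sit near the drive's rated random IOPS while the VMs are hanging, I figure that would pretty much confirm the theory.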
So I've been considering the options. If I stayed with my 13 current drives, should I run 2 x 6-wide vdevs in RAIDZ2? I'll say up front that I don't really want to give up 6 drives' worth of storage to do a fully mirrored pool, which I gather would give the best performance. If I understand correctly, with 2 striped vdevs I should get roughly the IOPS of 2 disks? I'm worried that won't be enough, though, if I'm already hitting a bottleneck with a single disk. Maybe I should get the 3 extra 4 TB drives and run 4 x 4-wide RAIDZ1? But I'm pretty sure I read somewhere that performance suffers with an even number of disks in a RAIDZ1 vdev, so would 4 x 4 RAIDZ1 or 3 x 5 RAIDZ1 perform better?
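For my own sanity-checking, here's the back-of-envelope math I've been using to compare layouts. It assumes 4 TB per drive, ignores ZFS overhead and padding, and leans on the usual rule of thumb that random IOPS scale with vdev count (each RAIDZ vdev behaving roughly like a single disk), so treat these as planning numbers, not benchmarks.

```python
#!/usr/bin/env python3
"""Back-of-envelope usable capacity and IOPS scaling for the layouts
I'm weighing. Rule of thumb: random IOPS scale with the number of vdevs
(a RAIDZ vdev ~ one disk's random IOPS; a 2-way mirror vdev ~ one disk
for writes, a bit better for reads). Ignores ZFS overhead/padding."""

DRIVE_TB = 4

# (label, number of vdevs, disks per vdev, redundancy disks per vdev)
layouts = [
    ("2 x 6-wide RAIDZ2 (12 disks)", 2, 6, 2),
    ("3 x 5-wide RAIDZ1 (15 disks)", 3, 5, 1),
    ("4 x 4-wide RAIDZ1 (16 disks)", 4, 4, 1),
    ("8 x 2-way mirrors (16 disks)", 8, 2, 1),
]

for label, vdevs, width, redundancy in layouts:
    usable_tb = vdevs * (width - redundancy) * DRIVE_TB
    print(f"{label}: ~{usable_tb} TB usable, random IOPS ~ {vdevs} disk(s)")
```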
Then there's the possibility of the NVMe drives. With the main array being SSDs, would there be a noticeable improvement if I added mirrored NVMe drives as cache? I could put up to 4 x 2 TB NVMe drives in there, although that seems a little excessive capacity-wise just for caching. This is the part where I'm still really unsure and inexperienced.
For the VMs I'll be running, there will probably be around 10 Linux VMs doing various tasks: web servers, email servers, game servers (Minecraft, 7 Days to Die), etc. There will also be a couple of Windows VMs, possibly including a Windows Server VM. At least one Windows VM would be used as a cloud gaming server for when I'm traveling for work, so I don't have to lug around a big gaming laptop. Another few Windows VMs would serve as cloud workstations; I do contract work and generally set up a new environment for each contract to keep everything separated, and I may have up to 3 or 4 contracts going at once. I know this is a mix of business and homelab, but I don't make enough to justify a full-on enterprise setup, so I'm trying to make do with a hybrid system here.
All the VMs collectively might use 8 to 10 TB of storage, and I'd like to have some space left over for network shares. Knowing that SSD performance tanks as the pool fills up, I also want to keep some overhead free. So I'm thinking the total usable size of the array should be at least in the 20s, if not 30 TB.
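For what it's worth, here's the rough sizing math behind that target; the 80% fill ceiling and the allowance for network shares are just placeholder assumptions on my part.

```python
#!/usr/bin/env python3
"""Rough check of how much usable pool space I need.
Assumes I want to stay under ~80% full to keep SSD/ZFS performance up;
the 4 TB allowance for network shares is a placeholder guess."""

vm_storage_tb = 10       # upper end of my VM storage estimate
share_allowance_tb = 4   # placeholder for network shares
max_fill = 0.80          # keep the pool at or below ~80% full

needed_usable_tb = (vm_storage_tb + share_allowance_tb) / max_fill
print(f"Target usable pool size: ~{needed_usable_tb:.0f} TB")  # ~18 TB minimum
```

That comes out to roughly 18 TB as a floor, which is why I'm aiming for 20-30 TB usable to have real headroom.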
Also, just to note, the server has a dual 10 Gb uplink to the local network, so I don't think bandwidth will be an issue for local traffic. There's a dual 1 Gb upstream for remote connections (yes, I'm a crazy person with 2 ISPs at home; a firewall appliance handles load balancing across the two WANs).
If I missed any important details, please let me know.
I'd really appreciate it if anyone knowledgeable could just tell me the key points of a storage configuration that gets the best performance out of the hardware I have to work with. I think I want to go with ZFS, but I'm not completely sold on it if something else would be substantially better. I want some redundancy on the server, mostly to minimize downtime in the event of a failure. I don't need a ton of redundancy in the server itself, since I already have an on-site backup solution as well as an off-site backup solution in place (each with its own RAID redundancy).
Thanks for any advice!