Advice Regarding Proxmox Cluster Hardware Setup

HomelabHobbyistSK

New Member
Oct 4, 2021
Hello Everyone,

For quite some time I have been looking into building myself a server cluster to run a virtual environment that will host various services, both internal to my homelab and some that are externally accessible. I have been looking into using Proxmox to achieve this and have done quite a bit of research into the disk/network requirements needed to set up a properly redundant Ceph cluster. Though I understand how everything works, I just want to make sure I have the right idea in terms of hardware, as I will likely need to purchase some drives for these servers while also trying to keep costs as low as possible.

Currently, I have three empty Dell R710s (Gen 2, 3.5" drive version) running dual Intel L5640s and 64GB of memory. All of the nodes will have a Dell PERC H200 HBA to manage the drives, plus dual 10Gb network ports alongside the original four Ethernet ports on the back. The areas I am somewhat unsure of are how the storage and networking will need to be set up for the cluster. Below is my proposed setup, but I know it may not be perfect o_O.

Storage:
  • OS: Dual SSDs in ZFS RAID 1 for redundancy
    • Though enterprise SSDs are likely the best option, I was just curious whether there would be any significant impact if I chose to use a pair of consumer SSDs instead? I know ZFS can take a toll on the disks in terms of how much data gets written to them, but if they are just handling the OS, will it cause significant wear on the consumer SSDs? (I have put a rough endurance estimate in the sketch just after this list.)
    • VMs will not be run from these disks at all but will get their own separate OSD
  • Journal: 100GB Intel DC S3700
    • During my research I found that a lot of people recommended an enterprise SSD for the journal due to the sheer amount of writes it will experience
  • OSDs: 1TB SSD to store all VM data and 2x 2TB WD Black 7.2k HDDs for additional storage
    • If a consumer-grade SSD is used as an OSD, will it still experience burnout (similar to what I am afraid of for the OS SSDs)?
    • Any issues with using HDDs over SAS?
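To put the wear question into numbers, here is a minimal back-of-envelope sketch in Python. The TBW rating and daily write volume are assumptions I picked purely for illustration rather than measurements from these servers, and the math ignores write amplification; real figures would come from the drive's spec sheet and its SMART counters.

```python
# Back-of-envelope SSD endurance estimate (illustrative numbers only).
# Assumptions: a consumer OS SSD with a 150 TBW warranty rating and roughly
# 30 GB/day of OS, log and cluster metadata writes.

tbw_rating_tb = 150.0        # assumed warranty rating of the consumer SSD
writes_gb_per_day = 30.0     # assumed average daily writes to the OS disk

years_to_tbw = (tbw_rating_tb * 1000.0) / writes_gb_per_day / 365.0
print(f"TBW rating reached after roughly {years_to_tbw:.1f} years")

# Heavier logging (for example from Ceph monitor/manager daemons) can push the
# daily writes well above this guess, which is the usual argument for
# enterprise/DC SSDs with higher TBW/DWPD ratings even when they only hold the OS.
```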
Networking:
  • Ethernet Port 1: Management Port on Management VLAN
  • Ethernet Port 2: Corosync Ring 1 (connected to Switch #1 to act as a heartbeat)
  • Ethernet Port 3: Corosync Ring 2 (connected to Switch #2 to act as a backup heartbeat)
  • Ethernet Port 4: Not in use
  • 10Gb Port 1: Proxmox Private Network (Cluster Replication)
  • 10Gb Port 2: Proxmox Public Network (LAN access)
  • Management RJ45: iDRAC access

I know this post is kind of chunky/long, but hopefully I am understanding everything correctly. Please feel free to correct me or pick apart anything I may be misunderstanding, as I know my proposed solution may not be "perfect". I am just slowly trying to understand everything a little bit better before I dive in.

Any feedback is greatly appreciated!

Thanks for taking the time to read my post :)
 
Regarding consumer SSDs for the OS: rather opt for DC SSDs, especially if you run Ceph on it, as it will create a lot of logs that will be written to them, and that might reduce the lifespan of consumer SSDs considerably. While you are at it, try to mix different manufacturers or at least different batches of the same model, in the hope that they will not all fail within a very short time.

OSDs: Consider the failure domains.

If the SSD used as the OSD WAL/DB device fails, all OSDs in that node will fail and you will be in a degraded state, as only 2 of the 3 replicas are accessible. Once the SSD has been replaced and the OSDs on that node recreated, Ceph will create the third replicas on them again.

If one of the 2 HDDs for the OSDs fails, you still have an OSD available on that node, and Ceph will try to recreate the third replicas that were on the failed HDD on the remaining one, so that every node holds a full copy again. Therefore, it is likely that this OSD will get very, if not completely, full. It would be better to add more, somewhat smaller OSDs per node so that you can also handle the failure of a single OSD well. This is of less concern if you have more than 3 nodes, because the data can also be spread over the other nodes, but in the special case of a 3-node cluster it can be a problem.
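To make that concrete, here is a rough sketch with made-up pool utilization numbers, assuming the default pool size of 3 and failure domain "host", so every node holds one full copy of the pool data spread over its local OSDs:

```python
# Rough sketch of a single-OSD failure in a 3-node cluster, pool size 3,
# failure domain "host". Pool utilization values are made up for illustration.

osd_size_tb = 2.0                                # each WD Black OSD
osds_per_node = 2
node_capacity_tb = osd_size_tb * osds_per_node   # 4 TB of one replica per node

for pool_used_fraction in (0.30, 0.45, 0.60):
    data_per_node_tb = node_capacity_tb * pool_used_fraction
    # If one of the two OSDs fails, Ceph recreates its placement groups on the
    # surviving OSD of the same node, which then has to hold the node's whole copy:
    survivor_fill = data_per_node_tb / osd_size_tb
    print(f"pool {pool_used_fraction:.0%} full -> surviving OSD at {survivor_fill:.0%}")
```

In other words, once the pool is roughly half full, a single HDD failure already leaves the surviving OSD with no room for the recovered data, which is exactly why more and smaller OSDs per node help in a 3-node cluster.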

Network:
The 2 separate networks for the corosync links are good, and especially important if you plan to use the HA stack of Proxmox VE :)
10Gb Port 1: Proxmox Private Network (Cluster Replication)
Do you mean that this network will be used for Ceph alone? Then it looks good.

For the network, you will also have to think about what can go wrong and what you want to catch, especially regarding having two (stacked) switches and using bonds to combine network interfaces for redundancy. But considering that this is a homelab where uptime might not be too important and you are on a budget, having a single switch could also be considered an okayish solution.
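As a quick way to reason through the "what can go wrong" part, here is a small sketch that walks through single switch failures. The mapping of roles to switches is my assumption of how the plan above would be cabled (one corosync link on each access switch, both 10Gb ports on a single 10Gb-capable switch):

```python
# Which functions survive a single switch failure, given an assumed cabling of
# the proposed layout: corosync ring 1 on Switch #1, corosync ring 2 on
# Switch #2, and both 10Gb ports on one 10Gb switch.

links = {
    "corosync ring 1":     "switch1",
    "corosync ring 2":     "switch2",
    "ceph/cluster (10Gb)": "switch10g",
    "vm/public (10Gb)":    "switch10g",
}

for failed in ("switch1", "switch2", "switch10g"):
    surviving = [role for role, sw in links.items() if sw != failed]
    corosync_ok = any(role.startswith("corosync") for role in surviving)
    ceph_ok = "ceph/cluster (10Gb)" in surviving
    print(f"{failed} down -> corosync link left: {corosync_ok}, Ceph traffic: {ceph_ok}")
```

The corosync links survive the loss of either access switch, but everything Ceph and VM related hangs off the single 10Gb switch, which is the trade-off you accept with the budget setup.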
 
Thank you so much for your response!

I have decided to go with a few Intel S3500s for my OS drives, along with a 100GB Intel S3700 as my journal. For the HDDs, I am not too worried if one of them dies, as I have plenty of spare 2TB drives that I can quickly swap in. Out of the six I plan to use for this build, I will have another four to use as replacements.

And though my two switches aren't officially "stacked", one is just used for all the small appliances in my house (WAPs, TVs, cameras, etc.), while the other is my fancy "server" switch. Both have redundant connections to my firewall, so if I lose one I thankfully won't have the entire network collapse :)

With all your helpful info I think I am ready to pull the trigger and start building/testing my first node. Thank you so much again!
 
