Hi everyone,
I am reaching out because this is the first time I am sizing a cluster using Proxmox VE and Ceph. Since I lack direct experience with these solutions, I would appreciate your feedback on my proposal.
The cluster will support a VDI environment. Based on the workload profile we are targeting, these are the resources required for 100 concurrent users:
- vCPU: 910
- RAM GB: 3,500
- Disk TB: 20
- GPU GB: 800
I was unable to find a Ceph sizing calculator that accounted for both usable capacity and resource overhead (CPU/RAM consumption), so I developed the following calculations based on the Ceph sizing parameters I researched.
Proposed server resources (per node)
- CPU cores (2 x 64 cores, vCPU at 2:1): 128
- RAM GB: 1,024
- Disk TB (6 x 3.2 TB NVMe mixed-use): 19.2
- GPU GB (NVIDIA L4 24GB): 192
Ceph Sizing
RAW Storage TB (6 x 19.2): 115.2
Number of replicas: 3
Usable Storage TB: 38.4
Safe nearfull ratio: 0.8
REAL Usable Storage TB: 30.72
Ceph RAM requirements (GB): 180
Ceph CPU requirements (cores): 30
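The capacity math above can be sketched as a quick check. The 180 GB RAM figure is consistent with roughly 5 GB per OSD across 36 NVMe OSDs (6 drives x 6 nodes), which is my assumption rather than something stated in a calculator:

```python
# Sketch of the Ceph capacity math above, using this post's figures.
NODES = 6
RAW_PER_NODE_TB = 6 * 3.2        # six 3.2 TB NVMe drives per node
REPLICAS = 3                     # replicated pool, size=3
NEARFULL_RATIO = 0.8             # stay under Ceph's nearfull warning

raw_tb = NODES * RAW_PER_NODE_TB             # ~115.2 TB raw
usable_tb = raw_tb / REPLICAS                # ~38.4 TB after replication
real_usable_tb = usable_tb * NEARFULL_RATIO  # ~30.72 TB safely fillable

# RAM overhead: 180 GB matches ~5 GB per OSD across 36 OSDs
# (6 drives x 6 nodes) -- an assumption, tunable via osd_memory_target.
osds = NODES * 6
ceph_ram_gb = osds * 5                       # 180 GB
print(round(raw_tb, 1), round(usable_tb, 1), round(real_usable_tb, 2), ceph_ram_gb)
```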
Based on the Ceph resource consumption shown above, these are the total estimated resources required for the cluster:
100 Users + Ceph resource requirements
- vCPU: 970
- RAM GB: 3,680
- Disk TB: 20
- GPU GB: 800
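For clarity, here is my reading of how the combined totals line up: the Ceph CPU requirement (30 physical cores) appears to be converted to vCPU at the same 2:1 ratio before being added to the user workload.

```python
# How the combined totals above line up (my reading of the arithmetic).
USER_VCPU, USER_RAM_GB = 910, 3500
CEPH_CORES, CEPH_RAM_GB = 30, 180
VCPU_PER_CORE = 2  # the same 2:1 oversubscription applied to the Ceph cores

total_vcpu = USER_VCPU + CEPH_CORES * VCPU_PER_CORE  # 910 + 60 = 970
total_ram_gb = USER_RAM_GB + CEPH_RAM_GB             # 3500 + 180 = 3680
print(total_vcpu, total_ram_gb)
```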
And here are the total hardware specs for the proposed 6-node cluster:
Cluster HW Total resources
- vCPU (2 to 1): 1,536
- RAM GB: 6,144
- Disk TB (usable): 38.4
- GPU GB (NVIDIA L4 24GB): 1,152
I am assuming a hypervisor overhead of approximately 10% for Proxmox, which leaves:
- vCPU (2 to 1): 1,382
- RAM GB: 5,530
- Disk TB: 30.7
- GPU GB (NVIDIA L4 24GB): 1,152
After additionally capping utilization at a reasonable 80% threshold, the usable resources are:
- vCPU (2 to 1): 1,106
- RAM GB: 4,424
- Disk TB: 30.7
- GPU GB (NVIDIA L4 24GB): 921
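The two reductions can be chained as a quick check. Note that in my figures the GPU memory is only subject to the 80% utilization cap, not the hypervisor overhead:

```python
# Chain the ~10% hypervisor overhead and the 80% utilization cap.
TOTAL_VCPU, TOTAL_RAM_GB, TOTAL_GPU_GB = 1536, 6144, 1152
OVERHEAD, UTIL = 0.90, 0.80

usable_vcpu = round(TOTAL_VCPU * OVERHEAD * UTIL)      # ~1106
usable_ram_gb = round(TOTAL_RAM_GB * OVERHEAD * UTIL)  # ~4424
usable_gpu_gb = int(TOTAL_GPU_GB * UTIL)               # 921 (GPU skips the overhead)
print(usable_vcpu, usable_ram_gb, usable_gpu_gb)
```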
If my sizing is correct, the total node count should be 7: 6 nodes to cover the workload requirements and Ceph resource consumption, plus 1 for High Availability.
Regarding networking:
- Ceph Back-end/Storage Traffic: I am planning to use two dual-port 25GbE NICs.
- Front-end/Client Connectivity: I am planning to use two dual-port 10GbE NICs.
Are there any additional considerations or best practices I should keep in mind for this specific sizing?
Thanks for your help.