Building Proxmox HCI Proof of Concept - Dell R240

bbx1_

Active Member
Nov 13, 2019
17
2
43
40
Ontario Canada
tweakmyskills.it
Hey everyone, I am looking to build a small 3-node cluster to test at work for small-scale deployments. A Proof of Concept (PoC), pretty much.

I plan on testing a hyper-converged configuration, no shared iSCSI storage for now.

I have three Dell R240 servers to use. These are unused servers I have in stock, so I'm repurposing them for Proxmox.

Server specs:
  • Single Intel Xeon E-2236 3.4GHz with 64GB of memory
  • Dell BOSS-S1 card with 2x 256GB SSDs in RAID-1 for the Proxmox OS installation
  • 4x Dell enterprise 960GB SSDs for storage (no Dell RAID used on the storage SSDs)
  • 2x 1GbE NICs (onboard)
I plan to add higher-bandwidth NICs to these servers. I can do 10GbE easily, or we can look at 40GbE NICs.

I am not sure whether I should connect the 10/25/40GbE NICs of each host directly to the other hosts, or use a 10/25/40GbE switch and have all hosts connect back to it. I think that if I were testing Ceph it would be best to use switches, but if I go with ZFS, would I be fine interconnecting the servers directly without a switch?

I was thinking of configuring the 4 SSDs in each server for ZFS, but I'm not sure which ZFS layout I would want. From what I've read, ZFS seems a better fit for my setup than Ceph, given the number of disks.
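
For example, one option I've seen suggested is two striped mirrors (ZFS "RAID10") registered as a Proxmox storage. Sketch only; the device names (/dev/sdb through /dev/sde) and pool/storage names are placeholders, and in practice you'd use /dev/disk/by-id paths:

Code:
    # Two mirrored pairs striped together (RAID10-style) across the four SSDs
    zpool create -o ashift=12 ssdpool \
        mirror /dev/sdb /dev/sdc \
        mirror /dev/sdd /dev/sde

    # Register the pool with Proxmox for VM disks and container volumes
    pvesm add zfspool ssd-zfs --pool ssdpool --content images,rootdir --sparse 1

RAIDZ across the four disks would give more usable space, but mirrors are usually recommended for VM workloads because of the better IOPS.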

As this is a Proof of Concept and will only see light business duty for IT staff, there is no real risk here. If/when we decide to move to Proxmox, the hardware will be current and properly built with a 3rd party vendor.

For those of you who are also testing Proxmox in production environments, do you have any suggestions for my setup? Is ZFS a good file system to use for our storage?

Any suggestions on what I could look at doing with this potential build? We have all of the hardware, so we would like to use what we have, as this is a Proof of Concept.

I am coming from a heavy Dell-VMware environment with shared iSCSI storage.

I will gladly document my build here and progress if it helps others.
 
A 3-node Proxmox Ceph PoC does work. I don't recommend it in production, where you want a minimum of 5 nodes so you can lose 2 nodes and still be in business.
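
For context: with the default replicated pool settings (size=3, min_size=2), a 3-node cluster blocks writes as soon as a second node goes down, which is why 5 nodes buys you that extra failure. Quick way to check or adjust; the pool name 'vm-pool' is just an example:

Code:
    # Show the replication settings on a pool
    ceph osd pool get vm-pool size
    ceph osd pool get vm-pool min_size

    # Defaults for new pools: 3 copies, 2 copies required for I/O
    ceph osd pool set vm-pool size 3
    ceph osd pool set vm-pool min_size 2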

I do have the BOSS-S1 set up in ZFS RAID-1 to boot Proxmox. I use a Dell HBA330 for "true" IT-mode functionality.
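
If you go the same route, two quick sanity checks after install (pool name 'rpool' is the Proxmox installer default):

Code:
    # Confirm the boot mirror on the BOSS card is healthy
    zpool status rpool

    # Confirm the data SSDs are presented as plain block devices by the HBA
    lsblk -o NAME,MODEL,SIZE,ROTA,TYPE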

While 64GB will work for a PoC, obviously the more RAM the better. The production clusters I manage have 512GB RAM per node.

Ceph can work at 1GbE, but faster is better. For a 3-node PoC with a 4-port NIC, you can do a full-mesh broadcast network per https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server and skip the switch. Otherwise, get a dedicated switch. I use 2 dedicated switches MLAG'd together for Ceph public, Ceph private, and Corosync traffic. Combining them like that isn't recommended, but it works in production.
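
For reference, the broadcast variant from that wiki page boils down to a bond over the two mesh-facing ports on each node. Rough sketch for node 1, assuming interface names ens1f0/ens1f1 and a 10.15.15.0/24 mesh subnet (use .2 and .3 on the other nodes):

Code:
    # /etc/network/interfaces snippet (node 1)
    auto bond0
    iface bond0 inet static
        address 10.15.15.1/24
        bond-slaves ens1f0 ens1f1
        bond-miimon 100
        bond-mode broadcast
    # Point the Ceph public/cluster network at 10.15.15.0/24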

I've been working with Proxmox since version 6, when VMware/Dell dropped official support for 12th-gen Dell servers. Earlier this year I migrated three 5-node Dell 13th-gen VMware clusters to Proxmox Ceph, running the latest version of Proxmox.

As a bonus, you don't need a 'vCenter'-style management instance to manage the cluster. Each Proxmox node in a cluster can manage the others. Win-win.

We're not hurting for IOPS. Workloads range from databases to DHCP servers. Everything is backed up to bare-metal Proxmox Backup Servers running ZFS on Dell HBA330s.
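
If it helps with the PoC, attaching a PBS datastore to PVE is a one-liner on the PVE side; the server address, datastore name, user, and fingerprint below are placeholders:

Code:
    # Add a Proxmox Backup Server datastore as backup storage in PVE
    pvesm add pbs pbs-backup \
        --server 192.168.1.50 \
        --datastore store1 \
        --username backup@pbs \
        --password <secret> \
        --fingerprint <pbs-certificate-fingerprint>
    # Then point a scheduled backup job (Datacenter -> Backup) at 'pbs-backup'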

I use the following optimizations learned through trial-and-error. YMMV.

Code:
    Set SAS HDD Write Cache Enable (WCE) (sdparm -s WCE=1 -S /dev/sd[x])
    Set VM Disk Cache to None if clustered, Writeback if standalone
    Set VM Disk controller to VirtIO-Single SCSI controller and enable IO Thread & Discard option
    Set VM CPU Type to 'Host'
    Set VM CPU NUMA on servers with 2 or more physical CPU sockets
    Set VM Networking VirtIO Multiqueue to 1
    Install the Qemu-Guest-Agent software inside the VM
    Set VM IO Scheduler to none/noop on Linux
    Set Ceph RBD pool to use 'krbd' option
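
In qm terms, most of the VM-side items above can be applied in one shot. VMID 100, the disk volume, and the bridge name are placeholders, and the scheduler line runs inside the Linux guest:

Code:
    # Apply the VM-side tuning above to VM 100 (existing scsi0 volume re-specified to add options)
    qm set 100 --scsihw virtio-scsi-single \
        --scsi0 local-zfs:vm-100-disk-0,iothread=1,discard=on \
        --cpu host \
        --agent enabled=1 \
        --net0 virtio,bridge=vmbr0,queues=1

    # Inside the Linux guest: use the 'none' IO scheduler for the virtual disk
    echo none > /sys/block/sda/queue/scheduler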
 