[SOLVED] Could you please suggest an optimal NIC configuration for a Proxmox HA cluster with Ceph?

Hi there.

I have a Proxmox cluster built on 4 nodes. They have pretty much the same hardware spec:
2x CPU sockets, 256GB of RAM and 3 network cards:
1x 2-port 40Gb
1x 2-port 10Gb
1x 4-port 1Gb
1x 2.4TB SSD hardware RAID5
1x 2TB NVMe M.2

Because some configs are difficult to change and I don't want to rebuild my cluster again, could you please suggest which cards (ports) I should choose for a smooth cluster with HA and Ceph?

Here is my plan:
1) I need 3x 1Gb ports for VMs.
2) The NVMe drives are planned for Ceph. Since I don't expect top performance from them, I will put VMs there that don't require fast disks.
3) The SSD drives, which are currently on hardware RAID5, will most likely have to be reconfigured for ZFS. These are for VMs that read and write a lot.
4) I want every VM to be in HA.

Which cards (ports) would you assign for:
1) Cluster
2) Ceph cluster
3) Ceph public

I have a Mellanox managed switch for my 40Gb network so I can configure LACP LAG.
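For reference, on the VM side I would bridge each 1 Gb port roughly along these lines; this is just a sketch, and the NIC names (eno1 etc.) are placeholders, not my actual config:

# /etc/network/interfaces (sketch; same pattern for the other two 1 Gb ports)
auto eno1
iface eno1 inet manual

auto vmbr1
iface vmbr1 inet manual
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0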
 
Hi there,

I'd recommend the following setup (but there may also be other valid choices):
  • Ceph Public: 2x 10 Gb LACP bond (LAG, one port to each switch)
  • Ceph Cluster: 2x 40 Gb LACP bond (LAG, one port to each switch)
  • Corosync Link0 on a physical, dedicated 1 Gb link
  • (The best choice would be to also run a second dedicated Corosync link over the other switch. As I understood it, though, you only want to spend one dedicated interface on Corosync because you need the other three 1 Gb ports for VMs. You could instead use a LAG of 2x 1 Gb for your VMs and dedicate the remaining 1 Gb port, on the second switch, to Corosync Link1 - that would be the best choice.)
  • Corosync Link1 on Ceph Cluster
  • Corosync Link2 on Ceph Public (optional)
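
A minimal /etc/network/interfaces sketch for that layout could look like this; the interface names and addresses are assumptions, adjust them to your hardware:

# 2x 40 Gb LACP bond for the Ceph cluster network (names/addresses are examples)
auto bond0
iface bond0 inet static
        address 10.10.10.1/24
        bond-slaves enp65s0f0 enp65s0f1
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3

# 2x 10 Gb LACP bond for the Ceph public network
auto bond1
iface bond1 inet static
        address 10.10.20.1/24
        bond-slaves enp1s0f0 enp1s0f1
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3

# dedicated 1 Gb port for Corosync Link0
auto eno1
iface eno1 inet static
        address 10.10.30.1/24

The additional Corosync links then just reuse the addresses of those networks; per node, the nodelist in /etc/pve/corosync.conf would contain something like (addresses again only examples):

node {
    name pve1
    nodeid 1
    quorum_votes 1
    ring0_addr 10.10.30.1
    ring1_addr 10.10.10.1
    ring2_addr 10.10.20.1
}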
Other recommendations:
  • Don't use Ceph on top of a RAID.
  • Don't use Ceph with a hardware RAID controller.
  • If you want to use local ZFS + replication for HA, keep in mind that the replication is asynchronous: if an HA failover happens, you automatically start an old version of your VM. Even if it's only a minute old, Active Directory or some databases don't like this (see the example below this list).
  • While you can use a single disk in each host for Ceph, this is not the best choice. If one of your nodes fails, the self-healing feature of Ceph recovers the missing objects; the usage levels on the remaining 3 disks in the remaining 3 nodes will then increase and they may run full.
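
To make the replication point concrete: on Proxmox VE, storage replication is configured per guest with a schedule via pvesr, for example (the VM ID and node name are placeholders):

# replicate VM 100 to node pve2 every 15 minutes
pvesr create-local-job 100-0 pve2 --schedule '*/15'

Everything written since the last completed sync is lost when HA starts the replica on the other node, which is exactly why the asynchronous behaviour matters for databases.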
 

Is there any option to configure the Corosync links in the GUI? I cannot find one.
 
