[SOLVED] Ceph & iSCSI HA - How to configure the network?

benoitc

Member
Dec 21, 2019
I am looking for some guidance to finalize the setup of a 3-node Proxmox cluster with Ceph and shared iSCSI storage. While it's working, I am not really happy with the Ceph cluster's resilience.

Each node has 2x 10GbE ports and 2x 480GB SSDs dedicated to Ceph (plus 2x 256GB NVMe storage for the system and other things). The NAS has one 10GbE and 2x 5GbE ports (plus 2x 1GbE). Following some past discussions I have set up the following for now:

  • 2 distinct networks using 2 MikroTik CRS312-4C+8XG-RM switches, each with a total non-blocking throughput of 120 Gbps, a switching capacity of 240 Gbps and a forwarding rate of 178 Mpps
  • The Ceph network only goes over 1 switch, in 1 subnet
  • The 2x 5GbE ports of the NAS are connected to 1 switch, while the 10GbE port is connected to the other
  • I am using multipath sessions for iSCSI on each node (see the sketch after this list).
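For reference, the multipath part looks roughly like this on each node (a sketch only; the portal IPs, the IQN and the multipath.conf values are placeholders, not my real setup):

    # discover the target on both portals and log in (one iSCSI session per path)
    iscsiadm -m discovery -t sendtargets -p 192.168.10.10
    iscsiadm -m discovery -t sendtargets -p 192.168.20.10
    iscsiadm -m node -T iqn.2019-12.lan.nas:target0 --login

    # /etc/multipath.conf (minimal)
    defaults {
        find_multipaths yes
        path_grouping_policy multibus
    }

    # verify that both paths are seen
    multipath -ll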

[Attached network diagram: IMG_0354-2.png]

What is not on the picture is the backup server, connected to each switch with one 10GbE link, which is also supposed to use the iSCSI storage. Unfortunately I can't grow its storage much since it only has 2x 240GB SSDs (2.5"), except maybe by using USB3.


The issue with the design above is that while iSCSI really uses 2 different networks, the Ceph network is only on one switch.

Can we have the Ceph network working on 2 different networks using VLANs for the same OSDs/monitor nodes, etc.? I didn't find a way to do it in the web UI, but maybe it's possible on the command line?
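To make that question more concrete: as far as I understand, on the Proxmox side the Ceph networks end up as single subnets in /etc/pve/ceph.conf, currently something like this (the subnet is just an example of my single-switch setup):

    [global]
        public_network  = 10.10.10.0/24
        cluster_network = 10.10.10.0/24

So the question is really whether that can be spread over the two switches for the same MONs/OSDs.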

Also, should I separate the iSCSI network from the Ceph network and put them in distinct VLANs?
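If VLANs are the way to go, I guess on the node side that would just be VLAN interfaces in /etc/network/interfaces, something like this (interface names, VLAN IDs and addresses are purely illustrative):

    # Ceph VLAN
    auto vlan100
    iface vlan100 inet static
        address 10.10.100.11/24
        vlan-raw-device eno1

    # iSCSI VLAN
    auto vlan200
    iface vlan200 inet static
        address 10.10.200.11/24
        vlan-raw-device eno1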


Any hint/feedback is welcome :)
 
A couple more questions.

For now, as I said, each node has 2x 256GB NVMe M.2 disks and 2x 480GB SSDs used for Ceph; the M.2 card uses the only available PCIe 3.0 x8 slot. I am wondering if a better way to handle what I need would be to replace this M.2 card with a network card to extend the number of 10GbE ports, replace the system disks with a single NVMe drive, and later replace the 480GB disks with 1TB disks. Thoughts?

Also, I could probably use an adapter like this to reuse the M.2 disks: https://www.qnap.com/fr-fr/product/qda-a2mar

If I don't change the configuration, I am wondering whether using a LAG between the 2 switches plus VLANs would work?
 
For Ceph you will need to create a bond. Or a switched setup with STP and a path cost.
 
For Ceph you will need to create a bond. Or a switched setup with STP and a path cost.
You mean bonding between the 2 switches and bonding of the interfaces on each node? Should I balance the traffic or use an active-backup strategy?
 
You mean bonding between the 2 switches and bonding of the interfaces on each node?
I meant the interfaces; a LAG between the switches of course wouldn't help.

Should I balance the traffic or use an active-backup strategy?
When two different switches are involved, you are better off using active-backup. Other modes would need further configuration on the switches.
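As a rough sketch, an active-backup bond on a node would look like this in /etc/network/interfaces (the interface names and address are just an example, adapt them to your setup):

    auto bond0
    iface bond0 inet static
        address 10.10.10.11/24
        bond-slaves enp1s0f0 enp1s0f1
        bond-mode active-backup
        bond-miimon 100
        bond-primary enp1s0f0
        # one port goes to each switch; only the active one carries traffic

With active-backup the switches themselves don't need any special configuration.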
 
