Ceph Design Recommendations

millap
Nov 13, 2019
Hi Guys,

We came across Proxmox a couple of months ago as a viable alternative to VMware for a managed solution we sell. As it has worked amazingly well (minus the cost of VMware on top), we've decided that our ageing VMware production cluster (3 x Dell R510 with EqualLogic PS4000 storage) could be replaced by Proxmox. Having paid for VMware vCenter, vSAN and associated licensing for a newer lab VMware cluster (2 x R730 with a quorum node), the costs of Proxmox are obviously much more competitive, and the feature set of Ceph mirrors some of what vSAN offers. Add Ansible management on top, and our needs for automation are also covered. To that end, I wonder if anyone can offer advice on how best to configure Ceph on the following hardware -

4 x Dell R440, each with:
1 x Xeon Silver 2801
128GB memory
8 x 2TB NL-SAS
2 x 930GB Mixed Use SAS SSD
Dual-port 10G Broadcom NIC
Dual-port 1G on-board NIC


We've already built the cluster of 4 nodes, using the 10G ports in an LACP bond (Open vSwitch) to a Virtual Chassis of two Juniper EX4300-MP switches, so resilience in case of an EX failure is covered -

[Image: Storage-Networkv1.png - storage network diagram]

The two remaining 1G NICs are also in an LACP bond (Open vSwitch), carrying the cluster management traffic. The OS is currently installed on a 400GB partition of one of the 930GB Mixed Use SSDs.

How would you recommend the Ceph storage be set up? We'd like most of the storage for VMs, but we do foresee container use in the future too.
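
For what it's worth, our understanding is that a single RBD pool can be exposed to both VMs and containers simply through the content types in /etc/pve/storage.cfg, along these lines (the storage and pool names below are just our placeholders) -

rbd: ceph-vm
        pool vm-pool
        content images,rootdir

So we're not planning to split VM and container storage unless there's a good reason to.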

Best regards
Andy
 
Hi Rob,

Thanks a lot for your response. The products look interesting, but as mentioned, we've already bought the Dell servers with the hardware specified, so I'm interested in how best to get Ceph set up on that hardware. For example, can we use the remaining 400+GB on the OS SSD as a Ceph journal/DB device? Is it best to make the other 930GB SSD a WAL-only disk, and so on?
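
For context, the sort of thing we had in mind (BlueStore DB rather than the old FileStore journal) was pointing each OSD's DB at the SSD when creating it, roughly as below. The device names are just examples and the exact option names may differ between PVE versions, so treat this as a sketch rather than a recipe -

# one spinning NL-SAS disk as data, a chunk of the SSD as its DB (the WAL lives with the DB by default)
pveceph osd create /dev/sdc --db_dev /dev/sdb4 --db_size 60
# repeated per NL-SAS disk; 'pveceph osd create --help' shows the options available on a given version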

We do already have a Ceph pool configured, but I wanted to hear suggestions from the Proxmox community on how best to use/configure it. The deployment is in a test phase at the moment, before we start migrating Windows, Linux and other workloads to the Proxmox platform.

Best regards
Andy
 
First, I am not a Ceph expert. A lot depends on how much disk I/O will occur.

I'd say you need very fast SSDs to use for journals. We have not used journals for a long time, so I can't give advice on how to set that up.

If the system gets laggy, you may want to consider using NVMe for the journal or switching over to an all-SSD system.

Hopefully you get more responses from someone who knows more about using journals. Good luck!
 
@millap
Nice HW/network setup for Ceph!
What is your LACP config?

If you have questions about the Ceph setup, you need to give some more details...
What SSDs/HDDs do you have, or are you just starting to plan?
What do you need to achieve (any use cases or requirements)?
 

@akxx

Thanks for the response.

The SSDs in each R440 are -

2 x 930GB Mixed Use SSD 12Gbps

One of those is already partitioned for the Proxmox OS.

There are then the following in each node -

8 x 2TB NL SAS

The LACP configuration on the EXs is currently this -

set interfaces ae0 description "*** AE0 - PM00 STORAGE ***"
set interfaces ae0 aggregated-ether-options lacp active
set interfaces ae0 unit 0 family ethernet-switching vlan members servers-san
set interfaces ae1 description "*** AE1 - PM01 STORAGE ***"
set interfaces ae1 aggregated-ether-options lacp active
set interfaces ae1 unit 0 family ethernet-switching vlan members servers-san
set interfaces ae2 description "*** AE2 - PM02 STORAGE ***"
set interfaces ae2 aggregated-ether-options lacp active
set interfaces ae2 unit 0 family ethernet-switching vlan members servers-san
set interfaces ae3 description "*** AE3 - PM03 STORAGE ***"
set interfaces ae3 aggregated-ether-options lacp active
set interfaces ae3 unit 0 family ethernet-switching vlan members servers-san
set interfaces mge-0/0/24 description "*** AE0 MEMBER ***"
set interfaces mge-0/0/24 ether-options 802.3ad ae0
set interfaces mge-0/0/25 description "*** AE1 MEMBER ***"
set interfaces mge-0/0/25 ether-options 802.3ad ae1
set interfaces mge-0/0/26 description "*** AE2 MEMBER ***"
set interfaces mge-0/0/26 ether-options 802.3ad ae2
set interfaces mge-0/0/27 description "*** AE3 MEMBER ***"
set interfaces mge-0/0/27 ether-options 802.3ad ae3
set interfaces mge-1/0/24 description "*** AE0 MEMBER ***"
set interfaces mge-1/0/24 ether-options 802.3ad ae0
set interfaces mge-1/0/25 description "*** AE1 MEMBER ***"
set interfaces mge-1/0/25 ether-options 802.3ad ae1
set interfaces mge-1/0/26 description "*** AE2 MEMBER ***"
set interfaces mge-1/0/26 ether-options 802.3ad ae2
set interfaces mge-1/0/27 description "*** AE3 MEMBER ***"
set interfaces mge-1/0/27 ether-options 802.3ad ae3


The interfaces on each PM node are configured like so -

allow-vmbr0 bond0
iface bond0 inet manual
ovs_bonds eno1 eno2
ovs_type OVSBond
ovs_bridge vmbr0
ovs_options bond_mode=balance-tcp lacp=active other_config:lacp-time=fast

allow-vmbr1 bond1
iface bond1 inet manual
ovs_bonds ens1f0np0 ens1f1np1
ovs_type OVSBond
ovs_bridge vmbr1
ovs_options other_config:lacp-time=fast bond_mode=balance-tcp lacp=active

auto lo
iface lo inet loopback

iface ens1f0np0 inet manual

iface ens1f1np1 inet manual

allow-vmbr0 vlan13
iface vlan13 inet static
address 10.10.13.2
netmask 255.255.255.0
gateway 10.10.13.1
ovs_type OVSIntPort
ovs_bridge vmbr0
ovs_options tag=13
ovs_extra set interface ${IFACE} external-ids:iface-id=$(hostname -s)-vif

allow-vmbr0 vlan200
iface vlan200 inet manual
ovs_type OVSIntPort
ovs_bridge vmbr0
ovs_options tag=200
ovs_extra set interface ${IFACE} external-ids:iface-id=$(hostname -s)-vif

iface eno1 inet manual

iface eno2 inet manual

allow-vmbr1 servers_san
iface servers_san inet static
address 10.10.12.2
netmask 24
ovs_type OVSIntPort
ovs_bridge vmbr1

allow-ovs vmbr0
iface vmbr0 inet manual
ovs_type OVSBridge
ovs_ports bond0 vlan13 vlan200

allow-ovs vmbr1
iface vmbr1 inet manual
ovs_type OVSBridge
ovs_ports bond1 servers_san


We're just looking for some sage advice from those who've done it before on how best to get speed/performance out of the hardware we have.
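
For reference, Ceph itself was initialised against the 10G SAN subnet (the servers_san VLAN above), roughly along these lines - commands from memory, so the exact sub-command names may vary slightly between PVE releases:

pveceph install
pveceph init --network 10.10.12.0/24
pveceph mon create    # run on each of the four nodes
pveceph mgr create    # likewise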

Cheers
Andy
 
I assume you have the 10-bay version. Personally I'd try to put Proxmox on a much smaller SSD (or a mirrored pair) in a 3.5" bay, then set up two pools: the 930GB pairs spanning the 4 nodes in 3/2, and the 8 x 2TB disks in a pool across all nodes as well. Put the storage network on the 10Gb links and make sure your VMs are bound to that bridge.

I don't know what type of drives you have, but depending on whether they're SSD or mechanical you can build performance or capacity pool types. I used HDDs for a media pool (large capacity, 20TB, good for static/object storage) and SSD and NVMe pools (1.5TB and 8TB respectively) as performant and high-performance pools to run VMs.
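
If you do end up with both drive types in one cluster, the usual approach (as I understand it, so double-check the option names on your version) is CRUSH device classes with a rule per class and a pool per rule, something like:

# rules pinned to the auto-detected hdd/ssd device classes
ceph osd crush rule create-replicated ssd-rule default host ssd
ceph osd crush rule create-replicated hdd-rule default host hdd

# one 3/2 pool per rule - pool names and pg_num here are placeholders, size them to your OSD count
pveceph pool create ssd-pool --size 3 --min_size 2 --crush_rule ssd-rule --pg_num 128 --add_storages
pveceph pool create hdd-pool --size 3 --min_size 2 --crush_rule hdd-rule --pg_num 256 --add_storages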

Trying to leverage another partition on a system disk for Proxmox to me is a pain. I did it but it just felt like a hack, and not worth the effort or maintenance. Just get a few more disks and then you'll have 2 pool types.

You also have an even number of nodes, which I think isn't ideal, as quorum can be lost?
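
If the even node count bothers you, I believe you can add a corosync QDevice on some small external box as a tie-breaker rather than buying a fifth server - something like the below, where the IP is obviously just a placeholder for whatever that external host is:

# on the external tie-breaker host (plain Debian is fine)
apt install corosync-qnetd

# on every cluster node
apt install corosync-qdevice
# then from one node
pvecm qdevice setup 10.10.13.50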

I'm relatively new to Proxmox, or rather coming back to it after 7 years, and it's great. I run a 5-node cluster with:
  • 3 nodes with low-clock/high-core-count CPUs and lots of RAM:
    • 2.1GHz / 8c/16t / 128GB
    • 4 x 10TB HDD (hdd-pool)
    • 1 x 120GB SSD (system)
    • 1 x 500GB SSD (ssd-pool)
    • 3 x 2TB NVMe (nvme-pool)
  • 2 nodes with high clock speed, fewer cores and low RAM:
    • 3.8GHz / 4c/8t / 16GB
    • 1 x 120GB SSD (system)
    • 1 x 500GB (local-lvm)
      • I don't want Ceph OSDs on these, other than mon/mgr roles, as the maximum RAM is small.
  • All nodes can access the Ceph volumes of course, and most run VMs on the pools. For quick throw-away VMs I'll use local disks.
  • Mgr/Mon on all 5.
I don't really know your use case, but to me this is a pretty good spread of the resources.
 
