New install. Proxmox + Ceph cluster config questions.

Deeh514

New Member
Oct 15, 2024
Greetings Proxmox community!

I'm planning to migrate my VMs, mostly web servers, from VMware to Proxmox sometime next year.
I'm using this opportunity to try Ceph, which I've heard a lot about.

For now, I've built a 3-node cluster to test HA and Ceph.
Proxmox version: 8.2.7
Ceph version: 18.2.4 reef (stable)

Each server has:
-2x 500GB SSD drives in RAID1 for the PVE install
-2x 1TB SSD drives for Ceph, each drive exported as a single-drive RAID0
-4x 1G ports: 1 for management, 2 (LACP) for VMs
-4x 10G ports: 2 (LACP) for the Ceph cluster network and 2 (LACP) for the CephFS public network
-All 10G traffic is configured with jumbo frames (bond config sketch below)
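
For reference, each 10G bond looks roughly like this in /etc/network/interfaces (interface names and the address are placeholders, and MTU 9000 is what I mean by jumbo frames):

auto bond1
iface bond1 inet static
    address 10.10.10.11/24
    bond-slaves enp65s0f0 enp65s0f1
    bond-miimon 100
    bond-mode 802.3ad
    bond-xmit-hash-policy layer3+4
    mtu 9000
#Ceph cluster network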

Created a manager, a monitor, OSDs, and an MDS on each node.
Created Ceph pool "my-ceph-pool" and CephFS "my-cephfs".
Added "mds_standby_replay = true" to ceph.conf under each MDS section.
Set the HA shutdown_policy to migrate (config sketch below).
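
For completeness, the relevant bits of config look roughly like this (pve1/pve2/pve3 are placeholders for my actual hostnames):

# /etc/pve/ceph.conf
[mds.pve1]
    mds_standby_replay = true
[mds.pve2]
    mds_standby_replay = true
[mds.pve3]
    mds_standby_replay = true

# /etc/pve/datacenter.cfg
ha: shutdown_policy=migrate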

VM setup:
-Two Rocky Linux 9 VMs stored on "my-ceph-pool".
-Each VM has access to the 10G LACP bond for CephFS traffic.
-Inside each VM, "my-cephfs" is mounted under /mnt/cephfs (fstab sketch below).
-HA is enabled on each VM.
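
The mount inside the VMs uses the kernel client via /etc/fstab, roughly like this (monitor IPs, the client name and the secret file path are placeholders):

10.10.20.11,10.10.20.12,10.10.20.13:/  /mnt/cephfs  ceph  name=cephfs-client,secretfile=/etc/ceph/cephfs-client.secret,fs=my-cephfs,noatime,_netdev  0  0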

Simulating a graceful reboot/shutdown of a node / the active MDS:
-VMs migrate without any interruption prior to the reboot/poweroff
-CephFS fail-over takes approx. 75 seconds

Simulating a power failure on a node / the active MDS:
-VMs on the node take approx. 180 seconds to fail over
-CephFS fail-over takes approx. 75 seconds

Stopping the active MDS manually:
-CephFS fail-over takes approx. 3-4 seconds
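
For clarity, "stopping manually" above means something along these lines on the node running the active MDS (pve1 is a placeholder hostname):

systemctl stop ceph-mds@pve1
# or, failing the rank so a standby takes over:
ceph mds fail my-cephfs:0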

Proxmox HA on the VMs is a bonus; eventually the VMs will run their own master/slave failover mechanism (keepalived) internally.
My main priority is the performance and uptime of the CephFS share.
I'll probably end up moving the VMs to local LVM, getting rid of "my-ceph-pool", and using Ceph strictly for CephFS.
Eventually the cluster will grow to 13 identical nodes, with 15 or so VMs connecting to the CephFS share.

With all that in mind, I was hoping to get answers to some questions:
1. Is there an option similar to HA's "shutdown_policy:migrate" to put the MDS into a "stopped" state prior to a reboot/poweroff?
2. While reading the Ceph docs I stumbled upon "mds_reconnect_timeout" (default 45s) and "mds_beacon_grace" (default 15s). Has anyone had experience tweaking these settings? I didn't find much info on these forums. (A rough example of how I'd adjust them is below.)
3. I also read about running multiple MDS daemons for a single CephFS. Are there any (dis)advantages to that?
4. A similar question about multiple CephFS filesystems: is it preferable to have one CephFS share with 8 subfolders, or 8 separate CephFS shares?
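
For question 2, this is roughly how I would read and adjust those values if tweaking them turns out to be sensible (the numbers are made-up examples, not recommendations):

ceph config get mds mds_reconnect_timeout
ceph config get mds mds_beacon_grace
# example values only
ceph config set mds mds_reconnect_timeout 30
ceph config set global mds_beacon_grace 20
# beacon grace set at "global" since, as I understand it, the monitors also evaluate it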

I'm still new to Proxmox and Ceph, so any other recommendations and improvements are more than welcome.

Thanks!
 
