Question for PVE Staff/Devs - Enabling cephadm, what are the known (with evidence) problems?

BloodyIron · 2024-11-07T19:51:57+0100

Hey Folks,

Hoping to get some response from the Proxmox Staff/Devs on this. Namely looking for evidence of known problems in a particular scenario.

I'm currently building out a PoC for a 3-node PVE cluster, where PVE also provisions a Ceph cluster, and I extend that Ceph cluster to provide HA NFS (Active/Active/Active) via NFS Ganesha and other related components.

Now, I've seen comments from PVE staff/devs, from 2022/2023 and earlier, that enabling any orchestrator (cephadm/rook/whatever) is likely to cause problems when PVE provisioned the Ceph cluster via pveceph and related tooling. However, each time I try to drill deeper into this, I don't really find any evidence of what actual problems have been observed in this scenario, or whether the warning is based on speculation or actual evidence.

So, while I can _speculate_ on how it _MIGHT_ cause problems, I really do want/need to hear what tangible problems have been observed in such a scenario (with evidence).

Also, at this stage of my PoC I'm using Ceph v17.x, and later stages will involve an in-place upgrade from v17.x to v18.x, so maybe things get better with that release?

I've also found threads suggesting that enabling cephadm may be "just fine", but those are kinda loosey-goosey on evidence too (apart from their single implementation instance).

The goal here is to extend the Hyper-Converged Infrastructure value that a PVE-provisioned Ceph cluster provides. I care first about an HA NFS endpoint for Kubernetes usage, but I also want to re-use the same NFS endpoint (VIP) for non-Kubernetes stuff throughout the target environment (as in, not the PoC environment).

Just so it is said, there is ZERO valuable data to lose in my PoC environment, this has been built from scratch to identify gotchas, validate, test, explode, succeed, make scary noises, whatever. And I am fully prepared to rebuild it from scratch as many times as it takes to work out all the kinks (not that I want to kink shame).

I can also appreciate that enabling an orchestrator may or may not complicate PVE-related updates, so I'm hoping we can work together to overcome $futureProblems too.

Either way, hoping to hear from Proxmox peeps for tangible evidence. I am really wanting to _AVOID_ speculation here, I have enough tabs of that right now.

Thanks!
