[SOLVED] A smoother Ceph reset experience

Mar 16, 2024
I really like what Proxmox VE has done for creating a cluster. I'd *prefer* that there be a smoother reset experience for removing and re-adding a node to a cluster than going to the command line, but I can live with hand-jamming a one-liner. Ceph is *not* like that - and in general I've found that making it easy to get "into the ditch" and hard to get back out is *not* a winning product strategy.

Along those lines, I've been working on setting up a Ceph configuration that has its own 10G network. Cool, right? And hopefully useful, too. So I connect all of the machines to the cluster over 2.5G, and then I go into networking and set up the 10G ports "by hand":

1714511929460.png

So - thinking that I've gotten "past the hump" - I proceed with the Ceph install, which brings up an error in the Configuration panel:

Bash:
Multiple IPs for ceph public network '192.168.1.204/24' detected on host1:192.168.1.104/192.168.1.204 use 'mon-address' to specify one of them. (500)

So you have to bail out of that UI to hand-jam a value into the ceph.conf... OK. Except it's *not* OK.
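
For anyone hitting the same message: it comes down to subnet arithmetic, since both of host1's addresses sit inside the same /24. Here is a minimal Python sketch of that check, using only the two addresses and the network quoted in the error. The helper name `addresses_in_network` is made up for illustration, and this is not a claim about how Proxmox implements the check internally.

```python
import ipaddress

def addresses_in_network(host_addrs, network_cidr):
    """Return the host addresses that fall inside the given network."""
    # strict=False accepts a host address with a prefix, e.g. 192.168.1.204/24,
    # and normalizes it to the enclosing network (192.168.1.0/24).
    net = ipaddress.ip_network(network_cidr, strict=False)
    return [a for a in host_addrs if ipaddress.ip_address(a) in net]

# Values copied from the error message above.
host1_addrs = ["192.168.1.104", "192.168.1.204"]
matches = addresses_in_network(host1_addrs, "192.168.1.204/24")
print(matches)
```

Both addresses match the public network, which is consistent with the wizard asking for an explicit `mon-address`.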

Come back to the Configuration screen and everything is frozen. Step out of that screen by clicking away from and back to the Ceph installation dialog and a radosgw error pops up. So *then* I'm like "forget about it - I'll just reinstall and start over", only to get capped on applying the license.

Don't get me wrong - I'm not expecting a Windows-styled Fisher-Price toy experience. But what I *am* expecting is that if you go to the trouble of creating an install wizard experience - that stepping *out* of it cleanly in a known recoverable state should be "table stakes" for production release. [jumps off soapbox] ;)

To my question... is there anything in the screen grab above that is incorrect? And more broadly is there *any* way to use the Ceph install wizard in the web UI with anything other than a single network? If I had to abandon it completely to get the install done - I'm OK with that. Since I'm in a holding pattern waiting for my paid licenses to be reset I can certainly take time to do a deeper dive into Ceph. But I don't want to simply take the longer road if there's a detail missing that would let me get through the gauntlet with the "wizard experience". Thanks!
 
Thanks for the note. If it's not obvious I didn't realize that was a factor. I thought VLAN sequestration would do it, and would be something I would be able to apply after the initial setup was configured.

I don't really mind either way, I'm just trying to firm up the mental model on link layer versus network layer separation when it comes to setting up Ceph on dual-networked machines. Thanks again!
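
On the link layer versus network layer question: judging by the error message earlier in the thread, the complaint is about IP subnets, so a VLAN on its own would not make two interfaces in the same /24 look separate. A rough Python sketch of that distinction follows. It assumes the 2.5G and 10G ports carry the two addresses from the error message, and it borrows 192.168.2.201/24 from the config posted later in this thread as the "separate subnet" case. The function name `networks_overlap` is hypothetical.

```python
import ipaddress

def networks_overlap(cidr_a, cidr_b):
    """True if the two interface addresses belong to overlapping IP subnets."""
    a = ipaddress.ip_network(cidr_a, strict=False)
    b = ipaddress.ip_network(cidr_b, strict=False)
    return a.overlaps(b)

# Same /24 on both ports: separate cables or VLANs, but one IP network.
same_subnet = networks_overlap("192.168.1.104/24", "192.168.1.204/24")

# Different /24 on the 10G port: separate at the network layer too.
separate_subnet = networks_overlap("192.168.1.104/24", "192.168.2.201/24")

print(same_subnet, separate_subnet)
```

The first pair overlaps and the second does not, which is the difference between the original setup and the one that ended up working.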
 
Thanks for the sage advice, @gurubert! For now I'm "keeping it simple" and putting everything on the 10G network and letting it ride.

1714790205619.png

Bash:
[global]
    auth_client_required = cephx
    auth_cluster_required = cephx
    auth_service_required = cephx
    cluster_network = 192.168.2.201/24
    fsid = 85620ac2-ba47-4252-9943-a5e331a4146e
    mon_allow_pool_delete = true
    mon_host = 192.168.2.201 192.168.2.202 192.168.2.203 192.168.2.204
    ms_bind_ipv4 = true
    ms_bind_ipv6 = false
    osd_pool_default_min_size = 2
    osd_pool_default_size = 3
    public_network = 192.168.2.201/24

[client]
    keyring = /etc/pve/priv/$cluster.$name.keyring

[client.crash]
    keyring = /etc/pve/ceph/$cluster.$name.keyring

[mon.spkez1]
    public_addr = 192.168.2.201

[mon.spkez2]
    public_addr = 192.168.2.202

[mon.spkez3]
    public_addr = 192.168.2.203

[mon.xan]
    public_addr = 192.168.2.204
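
As a sanity check on a config like the one above, here is a small Python sketch that parses the relevant keys and confirms every monitor's `public_addr` falls inside `public_network`. It assumes a single `public_network` entry and uses a trimmed copy of the posted config. The function name `mons_outside_public_network` is made up for illustration, and `configparser` is only a convenient stand-in, not Ceph's own parser.

```python
import configparser
import ipaddress

# Trimmed copy of the config above (only the keys this check needs).
CEPH_CONF = """\
[global]
    cluster_network = 192.168.2.201/24
    mon_host = 192.168.2.201 192.168.2.202 192.168.2.203 192.168.2.204
    public_network = 192.168.2.201/24

[mon.spkez1]
    public_addr = 192.168.2.201

[mon.spkez2]
    public_addr = 192.168.2.202

[mon.spkez3]
    public_addr = 192.168.2.203

[mon.xan]
    public_addr = 192.168.2.204
"""

def mons_outside_public_network(conf_text):
    """Return {section: addr} for monitors whose public_addr is outside public_network."""
    # interpolation=None so characters like '$' or '%' in values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    public_net = ipaddress.ip_network(parser["global"]["public_network"], strict=False)
    outside = {}
    for section in parser.sections():
        if section.startswith("mon.") and parser.has_option(section, "public_addr"):
            addr = ipaddress.ip_address(parser[section]["public_addr"])
            if addr not in public_net:
                outside[section] = str(addr)
    return outside

result = mons_outside_public_network(CEPH_CONF)
print(result)
```

An empty result means all four monitors are on the public network. An address from the old 192.168.1.x range would show up in the returned dict.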
 