Proxmox Ceph without PVE Cluster

brosky

Well-Known Member
Oct 13, 2015
Hi,

I have a 14-node Proxmox cluster with Ceph.
Is it possible to add additional Ceph members (with PVE installed) that are not members of the PVE cluster?

I'm wondering about real-life setups of large clusters and whether there are any issues.
 
Is it possible to add additional Ceph members (with PVE installed) that are not members of the PVE cluster?
Only the other way around: a PVE node can be part of the cluster without Ceph, either to run VMs and nothing more, or without VMs as a quorum-only node.

But I don't understand the problem. Merely having a node in a cluster doesn't generate any load (besides the corosync traffic and some pings).
 
@mr44er As I understand it, there are issues with large PVE clusters (over 16 machines) due to increased chatter on the cluster VLAN.

My goal here is to grow the Ceph cluster without growing the PVE cluster.
 
Where is this info from? Yes, problems arise when corosync ping times between nodes go up, or time out after more than 60 seconds.
To prevent this, a dedicated (and redundant) network for corosync alone is recommended. 1 Gbit/s of bandwidth is enough.

If you run corosync in a VLAN, other traffic on the same cable will eat into the bandwidth and drive ping times on your corosync VLAN through the roof.
You may be able to work around this with traffic limits, but a separate network with its own bandwidth is the better approach in every way.
 
There is no hard limit on how many Proxmox VE nodes can form a cluster. Up to 25 nodes is what we hear works well. Above that, you might run into some issues that need some Corosync tuning. But again, there are no hard numbers, and with more powerful hardware some of the issues that show up in larger clusters also disappear. So please don't take this number as a hard fact when future you reads this in 5 or 10 years. ;)

To the original question: if we are talking about a Proxmox VE + Ceph (HCI) cluster, then all nodes that are part of the Ceph cluster need to be part of the Proxmox VE cluster as well. The ceph.conf file is shared via the pmxcfs (/etc/pve) with all the other nodes, and certain Ceph-specific API calls on the Proxmox VE layer need to be relayed to the actual node.
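To illustrate the shared ceph.conf point, here is a minimal Python sketch that checks whether a config path resolves to a file below the pmxcfs mount. It assumes the usual HCI layout where /etc/ceph/ceph.conf is a symlink to /etc/pve/ceph.conf; the function name is made up for this example and is not part of any Proxmox tooling.

```python
import os
from pathlib import Path


def is_shared_via_pmxcfs(conf_path, pmxcfs_root="/etc/pve"):
    """Return True if conf_path resolves to a file below pmxcfs_root.

    Hypothetical helper for illustration only. The default paths are
    assumptions about a typical Proxmox VE + Ceph node.
    """
    resolved = Path(os.path.realpath(conf_path))
    root = Path(os.path.realpath(pmxcfs_root))
    return root in resolved.parents


if __name__ == "__main__":
    # On a hyper-converged node this is expected to be a symlink into /etc/pve.
    print(is_shared_via_pmxcfs("/etc/ceph/ceph.conf"))
```

A node outside the PVE cluster has no pmxcfs mount to pull this file from, which is one reason the Ceph members need to be PVE cluster members too.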
 
Thank you, all clear.

At what latency does corosync start to behave badly?
 
Again, it is hard to name a hard limit. With a larger cluster, the timeout increases, for example. The recommendations you see are good rules of thumb.
The man page for corosync.conf explains the config options quite well. You want to look at the knet_ping_* and token options to get a better idea of how timeouts are calculated. The rules of thumb are useful for keeping latency low enough, because the delays add up with each trip needed.
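As a rough illustration of how the timeout grows with cluster size, here is a small Python sketch of the calculation as I read it in corosync.conf(5): with three or more nodes in the nodelist, the effective token timeout is token + (nodes - 2) * token_coefficient, and the knet ping values default to fractions of that timeout. The defaults used below (token 3000 ms, token_coefficient 650 ms, knet_pong_count 2) are assumptions that may differ between corosync versions, so check the man page on your own system. The function names are made up for this example.

```python
def effective_token_ms(nodes, token_ms=3000, token_coefficient_ms=650):
    """Effective token timeout in ms for a cluster with `nodes` members.

    The coefficient only applies when the nodelist has at least 3 nodes.
    Default values are assumptions; verify them against corosync.conf(5).
    """
    if nodes >= 3:
        return token_ms + (nodes - 2) * token_coefficient_ms
    return token_ms


def knet_ping_defaults(token_timeout_ms, knet_pong_count=2):
    """Return (knet_ping_interval, knet_ping_timeout) in ms.

    Mirrors the documented defaults: interval = token / (pong_count * 2),
    timeout = token / pong_count. Explicit settings override these.
    """
    ping_interval = token_timeout_ms / (knet_pong_count * 2)
    ping_timeout = token_timeout_ms / knet_pong_count
    return ping_interval, ping_timeout


if __name__ == "__main__":
    for nodes in (14, 25):
        token = effective_token_ms(nodes)
        interval, timeout = knet_ping_defaults(token)
        print(nodes, token, interval, timeout)
```

This only shows why there is no single latency number to quote: the budget scales with the node count and with whatever values are set in corosync.conf.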

Another thing to keep in mind, in very large clusters, is CPU speed. If the CPU is too slow to keep up, you will also see issues.
 
