Proxmox control node

IvanF · Mar 6, 2024

Hello everyone.
Pardon my English; I am using a translator.

Maybe I searched the wrong way and this has already been answered somewhere. If so, please forgive the duplication.

I plan to make a 5 node Proxmox cluster:
2 nodes - data center 1
2 nodes - data center 2
1 node - another place

My problem is with the last node.
It will serve exclusively as a control node for Proxmox and Ceph; no virtual machines will run on it.

So far everything is fine and clear. But what if it fails? How can I replace it effectively (preferably automatically)?
I am wondering whether it is possible to build a Linux cluster and somehow include it as 1 control node. Is something like that possible?

Thanks for any ideas.
Ivan

shanreich · Mar 6, 2024

Since you want to span the nodes across data centers: What is the latency between the two sites? Proxmox VE clustering only works with very low latency; even a few ms between the sites would be too much. This is particularly important if you want to use HA, since nodes will fence themselves as soon as corosync loses quorum.

Regarding the single node: it would make sense to set it up as a QDevice and have a Ceph monitor on it as a witness. You can find more information on that in our wiki [1], which also covers the procedure for removing / adding QDevices.
I struggle to understand exactly what you mean by automatic replacement, but that is not something we offer - you would have to implement it yourself. You might also want to look into Ceph stretch clusters [2], but please read the downsides carefully before deciding to run one.

[1] https://pve.proxmox.com/wiki/Cluster_Manager#_corosync_external_vote_support
[2] https://docs.ceph.com/en/latest/rados/operations/stretch-mode/
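
As a rough illustration of the quorum point above, here is a minimal Python sketch of the majority rule applied to the layout from this thread. It assumes one vote per node plus one vote for the witness, and it ignores any special corosync options; the site names and helper functions are made up for the example.

```python
# Simplified majority-vote model (an assumption, not corosync's full logic):
# the cluster stays quorate while the surviving votes form a strict majority
# of all expected votes.

def votes_needed(total_votes: int) -> int:
    """Smallest number of votes that is a strict majority of total_votes."""
    return total_votes // 2 + 1


def is_quorate(alive: dict[str, int], expected_total: int) -> bool:
    """True if the surviving votes still reach the majority threshold."""
    return sum(alive.values()) >= votes_needed(expected_total)


# Hypothetical labels for the layout in this thread:
# 2 nodes per data center plus 1 witness vote at the third location.
VOTES = {"dc1": 2, "dc2": 2, "witness": 1}
TOTAL = sum(VOTES.values())


def surviving(failed: set[str]) -> dict[str, int]:
    """Votes that remain after the given sites have failed completely."""
    return {site: v for site, v in VOTES.items() if site not in failed}


if __name__ == "__main__":
    for failed in [{"witness"}, {"dc1"}, {"dc1", "witness"}]:
        alive = surviving(failed)
        state = "quorate" if is_quorate(alive, TOTAL) else "NOT quorate"
        print(f"lost {sorted(failed)}: {sum(alive.values())}/{TOTAL} votes -> {state}")
```

Under these assumptions the cluster survives the loss of the witness alone or of one whole data center alone, but not of a data center and the witness together. Ceph monitors form their own majority, so the same kind of check would have to be repeated for them.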

IvanF · Mar 6, 2024

Thank you for your response.
Latency is good. It is currently a 10G network, and an upgrade to 100G is planned.

I will definitely read the materials you pointed me to carefully. Thank you for that.

The setup is designed with 2 servers in each of the 2 data centers. Together with the 5th node, which would handle control (QDevice, Ceph monitor), they would form 1 Proxmox cluster. But I wanted to have PVE installed there as well (simple installation of Ceph, management of the entire environment from this server too). My (psychological) problem is that the 5th node becomes a SPoF. So I'm considering how to make it replaceable as efficiently as possible. That's why I thought about building it as a cluster, which would solve it for PVE...

EDIT:
Even after the network upgrade, the 5th node will only have a 10G connection.

bbgeek17 · Mar 6, 2024

IvanF said:
My (psychological) problem is that the 5th node becomes a SPoF. So I'm considering how to make it replaceable as efficiently as possible. That's why I thought about building it as a cluster, which would solve it for PVE...

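A failure of any one of the 5 nodes costs the cluster a vote. Quorum itself survives (4 of 5 votes remain), but the safety margin is gone: losing a whole site on top of that can then cost you quorum. So you will need to replace the failed node regardless.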

If budget is of no concern, how about this: install a 3-node PVE cluster at site 3, run the vote for the first PVE cluster (the one split across sites 1 and 2) as a VM on it, take regular snapshots, and create an Ansible playbook for a full rebuild.

Some may say it's a bit of overkill.

PS: you can also manage PVE from any node in the cluster; there is no "master GUI" node.
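
For the simpler QDevice variant suggested earlier in the thread, the "full rebuild" idea could be sketched roughly as an ordered command plan. This is a dry run in Python rather than an actual Ansible playbook: it only prints the steps and executes nothing. The commands follow the QDevice procedure on the wiki page linked as [1] as I understand it, so please verify them against the current documentation. The function name and the IP address are placeholders, and the sketch assumes root SSH access from the cluster nodes to the new witness host.

```python
import shlex


def qdevice_replacement_plan(new_qdevice_ip: str) -> list[tuple[str, list[str]]]:
    """Return (where to run, command) steps for swapping the external vote.

    Nothing is executed here; the caller decides how to run each step.
    """
    return [
        # Drop the dead QDevice from the cluster configuration first.
        ("any cluster node", ["pvecm", "qdevice", "remove"]),
        # Install the qnetd daemon on the replacement witness host.
        ("new witness host", ["apt", "install", "-y", "corosync-qnetd"]),
        # Usually already present on the nodes; installing again is harmless.
        ("every cluster node", ["apt", "install", "-y", "corosync-qdevice"]),
        # Register the new external vote with the cluster.
        ("any cluster node", ["pvecm", "qdevice", "setup", new_qdevice_ip]),
        # Confirm the expected number of votes afterwards.
        ("any cluster node", ["pvecm", "status"]),
    ]


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-range placeholder address.
    for where, cmd in qdevice_replacement_plan("192.0.2.10"):
        print(f"[{where}] {shlex.join(cmd)}")
```

Keeping the plan as plain data makes it easy to review before anything is run, and the same list could be fed into whatever automation tool you prefer. It does not cover the Ceph monitor on the witness, which would need its own rebuild steps.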

Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox

IvanF · Mar 6, 2024

Of course, budget matters to every company, but it is not the main limitation here.
The 3rd location is tiny. The only good thing about it is that it has direct optical connections to locations 1 and 2. Therefore, it is suitable for running an independent control node. There is no disk array there, and there will be no 100G switch.

Yes, being able to manage everything from any node is an awesome feature of Proxmox.
I want to have PVE with the GUI on each of the nodes, mainly in case a colleague who is not a Linux person has to deal with it. A QDevice cannot really be managed from the GUI.

bbgeek17 · Mar 6, 2024

IvanF said:
So far everything is fine and clear. But what if it fails? How can I replace it effectively (preferably automatically)?
I am wondering whether it is possible to build a Linux cluster and somehow include it as 1 control node. Is something like that possible?

Theoretically, you can create a Corosync/Pacemaker Linux cluster and put the QDevice service under its control. I doubt something like this is documented, so if you do achieve it, feel free to post it on the forum as a tutorial for others.

Good luck


IvanF · Mar 6, 2024

Yes. I had something like that in mind. I just need to solve the Ceph monitor in this setup.

I will keep searching and trying.

Of course, I'll share my findings when I find a solution.

Thanks guys

LnxBil · Mar 7, 2024

IvanF said:
Latency is good. It is currently a 10G network, and an upgrade to 100G is planned.

How do those two relate? What do you mean by "latency is good" in numbers?

IvanF · Mar 8, 2024

Yes, those are 2 different things: connection speed and transfer latency. My mistake.

But the latency between 2 Proxmox nodes looks fine.

rtt min/avg/max/mdev = 0.116/0.149/0.183/0.024 ms

LnxBil · Mar 8, 2024

IvanF said:
Yes, those are 2 different things: connection speed and transfer latency. My mistake.
But the latency between 2 Proxmox nodes looks fine.

rtt min/avg/max/mdev = 0.116/0.149/0.183/0.024 ms

Thank you for the clarification. Yes, that looks good for a stretched PVE cluster. I just wanted us to be on the same page.
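
For anyone who wants to check this repeatedly rather than by eye, here is a small Python sketch that parses the `rtt` summary line printed by `ping` and compares it to a latency budget. The 1.0 ms budget is only an assumed placeholder, not an official Proxmox limit; the thread only says that a few ms between sites would already be too much. The function names are made up for the example.

```python
import re

# Matches the summary line printed by Linux ping, e.g.
# "rtt min/avg/max/mdev = 0.116/0.149/0.183/0.024 ms"
RTT_RE = re.compile(
    r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_rtt(summary: str) -> dict[str, float]:
    """Extract min/avg/max/mdev (in milliseconds) from ping's summary line."""
    match = RTT_RE.search(summary)
    if match is None:
        raise ValueError("no rtt summary line found")
    keys = ("min", "avg", "max", "mdev")
    return dict(zip(keys, map(float, match.groups())))


def within_budget(rtt: dict[str, float], max_ms: float) -> bool:
    """Check the worst observed round trip against an assumed budget."""
    return rtt["max"] <= max_ms


if __name__ == "__main__":
    line = "rtt min/avg/max/mdev = 0.116/0.149/0.183/0.024 ms"
    rtt = parse_rtt(line)
    # max_ms=1.0 is a hypothetical threshold chosen for illustration only.
    print(rtt, within_budget(rtt, max_ms=1.0))
```

The check uses the maximum rather than the average, since occasional spikes matter more to corosync than the typical round trip. A single short ping run is only a snapshot, so it would make sense to measure between all sites and under load as well.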
