[SOLVED] (safe/best) way to temporarily operate a cluster with only one node?

zaphyre

Oct 6, 2020
Hi! I currently operate a production PVE 7 cluster of three nodes for an internal development team. To save energy, my boss wants to run all VMs/CTs on one cluster node and temporarily shut down the other two, then-empty nodes. Each node has plenty of RAM/resources, and we could additionally switch off lots of VMs/CTs (though we need to keep them), so migrating and running everything on one node should work in practice from a client/boss perspective.

With two thirds of the cluster nodes turned off, the cluster has no quorum and is limited: it cannot even perform backups. What is the recommended way of running a configuration like this? If possible, I want to keep the cluster setup intact, as it will definitely be reused in the future.

Thanks a lot.
 
Maybe keep the 3 nodes and add 2 very low-power QDevices? Then, when shutting down 2 of the 3 nodes, you would still have 3 of 5 votes and therefore quorum.
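
To make the vote arithmetic concrete, here is a minimal Python sketch. It assumes the usual majority rule (quorum needs more than half of all expected votes); the function names are made up for illustration and are not part of any Proxmox or corosync API.

```python
def quorum_threshold(total_votes: int) -> int:
    """Votes needed for a strict majority of total_votes."""
    return total_votes // 2 + 1


def has_quorum(online_votes: int, total_votes: int) -> bool:
    """True if the votes still online form a majority."""
    return online_votes >= quorum_threshold(total_votes)


# Three nodes with one vote each, only one node left running.
print(has_quorum(online_votes=1, total_votes=3))  # False: needs 2 of 3

# Same cluster plus two extra votes that stay online.
print(has_quorum(online_votes=3, total_votes=5))  # True: needs 3 of 5
```

So the single remaining node only keeps quorum if enough additional votes stay reachable while the other two nodes are off.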
 
Hi @Dunuin , thanks for your help! Sounds very interesting…

So, like putting two low-power Intel NUCs or other (more) energy-efficient devices running PVE in the rack and adding them to the cluster, for quorum only? Cool! :)

Having five nodes again would allow me to temporarily shut down the big iron - and just get rid of the extra NUCs when we switch on the other nodes again.

There is no option to tell the cluster that it is „just fine“ to run standalone, is there?
 
Some people increase the votes per node for maintenance. Let's say you give the main node 3 votes and the two secondary nodes 1 vote each. Then, when shutting down the secondary nodes, the main node still has 3 of 5 votes. But this can of course be a problem when all 3 nodes are running and the main node fails: the remaining two secondary nodes then only have 2 of 5 votes and stop working.
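
A rough Python sketch of that trade-off, assuming the same majority rule (more than half of all votes). The node names and the helper `has_quorum` are hypothetical and only illustrate the counting, not how corosync is configured.

```python
# Hypothetical vote weights: one heavy main node, two light secondaries.
votes = {"main": 3, "secondary1": 1, "secondary2": 1}


def has_quorum(online_nodes, votes):
    """True if the online nodes hold a strict majority of all votes."""
    total = sum(votes.values())
    online = sum(votes[name] for name in online_nodes)
    return online >= total // 2 + 1


# Planned maintenance: both secondaries are shut down on purpose.
print(has_quorum(["main"], votes))  # True: 3 of 5 votes

# Unplanned failure of the main node while everything was running.
print(has_quorum(["secondary1", "secondary2"], votes))  # False: 2 of 5 votes
```

The weighting helps exactly one node survive alone, at the price of making that node a single point of failure for quorum.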

And you don't even have to install PVE on the qdevices. They don't have to be full nodes. See here: https://pve.proxmox.com/wiki/Cluster_Manager#_corosync_external_vote_support

Even a 2 W Raspberry Pi will work as a QDevice, but a NUC or, better, a Supermicro/Dell/HP/Lenovo thin client with ECC and mirrored boot disks would of course be a more reliable choice.
 
Is there a way to set "pvecm expected 1" permanently, so that it still applies after a restart?
IMO high availability is degraded when quorum is lost because some node(s) are unavailable. This might not be a problem in clusters with a large number of nodes, but in a small cluster (2 nodes) it often is.
 
You really don't want that permanently with 2 nodes + HA.

If you have a split brain, the VM will be restarted on the other node while it is still running on the initial node. If the storage is shared, you'll destroy the VM filesystem.

(And you'll destroy /etc/pve if processes are writing to the folder on both nodes at the same time.)

"pvecm expected 1" needs to be used manually, when you are sure that 1 of the 2 nodes is dead.
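
The split-brain risk can be sketched in a few lines of Python. This is only an illustration under the assumption that each node decides on its own whether it is quorate, using a majority of the expected votes; `is_quorate` and `partition_view` are invented names, not real tooling.

```python
def is_quorate(visible_votes: int, expected_votes: int) -> bool:
    """A node is quorate if it sees a majority of the expected votes."""
    return visible_votes >= expected_votes // 2 + 1


def partition_view(expected_votes: int) -> dict:
    """Two-node cluster whose link is cut: each node sees only its own vote."""
    return {
        "node-a": is_quorate(1, expected_votes),
        "node-b": is_quorate(1, expected_votes),
    }


# Default: 2 expected votes. Neither side is quorate, so nothing is started twice.
print(partition_view(expected_votes=2))  # {'node-a': False, 'node-b': False}

# Expected votes forced to 1 on both nodes: both sides believe they may act.
print(partition_view(expected_votes=1))  # {'node-a': True, 'node-b': True}
```

With the permanent override, a mere network problem between two healthy nodes is enough for both to start the same VM.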
 
> you really don't want to have it permanently with 2nodes + HA.
I really do want it, at least in the beginning: 2 nodes + HA. I am not running industrial-scale hosting operations, so I don't see the reason to invest in more machines just for this without knowing whether the app will work. Nor do I have people hired to monitor the systems 24/7. And there are probably others like me.

And the fact that the cluster cannot operate properly on a single node is a problem. Here is a simple and very plausible scenario in which a 2-node cluster goes down for good until someone intervenes to fix it:

Let's say you have a 2-node cluster and, after a power failure, one node dies from, say, a power spike, while the second one survives and manages to restart. At this point, even though you have a node running, it won't be able to restart the VMs because quorum is missing.

So IMO, you can keep the default requirement that quorum needs at least 2 running nodes, but offer whoever wants it the possibility to run on a single node when quorum is not available (without having to punch in the "pvecm expected 1" command each time).
 
> If possible I want to keep the cluster setup intact as it will definitely be reused in the future.
What for? Adding/removing nodes from a Proxmox cluster is trivial; if you're not going to get any benefit from having cluster resources, why bother keeping offline nodes?

BTW, how are you handling cluster storage?
 
> Let's say you have a 2-node cluster and, after a power failure, one node dies from, say, a power spike, while the second one survives and manages to restart. At this point, even though you have a node running, it won't be able to restart the VMs because quorum is missing.

With only two nodes in a cluster, you have to set up a QDevice [1], which @Dunuin already mentioned above.
This is a requirement for a properly working two-node cluster, all the more so (but not only) with HA.

[1] https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_corosync_external_vote_support
 
