Shutdown of the Hyper-Converged Cluster (CEPH)

albert_a

Well-Known Member
Mar 22, 2018
Hi,

Can someone explain how to shut down a hyper-converged cluster properly?
I suppose the steps should be as follows:

1. Shut down all VMs on every node
2. Set the following flags:
# ceph osd set noout
# ceph osd set nobackfill
# ceph osd set norecover
3. Shut down the nodes only after all VMs on all nodes have been shut down.
4. Unset the Ceph flags once all the nodes have booted up.
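
The flag handling in steps 2 and 4 can be sketched as a pair of shell helpers. This is a minimal sketch, not a tested shutdown procedure: the function names and the CEPH_CMD override are hypothetical, and only the flags and the `ceph osd set` / `ceph osd unset` commands themselves are taken as given.

```shell
#!/bin/sh
# Sketch only: helper names and CEPH_CMD are hypothetical conveniences.
# The flags are the three named in step 2.
CEPH_CMD="${CEPH_CMD:-ceph}"            # override for a dry run, e.g. CEPH_CMD=echo
MAINT_FLAGS="noout nobackfill norecover"

# Step 2: set every maintenance flag; stop at the first failure.
set_maintenance_flags() {
    for flag in $MAINT_FLAGS; do
        $CEPH_CMD osd set "$flag" || return 1
    done
}

# Step 4: clear the same flags once all nodes and OSDs are back up.
clear_maintenance_flags() {
    for flag in $MAINT_FLAGS; do
        $CEPH_CMD osd unset "$flag" || return 1
    done
}
```

Calling set_maintenance_flags before powering off and clear_maintenance_flags after the cluster is healthy again mirrors the sequence above. Setting CEPH_CMD=echo first prints the commands instead of running them.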

If this sequence is correct, then I have a second question.
What is the proper way to perform step 1? How do I shut down both HA-managed and non-HA-managed VMs on a node?

- Stopping pve-manager is not allowed:
# systemctl stop pve-manager
Failed to stop pve-manager.service: Operation refused, unit pve-guests.service may be requested by dependency only (it is configured to refuse manual start/stop).
See system logs and 'systemctl status pve-manager.service' for details.

- Stopping via pvesh only affects non-HA VMs:
# pvesh create /nodes/localhost/stopall

Best regards,
Albert
 
Well, you usually don't shut down the whole cluster, especially since you have/want HA. :)

# ceph osd set noout
I don't recommend setting this, since all nodes will boot again and may or may not start properly.

What is the proper way to perform step 1? How do I shut down both HA-managed and non-HA-managed VMs on a node?
You can use freeze as the shutdown policy, so services don't move.
https://pve.proxmox.com/pve-docs/chapter-ha-manager.html#_node_maintenance
 
I don't recommend setting this, since all nodes will boot again and may or may not start properly.
Thanks for the advice; it might be reasonable in some circumstances. Currently I have to set it even if some nodes fail to start, and I remove the flag only during periods of minimum cluster load.
Well, you usually don't shut down the whole cluster, especially since you have/want HA. :)
You are not serious, are you? There are numerous reasons why you may need to shut down a cluster: force majeure, security, maintenance, energy saving, staff negligence, and more, not to mention power outages at small companies.
You can use freeze as the shutdown policy, so services don't move.
https://pve.proxmox.com/pve-docs/chapter-ha-manager.html#_node_maintenance
I can. But how is it related to my questions?

OK, given that nobody has answered and judging by my own searches, it's clear that Proxmox does NOT support hyper-converged clusters natively, although they are mentioned in the manual. I think some scripting is required to make Proxmox support them.
 
Thanks for the advice; it might be reasonable in some circumstances. Currently I have to set it even if some nodes fail to start, and I remove the flag only during periods of minimum cluster load.
Even with a whole cluster shutdown/boot up, the OSD recovery load should not impact the operation of the cluster.

I can. But how is it related to my questions?
Your question was:
How do I shut down both HA-managed and non-HA-managed VMs on a node?
With the shutdown policy set to freeze, services (VM/CT) will not move to other nodes while you shut down all nodes in a cluster. All VMs/CTs will be shut down as well (as long as the ACPI call or the guest agent works). You can also trigger a bulk action before the shutdown.
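
Before relying on the freeze policy in a shutdown script, it may help to check that it is actually configured. The sketch below assumes the policy is stored as an `ha: shutdown_policy=freeze` line in /etc/pve/datacenter.cfg, as described in the linked HA manager documentation. The function name and the DATACENTER_CFG override are hypothetical.

```shell
#!/bin/sh
# Sketch only: the function name and DATACENTER_CFG override are hypothetical.
# Assumes the HA policy is stored as "ha: shutdown_policy=<value>".
DATACENTER_CFG="${DATACENTER_CFG:-/etc/pve/datacenter.cfg}"

# Succeeds (exit status 0) only if the HA shutdown policy is set to freeze.
ha_policy_is_freeze() {
    grep -Eq '^ha:.*shutdown_policy=freeze([,[:space:]]|$)' "$DATACENTER_CFG"
}
```

A script could then refuse to continue unless the check passes, for example `ha_policy_is_freeze || { echo "shutdown_policy is not freeze" >&2; exit 1; }`.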

OK, given that nobody has answered and judging by my own searches, it's clear that Proxmox does NOT support hyper-converged clusters natively, although they are mentioned in the manual. I think some scripting is required to make Proxmox support them.
What are your specifics that say otherwise? Running Proxmox VE + Ceph is hyper-converged.
 
Well, you usually don't shut down the whole cluster, especially since you have/want HA. :)

I have done some research, but I am still confused about how to turn off a Proxmox VE cluster with Ceph safely and without race conditions,
from a script that runs when the UPS battery is low.

There has to be a better answer than "never shut down the whole cluster".

Is it as simple as setting the VM migration policy to "freeze" and then running "shutdown" on each host node?
Or will that trigger a quorum race condition if not all nodes shut down before they notice they have lost quorum?

PS: backlink to a reddit thread that I also found interesting: https://www.reddit.com/r/homelab/comments/5rb6vi/cluster_shutdown_script_for_pve_ceph_cluster/
There the author stops pve-manager.service, whereas Albert reported above that this fails with a "Failed to stop pve-manager.service:" error.
 
Hi Albert,

I now have the same question because we are moving into a new building, so the PVE cluster has to be shut down and booted again at the new location.

So I would like to ask: was your solution successful?

Best regards,
Martin Bork
 
Well, this seems a bit overblown for a hyper-converged setup, since shutting down the nodes automatically shuts down the Ceph services as well.
In my opinion it should be enough to shut down all VMs and reboot every node.
 
Thank you for your response, ph0x and PayStaion.

But the question is about shutting down and later booting all nodes at the same time: shutting them down one by one and then physically moving them all together.

Shutting down and booting again one node after the other, for kernel updates, is easy.

Also the definition:

  • Shutdown your service nodes one by one
  • Shutdown your OSD nodes one by one
  • Shutdown your monitor nodes one by one
  • Shutdown your admin node

In Proxmox, which nodes are the admin and service nodes, if any?

Monitors are clear, and OSDs are no problem. To me it seems that this question needs a clear answer, and the Proxmox team should provide clarity.
 
OP's originally stated method is the best method. If you don't set the Ceph flags, the cluster will begin to rebalance as soon as OSDs are marked out. If more than five minutes pass between the first and the last node shutdown, or similarly on restart, you will be dealing with an unnecessary amount of rebalancing. Granted, this depends on the Ceph failure domain in your CRUSH map, but it defaults to the node (host) when using Proxmox. Once all nodes are back up and the cluster is stable, clear the Ceph flags. Only a very small amount of backfilling and rebalancing will then occur.

Caveat: This assumes 3x replicated pools and not EC. You may be in for a totally different animal with EC pools.
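
Since the whole point is that the flags are in place before the first node goes down, a script can verify them first. The sketch below assumes that `ceph osd dump` prints a line of the form `flags noout,nobackfill,...`. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: the function name is hypothetical.
# Reads `ceph osd dump` output on stdin and succeeds if the flag named in $1
# appears in the comma-separated list on the "flags" line.
osd_flag_is_set() {
    awk -v want="$1" '
        $1 == "flags" {
            n = split($2, f, ",")
            for (i = 1; i <= n; i++) if (f[i] == want) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

Typical use would be `ceph osd dump | osd_flag_is_set noout || exit 1`, repeated for nobackfill and norecover, before any node is powered off.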
 
Thanks. But Alwin said about "ceph osd set noout": I don't recommend setting this, since all nodes will boot again and may or may not start properly.

So, do I need to set it or not?

Thanks
 
Based on guidance from Ceph and experience, this is the best method. If you do not set the Ceph flags and a node does not boot properly within five minutes (this is a tunable which could be changed from the default), the OSDs on that node will be marked out. The rest of the Ceph cluster will begin to rebalance. When the last node finally comes back online, you will have a large amount of unnecessary rebalancing occurring. The OSDs didn't fail and they have good data, but Ceph doesn't know they were unavailable as opposed to truly failed. Using flags is the Ceph method for performing maintenance. You don't "have" to use Ceph flags, but not doing so will trigger lots of data movement you can avoid.
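
The tunable mentioned here is, to my knowledge, mon_osd_down_out_interval (in seconds). As a rough sketch, a script could compare a planned outage against it. This assumes a Ceph release with the centralized `ceph config get` command. The function name and the CEPH_CMD override are hypothetical.

```shell
#!/bin/sh
# Sketch only: function name and CEPH_CMD are hypothetical; assumes
# `ceph config get mon mon_osd_down_out_interval` prints a number of seconds.
CEPH_CMD="${CEPH_CMD:-ceph}"

# Succeeds if an outage of $1 seconds is longer than the down-to-out interval,
# i.e. OSDs would be marked out unless noout is set. Returns 2 on query failure.
outage_exceeds_out_interval() {
    interval="$($CEPH_CMD config get mon mon_osd_down_out_interval)" || return 2
    interval="${interval%%.*}"          # drop any fractional part
    [ "$1" -gt "$interval" ]
}
```

For a relocation that takes hours the check will always succeed, which is exactly the case where skipping the flags leads to the data movement described above.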
 
Ok, perfect. Thanks for your time.
 
Usually you can't shut down VMs because HA will automatically move them to a different node. So how do you shut them down properly with HA enabled?
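
One possible approach, not confirmed anywhere in this thread, is to ask the HA manager itself to stop each service with `ha-manager set <sid> --state stopped` rather than stopping the guest directly. The sketch below assumes `ha-manager status` prints lines such as `service vm:100 (pve1, started)`. The function names and the HA_MANAGER override are hypothetical.

```shell
#!/bin/sh
# Sketch only: function names and HA_MANAGER are hypothetical; the parsing
# assumes status lines of the form "service <sid> (<node>, <state>)".
HA_MANAGER="${HA_MANAGER:-ha-manager}"  # override for a dry run, e.g. HA_MANAGER=echo

# Reads `ha-manager status` on stdin and prints the service IDs,
# optionally limited to the node given as $1.
list_ha_services() {
    awk -v node="$1" '
        $1 == "service" && (node == "" || $3 == "(" node ",") { print $2 }
    '
}

# Reads service IDs on stdin and requests the stopped state for each one.
stop_ha_services() {
    while read -r sid; do
        $HA_MANAGER set "$sid" --state stopped || return 1
    done
}
```

A pipeline such as `ha-manager status | list_ha_services | stop_ha_services` would cover the HA-managed guests, with `pvesh create /nodes/localhost/stopall` handling the rest. The requested state is persistent, so each service has to be set back to started after the cluster is up again.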
 
