Updating nodes with Ceph

ethaniel86

Member
Oct 6, 2015
Is there a KB we can refer to on how to properly update a cluster with Ceph storage? Do we update nodes one at a time, or can we do it in parallel? Thanks!

We are on Proxmox 5.3, currently updating to the latest release, 5.4.
 
In general I suggest upgrading one node at a time, and moving all containers and VMs off that node for the time being. Especially when it comes to kernel upgrades you want to reboot anyway. :)

Try to move VMs and containers to a node that has the same or a newer stack, not the other way round, to avoid potential side effects. There shouldn't be any between 5.3 and 5.4, but this is general advice.
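
For reference, migrating guests from the CLI would look roughly like this (just a sketch; the VMID/CTID 101/201 and the target node name "node2" are placeholders, and containers have to be stopped for a plain offline migration):

Code:
# Live-migrate a running VM to another node (placeholder ID and node name)
qm migrate 101 node2 --online

# Migrate a stopped container to another node
pct migrate 201 node2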
 
Alright, thanks. Are there any known issues with cloud-init and 5.4? Most of our OS templates depend on cloud-init for provisioning.
 
We are not aware of any new issues with cloud-init on 5.4. Some things have been fixed, but if it worked for you on 5.3, it is expected to work the same on 5.4.
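
If you want to be extra sure after the upgrade, a quick smoke test of a cloud-init template could look like this (a sketch; the template VMID 9000, the test VMID 9999 and the DHCP network config are assumptions, adjust to your setup):

Code:
# Clone a cloud-init template, set minimal cloud-init options and boot it
qm clone 9000 9999 --name ci-test
qm set 9999 --ciuser test --ipconfig0 ip=dhcp
qm start 9999

# Inspect the resulting configuration if something looks off
qm config 9999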
 
In general I suggest upgrading one node at a time, and moving all containers and VMs off that node for the time being. Especially when it comes to kernel upgrades you want to reboot anyway. :)

Try to move VMs and containers to a node that has the same or a newer stack, not the other way round, to avoid potential side effects. There shouldn't be any between 5.3 and 5.4, but this is general advice.

I realize the above advice has been the "official" response and is consistent across staff replies. However, I'm concerned about "unnecessary" rebalancing of PGs triggered by rebooting a node (and related side effects). I would like to propose the following for comment by staff and members with more extensive Ceph experience:

Code:
# Node maintenance

# Stop and wait for scrub and deep-scrub operations to finish
ceph osd set noscrub
ceph osd set nodeep-scrub
ceph status

# Put the cluster into maintenance mode
ceph osd set norecover
ceph osd set nobackfill
ceph osd set norebalance
ceph osd set noout

# For each node 1..N:

    # Migrate VMs and CTs off the node (GUI or CLI)

    # Run the updates
    apt update && pveupgrade

    # Determine which OSDs run on this node and verify they are OK to stop
    systemctl status 'system-ceph\x2dosd.slice'
    ceph osd ok-to-stop <id> [<ids>...]

    shutdown -r now

    # Wait for the node to come back online and rejoin the quorum,
    # then continue with the next node

# Restore normal operation
ceph osd unset noout
ceph osd unset norebalance
ceph osd unset nobackfill
ceph osd unset norecover

# Once all PGs are active again, re-enable scrub and deep-scrub operations
ceph status
ceph osd unset noscrub
ceph osd unset nodeep-scrub

# Done
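
As an optional sanity check before clearing the scrub flags, the currently set flags and the PG state can be confirmed like this (a sketch):

Code:
# Show which cluster flags are still set
ceph osd dump | grep flags

# Summary of PG states; all PGs should be active before unsetting the flags
ceph pg stat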

Thanks
 
@RokaKen Normally it's enough to set noout; for scrubbing, you should define time ranges for the night or other times when there is not much traffic.

On my setup I only set noout (which I keep set in general because of the cluster size) and have had no problems with this to date. There has been no trouble in the cluster; if you have a correct crushmap, there should be no reason for problems. And if you have problems during planned maintenance, what will you do when a node fails?
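
For the scrub time ranges mentioned above, something along these lines could work (a sketch; the hours are only an example, and since injectargs changes are not persistent you would also add the options to the [osd] section of ceph.conf):

Code:
# Restrict (deep-)scrubs to a nightly window, here 23:00-06:00 (example values)
ceph tell osd.* injectargs '--osd_scrub_begin_hour 23 --osd_scrub_end_hour 6'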
 
@RokaKen Normally it's enough to set noout; for scrubbing, you should define time ranges for the night or other times when there is not much traffic.

Thanks @sb-jw. I was considering limiting the scrub time ranges, but those ranges would also coincide with planned maintenance windows most of the time. As for the rest, I didn't have any specific problem; it was more my perception of unnecessary churn in the cluster from dropping multiple OSDs during a node reboot. I realize that most options beyond 'noout' are overkill, but they might provoke discussion. Node failure scenarios should probably be a separate thread/topic.
 
The default timeout before recovery starts is 600 seconds. If the node boots faster than that, it shouldn't trigger a rebalance, but every new object written in the meantime will either be distributed to other nodes or be copied over once the node is back up. In general, some data movement is always to be expected. But as @sb-jw said, it should not be a problem; otherwise there is an issue with the cluster itself.
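
That timeout is the mon_osd_down_out_interval option; if you want to verify what your monitors are actually using, something like this works via the admin socket (a sketch; run it on a monitor node, and the mon ID is usually the node's hostname on Proxmox):

Code:
# Query the effective value from the local monitor's admin socket
ceph daemon mon.$(hostname -s) config get mon_osd_down_out_interval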

EDIT: Of course, setting noout is a good measure to take.
 
