updating nodes with ceph

ethaniel86

New Member
Oct 6, 2015
Is there a KB we can refer to on how to properly update a cluster with Ceph storage? Do we update nodes one at a time, or can we do it in parallel? Thanks!

We are on Proxmox 5.3, currently updating to the latest release, 5.4.
 

rhonda

Proxmox Staff Member
Sep 3, 2018
In general I suggest upgrading one node at a time, and moving all containers and VMs off that node for the time being. Especially when it comes to kernel upgrades you want to reboot anyway. :)

Try to move VMs and containers to a node that has the same or a newer stack, not the other way round, to avoid potential side effects. There shouldn't be any between 5.3 and 5.4, but this is general advice.
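
For reference, a minimal per-node sketch of that workflow on the CLI; VMID 101, CTID 102 and the target node name pve2 are just placeholders, and the GUI migration works equally well:

Code:
# live-migrate a VM off the node that is about to be upgraded
qm migrate 101 pve2 --online

# containers cannot be live-migrated; stop and migrate,
# then start the container again on the target node
pct shutdown 102
pct migrate 102 pve2

# with the node empty, upgrade and reboot it
apt update
apt dist-upgrade
reboot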
 

ethaniel86

New Member
Oct 6, 2015
Alright, thanks. Are there any known issues with cloud-init and 5.4? Most of our OS templates depend on cloud-init for provisioning.
 

rhonda

Proxmox Staff Member
Sep 3, 2018
We are unaware of new issues with 5.4 and cloud-init. Some things have been fixed, but if you used it with 5.3, it is expected to work the same with 5.4 too.
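
If you want a quick sanity check after the upgrade, cloning one template and re-provisioning it is usually enough; the template ID 9000, new ID 150, storage local-lvm and the SSH key path below are only placeholders:

Code:
# clone a cloud-init template and give it a fresh cloud-init drive
qm clone 9000 150 --name ci-test --full
qm set 150 --ide2 local-lvm:cloudinit
qm set 150 --ciuser test --ipconfig0 ip=dhcp --sshkeys ~/.ssh/id_rsa.pub
qm start 150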
 
RokaKen

Oct 11, 2018
In general I suggest upgrading one node at a time, and moving all containers and VMs off that node for the time being. Especially when it comes to kernel upgrades you want to reboot anyway. :)

Try to move VMs and containers to a node that has the same or a newer stack, not the other way round, to avoid potential side effects. There shouldn't be any between 5.3 and 5.4, but this is general advice.
I realize the above advice has been the "official" response and is consistent among staff responses. However, I'm concerned about "unnecessary" rebalancing of PGs triggered by rebooting a node (and related side effects). I would like to propose the following for comment by staff and members with more extensive experience with Ceph:

Code:
# Node maintenance

# stop and wait for scrub and deep-scrub operations
ceph osd set noscrub
ceph osd set nodeep-scrub
ceph status

# set cluster in maintenance mode with:
ceph osd set norecover
ceph osd set nobackfill
ceph osd set norebalance
ceph osd set noout

# for node 1..N

# migrate VMs and CTs off the node (GUI or CLI)

# run updates
apt update && pveupgrade

# determine which OSDs are on the node, and verify it is OK to reboot
systemctl status system-ceph\\x2dosd.slice
ceph osd ok-to-stop <id> [<ids>...]

shutdown -r now

# wait for node to come back online and quorate

# next N

# restore
ceph osd unset noout
ceph osd unset norebalance
ceph osd unset nobackfill
ceph osd unset norecover

# when all the PGs are active, re-enable the scrub and deep-scrub operations
ceph status
ceph osd unset noscrub
ceph osd unset nodeep-scrub

# done
Thanks
 

sb-jw

Active Member
Jan 23, 2018
@RokaKen Normally it's enough to set noout. For scrubbing, you should define some time ranges for the night or other times when there is not much traffic.

On my setup I only set noout (which I have set permanently because of the cluster size) and haven't had any problems with it up to today. There was no trouble in the cluster; if you have a correct crush map, there should be no reason for problems. And if you have problems during planned maintenance, what are you going to do when a node fails?
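
For the scrub time ranges, something along these lines should work on a Luminous cluster; the 01:00-06:00 window is only an example:

Code:
# /etc/pve/ceph.conf, [osd] section:
#   osd scrub begin hour = 1
#   osd scrub end hour = 6

# apply to the running OSDs without restarting them
ceph tell osd.* injectargs '--osd_scrub_begin_hour 1 --osd_scrub_end_hour 6'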
 
RokaKen

Oct 11, 2018
@RokaKen Normally it's enough to set noout. For scrubbing, you should define some time ranges for the night or other times when there is not much traffic.
Thanks @sb-jw. I was considering limiting the scrub time ranges, but those ranges will also coincide with planned maintenance windows most of the time. As for the rest, I didn't have any specific problem -- it was more my perception of unnecessary churn in the cluster from dropping multiple OSDs during a node reboot. I realize that most options beyond 'noout' are overkill, but they might provoke discussion. Node failure scenarios should probably be a different thread/topic.
 

Alwin

Proxmox Staff Member
Aug 1, 2017
The default timeout before recovery starts is 600 seconds. If the node boots faster, it shouldn't trigger a rebalance, but every new object written will either be distributed to other nodes or will be copied once the node is back up. In general, some data movement is always to be expected. But as @sb-jw said, it should not be a problem; otherwise there is an issue with the cluster per se.

EDIT: Of course, setting noout is a good measure.
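
If you want to check or tune that window, the option is mon_osd_down_out_interval; a small sketch, assuming you run it on a monitor node whose mon ID matches the hostname (the default on PVE):

Code:
# show the current value via the monitor's admin socket
ceph daemon mon.$(hostname -s) config get mon_osd_down_out_interval

# or simply prevent OSDs from being marked out during the reboot
ceph osd set noout
# ... reboot the node ...
ceph osd unset noout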
 
