VM IO hiccup when restarting Ceph OSDs

mohnewald · Apr 9, 2021

Hello,

I run Proxmox with Ceph. When doing maintenance (updates, network changes, etc.) I sometimes need to restart the OSDs.

There are no VMs on the node where I do the maintenance.

I think I read somewhere that with replica/size 3, one copy of each file/block(?) is always the primary one, which is the copy accessed for writes and reads.

Now if the primary copy is on the OSD I restart, IO will hang until that OSD is back up or until the primary role switches to the placement on another OSD. Is that correct so far?

This causes an IO hiccup on my VMs. VMs using virtio (vdX) do not seem to care, but older VMs with sdX devices do not like it at all: they show IO errors in dmesg.

Any hints on this?

I use the following to put the cluster into "maintenance mode":

ceph osd set noout
ceph osd set nobackfill
ceph osd set norecover
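
For completeness, here is a minimal sketch of the matching cleanup once the OSDs are back up. The helper names `ceph_cmd` and `clear_maintenance_flags` and the `RUN` switch are made up for illustration; by default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: clear the three maintenance flags again after the OSDs are back.
# ceph_cmd, clear_maintenance_flags and RUN are hypothetical names, not part of Ceph.
# Dry run by default; set RUN=1 to actually call the ceph CLI.

ceph_cmd() {
    if [ "${RUN:-0}" = "1" ]; then
        ceph "$@"
    else
        echo "ceph $*"
    fi
}

clear_maintenance_flags() {
    # Unset in the reverse order of how the flags were set above.
    ceph_cmd osd unset norecover
    ceph_cmd osd unset nobackfill
    ceph_cmd osd unset noout
}

clear_maintenance_flags
```

Running `ceph -s` afterwards should show whether any of the flags are still set.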

Cheers,
Michael

mohnewald · Apr 9, 2021

What do you think about these steps for a smooth upgrade/maintenance:

1.) set noout first
2.) set the primary-affinity to 0 on the affected OSDs ( https://ceph.io/geen-categorie/ceph-primary-affinity/ )
3.) do the maintenance (network changes, upgrades or whatever)
4.) set the primary-affinity back to 1
5.) unset noout
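
The steps above could be sketched roughly like this. The OSD ids in `OSDS` are placeholders, and the helper names (`ceph_cmd`, `pre_maintenance`, `post_maintenance`) and the `RUN` switch are made up for illustration. By default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch of steps 1-5. The OSD ids below are placeholders (assumption),
# replace them with the OSDs on the node under maintenance.
# Dry run by default; set RUN=1 to actually call the ceph CLI.
OSDS="${OSDS:-0 1}"

ceph_cmd() {
    if [ "${RUN:-0}" = "1" ]; then
        ceph "$@"
    else
        echo "ceph $*"
    fi
}

pre_maintenance() {
    # Step 1: keep stopped OSDs from being marked out.
    ceph_cmd osd set noout
    # Step 2: stop the affected OSDs from being chosen as primary.
    for id in $OSDS; do
        ceph_cmd osd primary-affinity "osd.$id" 0
    done
}

post_maintenance() {
    # Step 4: restore the default primary affinity.
    for id in $OSDS; do
        ceph_cmd osd primary-affinity "osd.$id" 1
    done
    # Step 5: allow OSDs to be marked out again.
    ceph_cmd osd unset noout
}

pre_maintenance
# Step 3: do the maintenance here (restart OSDs, upgrades, network changes).
post_maintenance
```

It is probably worth waiting a moment after changing the primary affinity and checking `ceph -s` before restarting anything, so that the primaries have had time to move.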
