[SOLVED] ceph replace disk and xfs fsck

liska_

Member
Nov 19, 2013
Hi,
smartd warned me about some problems with the drives I use for Ceph, so I have two questions.
1] Is there any special fsck command to check a Ceph XFS filesystem, or would I just use fsck.xfs /dev/sdX ?
2] What is the correct way to replace a drive, particularly one that is not completely dead yet, as in my case?
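On the first question: fsck.xfs is deliberately a no-op stub (XFS has no traditional fsck); the usual way to check an XFS filesystem is xfs_repair against an unmounted device. A minimal sketch, assuming the OSD's data partition is /dev/sdX1 and its mount point follows the default layout (both are placeholders for your setup):

```shell
# Stop the OSD first, then unmount its data partition
# (osd id and paths are placeholders, adjust to your setup)
umount /var/lib/ceph/osd/ceph-15

# Dry run: -n reports problems without modifying the filesystem
xfs_repair -n /dev/sdX1

# Only if the dry run finds fixable damage, run a real repair:
# xfs_repair /dev/sdX1
```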

I have found some info on adding and removing OSDs at http://ceph.com/docs/master/rados/operations/add-or-rm-osds/ but maybe there is a better way to replace a drive easily, as this must happen quite often.
I have also read about recommended settings in this thread http://forum.proxmox.com/threads/20271-Ceph-High-I-O-wait-on-OSD-add-remove - is this still valid?

I hope someone has more experience dealing with failing drives and can share some info about it. Thanks a lot in advance.
 
This was my solution to replace that drive, as I got some inconsistent pgs and a scrub error (it was not possible to fix them with the repair or deep-scrub commands):
ceph osd out osd.15 --- wait for the data to move (watch via ceph -w); some active+clean+inconsistent pgs remain
/etc/init.d/ceph stop osd.15
ceph osd crush remove osd.15
ceph auth del osd.15
ceph osd rm osd.15
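As a side note, on newer Ceph releases (Luminous and later) the removal steps above can be condensed; a hedged sketch, assuming a systemd-based install and the same osd id:

```shell
# Mark the OSD out and wait for data migration (watch with: ceph -w)
ceph osd out osd.15

# Stop the daemon (systemd unit name on current installs)
systemctl stop ceph-osd@15

# purge removes the OSD from the CRUSH map, deletes its auth key,
# and removes it from the osd map in one step
ceph osd purge osd.15 --yes-i-really-mean-it
```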
and now I replaced the failed drive; the new one was assigned the letter X
ceph-disk zap /dev/sdx
pveceph createosd /dev/sdx --- it should get the same osd number
ceph osd out osd.15 --- I need to place the osd in the correct host bucket
/etc/init.d/ceph stop osd.15
ceph osd crush add osd.15 0.14 host=cl3 --- correct weight and host
/etc/init.d/ceph start osd.15
ceph osd in osd.15
now it started to rebuild, but some inconsistent pgs remained, so it was necessary to repair them
ceph pg dump | grep inconsistent --- get the pg name
ceph pg repair 4.306
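The last two steps can be combined into one filter; a small sketch using illustrative stand-in text (not real cluster output) in place of the live `ceph pg dump`:

```shell
# Illustrative stand-in for `ceph pg dump` output (pg id, state);
# the ids and states here are made up for the example
pg_dump_sample='4.306 active+clean+inconsistent
4.2f1 active+clean
4.1aa active+clean+inconsistent'

# Print the id (first column) of every pg whose state contains "inconsistent";
# against a live cluster: ceph pg dump | awk '/inconsistent/ {print $1}'
echo "$pg_dump_sample" | awk '/inconsistent/ {print $1}'
# Each reported id can then be passed to: ceph pg repair <pgid>
```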

Disk replaced and ceph health is ok again.