Replace Journal / WAL SSD drive

lweidig

We have a four-node Proxmox cluster, with all of the nodes also providing Ceph storage services. One of the nodes is having issues with the SSD we use as the journal / WAL device (this is 5.1 / Bluestore). We use a command like:

Code:
pveceph createosd /dev/sdc --journal_dev /dev/sdr --wal_dev /dev/sdr

to create each of the OSDs. In this example, /dev/sdc is the mechanical drive and /dev/sdr is the SSD. It is /dev/sdr that needs replacing, and it holds the journal / WAL for several other drives in the setup. All drives are running in hot-swap bays, so we are hoping this can be done cleanly on a running system, but we can of course bring the node down if there is no other option.
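
In case it is useful, here is roughly how the mapping can be checked. I believe pveceph on 5.1 deploys OSDs via ceph-disk, so treat this as a sketch; the device name is just the one from this example:

Code:
# list OSD data partitions together with their block.db / block.wal partitions
ceph-disk list
# quick overview of the partitions on the SSD itself
lsblk /dev/sdr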

We'd appreciate any advice on making this a smooth (and hopefully zero-downtime) replacement.
 
If the '/dev/sdr' drive is still OK, then it might work to clone the drive. If not, then you would need to remove all Bluestore OSDs that use it first and re-create them after the replacement. On re-creation you can skip the wal_dev parameter, as the WAL will be placed on the fastest disk of the OSD (e.g. wal/db -> sdr, data -> sdc).
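
In case it helps, a rough sketch of that cycle, under the assumption that the OSD ID (12) and the device names are just the ones from this thread and that the '--cleanup' option matches your pveceph version (check 'pveceph help destroyosd'):

Code:
# for every OSD that has its DB/WAL on /dev/sdr (osd.12 is a placeholder):
ceph osd out 12
systemctl stop ceph-osd@12
pveceph destroyosd 12 --cleanup
# physically replace the SSD, then re-create each OSD without an explicit wal_dev
pveceph createosd /dev/sdc --journal_dev /dev/sdr
# watch the recovery until the cluster is healthy again
ceph -s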
 

Hi there!

I'm facing the same problem here...
If I do this, i.e. destroy the OSD and re-create it, will I lose data?
Thanks for any help.
 
If I do this, i.e. destroy the OSD and re-create it, will I lose data?
This depends on your setup. If the pool size/min_size is the default 3/2 and the replication is on host level, with at least three hosts, then possibly no.
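
You can check both settings yourself; 'rbd' below is just a placeholder for your pool name:

Code:
# replication settings of the pool
ceph osd pool get rbd size
ceph osd pool get rbd min_size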
 
I'm not sure what this means... What do you mean by "replication is on host level"?
If you didn't change the CRUSH rules, then by default the copies are distributed on host level. This means that copies of the same object will not be placed on the same node.
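
If you want to verify that, the default replicated rule should show 'host' as the failure domain:

Code:
# dump the crush rules; look for "type": "host" in the chooseleaf step
ceph osd crush rule dump
# shows how the OSDs are grouped under the hosts
ceph osd tree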