[SOLVED] Proxmox 4.4 CEPH Drive Replacement Procedure

mjoconr

Hi All

What is the procedure to replace a drive in a CEPH Proxmox cluster? I have a drive that is dying.

Thanks
Mike
 
I do not have any spare places to put a drive, so I assume the best option would be to set the weight to 0 and then remove the OSD and the drive once it's empty?
 
Yes
 
So I was able to remove the drive, but now I have an issue related to the cciss devices used by the controller and Proxmox.
The same error happens in the GUI.

root@blade1:~# ceph-disk zap /dev/cciss/c0d2
Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.

****************************************************************************
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
****************************************************************************
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
Creating new GPT entries.
The operation has completed successfully.
root@blade1:~# pveceph createosd /dev/cciss/c0d2
unable to get device info for 'cciss/c0d2'
root@blade1:~#
 
What steps do I need to take to get the drive added using manual ceph commands? The links given do not give any real answers to the problem.
 
That is for the Luminous release; I'm running Hammer, as indicated. I ran the prepare command, but it failed because of the Ceph config that Proxmox 4.4 supplies, meaning the cluster does not get set.
 
Hi
So I was able to work out how to get the new drive back into CEPH on an HP-based controller using the cciss driver.
This is a quick overview of the steps I took:

Remove data from Dying Drive:
ceph osd crush reweight osd.{ID} 0
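
After the reweight, the data has to migrate off the OSD before it is safe to remove it. A minimal sketch of waiting for that, assuming the cluster returns to HEALTH_OK once rebalancing finishes (it will not if there are unrelated warnings); the function name and the default interval are made up for illustration:

```shell
# Poll `ceph health` until the cluster reports HEALTH_OK.
# $1: seconds between polls (hypothetical default of 30).
wait_for_health_ok() {
  interval="${1:-30}"
  while true; do
    status="$(ceph health)"
    case "$status" in
      HEALTH_OK*)
        echo "cluster healthy"
        return 0
        ;;
    esac
    # Still rebalancing (or some other warning is active): show it and retry.
    echo "waiting: $status"
    sleep "$interval"
  done
}
```

Call it as `wait_for_health_ok 60` and only continue with the removal once it returns.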

Remove Dead/Dying Drive (or use the GUI):
ceph osd out osd.{ID}
systemctl stop ceph-osd@{ID}
pveceph destroyosd {ID}
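
The three removal commands above can be wrapped so the OSD id is only typed once. This is just a sketch: the `run` and `remove_osd` names and the DRY_RUN switch are my own illustration, and by default it only prints the commands instead of running them:

```shell
# Print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Take an OSD out of the cluster, stop its daemon and destroy it.
# $1: numeric OSD id, e.g. 7 for osd.7
remove_osd() {
  id="$1"
  run ceph osd out "osd.$id" || return 1
  run systemctl stop "ceph-osd@$id" || return 1
  run pveceph destroyosd "$id"
}
```

`remove_osd 7` shows what would be run; `DRY_RUN=0 remove_osd 7` actually runs it.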

The next issue was identifying the physical drive, which for me meant using the HP tool 'hpacucli'.
First I got the drive serial number using smartctl:
smartctl -i -d cciss,2 /dev/cciss/c0d2

I then used the serial number to identify the drive's ID on the controller:
hpacucli ctrl slot=3 show config detail

Once I had the ID, I was able to turn on the LED indicator on the drive bay:
hpacucli ctrl slot=3 pd ?:?:? modify led=on
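
Matching the serial number by eye in the hpacucli output is tedious, so here is a rough sketch of doing it with awk. It assumes smartctl prints a "Serial Number:" (or "Serial number:") line and that each drive in the hpacucli detail output starts with a "physicaldrive <id>" line followed by a "Serial Number: <serial>" line; check that against your own output first. Both function names are made up:

```shell
# Read `smartctl -i` output on stdin and print the serial number.
serial_from_smartctl() {
  awk -F': *' 'tolower($1) ~ /^serial number$/ { print $2; exit }'
}

# Read `hpacucli ... show config detail` output on stdin and print the
# physicaldrive id whose "Serial Number:" line matches $1.
drive_for_serial() {
  awk -v serial="$1" '
    $1 == "physicaldrive" { current = $2 }
    /Serial Number:/ && $NF == serial { print current; exit }
  '
}
```

Usage would look something like this, with the printed id then going in place of ?:?:? in the led=on command:

serial="$(smartctl -i -d cciss,2 /dev/cciss/c0d2 | serial_from_smartctl)"
hpacucli ctrl slot=3 show config detail | drive_for_serial "$serial"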

Once I had re-inserted the drive I used hpacucli to tell the controller to use the drive.

Proxmox 4.4 does not handle cciss device names (pveceph createosd fails with "unable to get device info"), so I used ceph-disk directly to add the drive back into CEPH:
ceph-disk zap /dev/cciss/c0d?
ceph-disk prepare --cluster ceph --fs-type xfs /dev/cciss/c0d?

Cheers
Mike
 