[SOLVED] Can't seem to add any more OSDs

cxgl

Mar 31, 2019
Hi all,

I have a small cluster of 5 PVE nodes with 14 OSDs.

I added a drive to one of the PVE nodes (non-HBA, so I had to set noout, shut down the system, add the drive to the irritating controller as a RAID0, then reboot). The OS itself sees the drive, and I can run pveceph createosd (which reports success at every step), but _nothing_ is created. No new osd in /var/lib/ceph/osd/.

I figured it was some oddity of the drive controller, even though three other drives are on it. So I just stood up another node (the fifth), added it to the cluster, and attempted to add an osd. It *said* it completed successfully, but again -- no osd, and nothing in /var/lib/ceph/osd on this new host.

I don't see any errors in /var/log/ceph/*.log. I'm at a loss.

What should I be looking for?

Thanks in advance.
 
"(non-HBA, so I had to set noout, shut down the system, add the drive to the irritating controller as a RAID0, then reboot)"
This is trouble waiting to happen. I strongly recommend an HBA (more info in the link).
https://pve.proxmox.com/pve-docs/chapter-pveceph.html#_precondition

Besides that, another thing that comes to mind: check 'ceph auth list' to see if there are OSD auth keys for which no OSD exists. If so, remove them and try again.
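
For anyone wanting to script that check, here is a minimal sketch. It assumes 'ceph auth list' prints each entity name (e.g. osd.8) on a line of its own and 'ceph osd ls' prints one OSD id per line. The function name stale_osd_keys and the file paths are made up for illustration.

```shell
# stale_osd_keys AUTH_FILE OSD_IDS_FILE
# AUTH_FILE:    saved output of `ceph auth list`
# OSD_IDS_FILE: saved output of `ceph osd ls` (one numeric id per line)
# Prints every osd.N auth entity whose id N is not in OSD_IDS_FILE.
stale_osd_keys() {
    awk '
        # First file: remember every OSD id the cluster knows about.
        FILENAME == ARGV[1] { known[$1] = 1; next }
        # Second file: entity lines look like "osd.8" with nothing else on them.
        NF == 1 && $1 ~ /^osd\.[0-9]+$/ {
            id = substr($1, 5)          # strip the leading "osd."
            if (!(id in known)) print $1
        }
    ' "$2" "$1"
}

# Hypothetical usage on a cluster node (not run here):
#   ceph auth list 2>/dev/null > /tmp/auth.txt
#   ceph osd ls > /tmp/osds.txt
#   stale_osd_keys /tmp/auth.txt /tmp/osds.txt
```

Anything it prints is only a candidate. Confirm the OSD really is gone before deleting its key.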
 
Alwin -- THANK YOU! Solved!

1) Yes -- not using an HBA is really asking for trouble. I agree. Unfortunately, these two Lenovo R630 servers only have an LSI2108 (which according to https://www.ixsystems.com/community...-firmware-for-lsi-megaraid-sas-9260-4i.40093/ does not support flashing to IT mode). Further, I briefly read something about the Lenovo R630 not accepting non-Lenovo-blessed HBA cards. I'm still digging into that, but I'm swapping out these machines with something more friendly like the Dell R7x0 series.

NB: if anyone knows that the lenovo card (05:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)) actually *can* be flashed to IT mode, or if there is a relatively inexpensive HBA card said lenovo _will_ accept, please do educate me!

2) You were absolutely on the right path. ceph auth list showed some items in there, one of which was stale. I ran ceph auth del osd.8 (the stale osd), and was immediately able to ceph-disk zap the disk and recreate that osd. Then magically, the next osd (from the 5th node) showed up in the pve ceph list.
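
In case it helps someone else, the sequence above can be sketched as a small dry-run helper. The three commands are the ones I used; the function name recreate_osd, the RUN variable, and the device path /dev/sdX are placeholders for illustration. By default it only prints the commands so they can be reviewed first.

```shell
# recreate_osd OSD_ID DEVICE
# Prints the cleanup/recreate commands; executes them only when RUN=1.
recreate_osd() {
    osd_id=$1
    dev=$2
    for cmd in \
        "ceph auth del osd.$osd_id" \
        "ceph-disk zap $dev" \
        "pveceph createosd $dev"
    do
        if [ "${RUN:-0}" = "1" ]; then
            # Intentionally unquoted so the string splits into command + args.
            $cmd || return 1
        else
            echo "$cmd"
        fi
    done
}

# Dry run: shows what would be executed for stale osd.8 on a placeholder device.
recreate_osd 8 /dev/sdX
```

Note that ceph-disk zap wipes the target disk, so double-check the device path before setting RUN=1.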

Again, thank you!
 
