PVE 6 Ceph OSD setup fails, lsblk not working

Ozlu28

New Member
Apr 12, 2021
Hello all,
I have been using Proxmox for well over 7 years now. Really love the product!
Normally I can find the solution to my issues by searching this forum, but this one seems to be uncommon, as I can't find any details on it.

I currently have a 6-node cluster running 4 compute nodes and 2 new R720s, installed to start the switch to Ceph storage (going to be 4 as I migrate).

They are both fresh installs of PVE and I just moved them into the cluster. I installed Ceph on both of them, and when I go to set up an OSD I get the following error:


command 'ceph-volume lvm create --cluster-fsid 3bac8555-d9ba-4064-a4b0-460ef1f76605 --crush-device-class ssd --data /dev/sdc' failed: exit code 1

Typing lsblk in the shell, I get this error:
lsblk: /lib/x86_64-linux-gnu/libsmartcols.so.1: version `SMARTCOLS_2.34' not found (required by lsblk)

All other nodes are up to date and have no issues. Even the one I installed Ceph on to be node 3 has no problem with lsblk. (That one is in an HP blade chassis.)

Both servers are similar: both are R720s and both have a Chelsio T580 40GbE.

Now, I had a hell of a time getting these T580s to even be recognized, so I am wondering if my efforts to install the unified drivers may have somehow messed up libsmartcols.so.1.

I tried to run the install and it says it's already installed; uninstalling it would wreck the server.

I also tried copying the libsmartcols file from the working Ceph server to that server, and the same error comes up.

Any help would be much appreciated. Thank you!
 
First off, don't create a Ceph cluster with only 2 nodes. You need at least 3 nodes for a working Ceph cluster.

Are you using these Chelsio NICs to access the disks? Did the Chelsio driver touch the libsmartcols.so.1 file?

If you want to get a working version on those systems again, you first need to check which package the file belongs to:
Code:
dpkg -S /lib/x86_64-linux-gnu/libsmartcols.so.1

This should report that it belongs to libsmartcols1. You can then reinstall the package with
Code:
apt install --reinstall libsmartcols1
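
Since the error says lsblk wants the symbol version SMARTCOLS_2.34 and the library it loads does not provide it, it may also help to check what the library file on disk actually contains. Below is a minimal sketch, not a definitive diagnosis; has_symbol_version is a hypothetical helper name made up for illustration, and the path and version string are simply copied from the error message above.

```shell
#!/bin/sh
# Sketch only: has_symbol_version is a hypothetical helper, not part of any package.
# It returns 0 if the given file contains the given version string, 1 otherwise.
has_symbol_version() {
    lib="$1"
    version="$2"
    # -q: no output, only the exit status; -F: match the string literally.
    # LC_ALL=C avoids locale-dependent handling of the binary data.
    LC_ALL=C grep -qF -- "$version" "$lib"
}

# Example on the affected node (path and version taken from the error message):
#   has_symbol_version /lib/x86_64-linux-gnu/libsmartcols.so.1 SMARTCOLS_2.34 \
#       && echo "version string present" || echo "version string missing"
```

If the string is missing, the library is older than what the lsblk binary was built against. In that case it is worth looking at which copy of the library the loader resolves with ldd "$(command -v lsblk)", and at which package owns the lsblk binary itself with dpkg -S "$(command -v lsblk)", since the mismatch could come from either side.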
 
Thank you for your help! I tried the reinstall command and still no good. I have managed to get one of the nodes back online with Ceph installed after a full reinstall.

I made the mistake of reinstalling the node with the Ceph manager on it, and it really made for a fun time trying to clear my Ceph cluster so I could reinstall Ceph, haha, man oh man.

Funny enough, after the reinstall the T580 was recognized right off the bat with no need for any driver install. I wonder if I will be so lucky on the next one.

As for the number of Ceph nodes: I have a total of 3 coming online now, with a 4th switching over once I migrate the data off the FreeNAS R720 so I can install Proxmox and make it a node. I hope to have 5 or 6.

Thanks again for your response!
 
