rejoin a failed node to ceph

selcuk2

New Member
May 19, 2024
Hi everyone,
My env: 3-node PVE ver. 8.1.3 with Ceph ver. 18.2.1

One of my nodes failed due to a boot disk crash. The Ceph disks are healthy. We plugged in a new boot disk and installed Proxmox. For the cluster part, I deleted the node and rejoined it to the cluster. Now the cluster is healthy.

My second task is to rejoin this node to Ceph.
I installed Ceph on this node, but it installed 18.2.4 (the other nodes run 18.2.1). My problems:
Firstly, after installation it did not let me configure anything; it said the config was already done (I interpreted that as a good thing). But...
MONs
In the GUI, the new node's monitor is there but shows "Host is unknown" and I cannot start/stop/destroy it! I also cannot add a new mon with pveceph:

Code:
pveceph mon destroy pve3
monitor filesystem '/var/lib/ceph/mon/ceph-pve3' does not exist on this node
root@pve3:~# pveceph mon create
monitor address '10.10.10.153' already in use

In the GUI it says: mon.pve3 (rank 2) is down (out of quorum), and
"a newer version was installed but old version still running, please restart"
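For anyone hitting the same state, a sketch of the diagnostics that make sense here (pve3 is my node name; adjust for yours):

```shell
# Sketch: see what monitor state actually exists (node name pve3 assumed).
ls /var/lib/ceph/mon/            # is there any local mon data directory at all?
ceph mon stat                    # what does the cluster's monmap say?
systemctl status ceph-mon@pve3   # is a mon unit defined/running on this node?
```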


It also asks me to upgrade the other nodes from 18.2.1 to 18.2.4. How can I do that? Would the following work?
Code:
apt-get update
apt-get install --only-upgrade ceph ceph-common ceph-fuse ceph-mds ceph-volume gdisk nvme-cli

OSDs
In the GUI: "other cluster members use a newer version of this service, please upgrade and restart"

In the GUI I can see the OSDs on my rejoined node, but the OSDs are down and out. I cannot bring them up and running. I also do not see OSD services (systemctl start ceph-osd@...) on my newly rejoined host.

The second thing that comes to mind is to use "pveceph purge" on this node and start from scratch for this node.


How can I proceed?

thank you for your help
 
I have successfully deleted/re-added the mon after deleting /var/lib/ceph/mon/*
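Roughly, the sequence was (a sketch of what I described above; pve3 is my node name, adjust accordingly):

```shell
# Sketch of the mon re-creation sequence (node name pve3 assumed).
systemctl stop ceph-mon@pve3 || true   # stop any half-running mon service
rm -rf /var/lib/ceph/mon/*             # remove the stale local mon data
pveceph mon destroy pve3               # drop the dead mon from the cluster
pveceph mon create                     # create a fresh mon on this node
```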

Now I have a problem with the OSDs. I was able to mark them IN but I am unable to start them. On the new node I see that /var/lib/ceph/osd is empty; the same dir is filled with files on the other nodes.

Can I use these OSDs, or should I remove the current OSDs and re-add them from scratch?

My 2nd question:
How can I upgrade Ceph? My new node is installed with 18.2.4 while the cluster nodes have 18.2.1. I need to upgrade my cluster to 18.2.4.

thanks..
 
this command worked for me
Code:
root@pve3:~# ceph-volume lvm activate --all
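As I understand it, ceph-volume rebuilds the tmpfs mounts under /var/lib/ceph/osd from the LVM metadata on the OSD disks and starts the ceph-osd@&lt;id&gt; units, which is why that directory was empty before. To verify afterwards (a sketch, nothing node-specific):

```shell
ceph-volume lvm list   # OSDs that ceph-volume discovered from LVM metadata
ceph osd tree          # the activated OSDs should now show as up
```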

I have one OSD which has no LVM on it. I think I will destroy it and re-add it to Ceph.
 
I have deleted the OSD and recreated it. It is automatically used by Ceph to redistribute the data.
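The destroy/re-create, as a sketch (the OSD id 5 and device /dev/sdX below are placeholders, substitute your own; the OSD must already be down and out before destroying):

```shell
pveceph osd destroy 5 --cleanup   # remove the OSD and wipe the disk (id 5 is hypothetical)
pveceph osd create /dev/sdX       # create a fresh OSD; Ceph rebalances data onto it
```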

My only question is how I can upgrade Ceph from 18.2.1 -> 18.2.4. This is required for the existing cluster members to be on the same version as the newly added one!

new node's ceph version: 18.2.4
existing nodes' version: 18.2.1

The GUI tells me that I need to upgrade the older ones and restart the OSD services.
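For completeness, a sketch of the per-node sequence as I understand it (not official guidance; run it on one node at a time, mons before OSDs, and wait for the cluster to be healthy again between nodes):

```shell
# On each older node, one node at a time:
apt update
apt full-upgrade                    # pulls in the newer 18.2.x ceph packages
systemctl restart ceph-mon.target   # restart the mon on this node (if any)
systemctl restart ceph-mgr.target   # then the mgr
systemctl restart ceph-osd.target   # then the OSDs on this node
ceph versions                       # confirm all daemons report 18.2.4
```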