[SOLVED] Lost Node in Ceph Cluster

Haider Jarral

I accidentally unplugged the boot hard disk from one node of an 11-node Ceph cluster.

1614064885153.png

The hard disk is gone; there is no way to recover it.

Now my cluster shows 11 nodes, 1 of them in gray.

I know I have to reinstall it from scratch. I just wanted to run this by you to make sure I am on the right path:

1. Remove the existing node from the cluster
>pvecm delnode nodename

2. Install the same version of Proxmox on the lost node

3. Once it's back up, configure its network the same way

4. Add it back to the cluster
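
The steps above can be sketched as a dry-run script. This is only a sketch under assumptions: it prints the commands instead of running them, it uses `pvecm delnode` (the command actually run later in this thread) for the removal, and `nodename` and `192.0.2.10` are hypothetical placeholders for the lost node's name and the IP of a healthy cluster member. It covers only the cluster membership steps listed above, not any Ceph daemons that lived on the lost node.

```shell
#!/bin/sh
# Dry-run sketch: prints the planned commands, executes nothing.
# Both arguments are placeholders supplied by the caller (assumptions).
print_rejoin_plan() {
    node="$1"      # name of the lost node as the cluster knows it
    peer_ip="$2"   # IP of any healthy node already in the cluster

    echo "# step 1, on a healthy node: remove the lost node"
    echo "pvecm delnode $node"
    echo "# steps 2-3: reinstall the same Proxmox version, restore the network config"
    echo "# step 4, on the reinstalled node: join the existing cluster"
    echo "pvecm add $peer_ip"
    echo "# afterwards, on any node: check membership and quorum"
    echo "pvecm status"
}

print_rejoin_plan nodename 192.0.2.10
```

Printing the plan first makes it easy to review the exact commands before touching a production cluster.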

Is that it, or should I also consider something else?

This is a prod environment, so I wanted to make sure I have the right steps jotted down.

Thank you.
 
I got this error trying to delete the node:


pvecm delnode sm3
Could not kill node (error = CS_ERR_NOT_EXIST)
Killing node 9
error during cfs-locked 'file-corosync_conf' operation: command 'corosync-cfgtool -k 9' failed: exit code 1
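
One way to see whether the removal took effect despite the error is to check if the node still has an entry in the corosync configuration. The sketch below is hedged: it assumes the usual Proxmox location `/etc/pve/corosync.conf` and the usual `name: <node>` lines inside the `nodelist` section, and the helper name `node_in_corosync_conf` is made up for illustration. The likely explanation that the kill step fails simply because the node is already offline is also an assumption, not something the output confirms.

```shell
#!/bin/sh
# Hypothetical helper: exit status 0 if a nodelist entry with this name exists,
# 1 if it does not, 2 if the config file cannot be read.
node_in_corosync_conf() {
    name="$1"
    conf="${2:-/etc/pve/corosync.conf}"   # assumed default path
    [ -r "$conf" ] || return 2
    # Match lines of the form "name: <node>" exactly (no regex on the node name).
    awk -v n="$name" '$1 == "name:" && $2 == n { found = 1 } END { exit !found }' "$conf"
}
```

On a cluster node, `node_in_corosync_conf sm3 && echo "still listed" || echo "not listed"` would then report whether `sm3` is still part of the configuration.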
 
How are the nodes presented if you run pvecm status?
 
pvecm status
Cluster information
-------------------
Name: proxmox-cluster
Config Version: 23
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Sun Feb 28 23:20:59 2021
Quorum provider: corosync_votequorum
Nodes: 10
Node ID: 0x00000001
Ring ID: 1.f269
Quorate: Yes

Votequorum information
----------------------
Expected votes: 10
Highest expected: 10
Total votes: 10
Quorum: 6
Flags: Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 192.168.1.11 (local)
0x00000002          1 192.168.1.14
0x00000003          1 192.168.1.10
0x00000004          1 192.168.1.16
0x00000005          1 192.168.1.15
0x00000006          1 192.168.1.17
0x00000007          1 192.168.1.18
0x00000008          1 192.168.1.19
0x0000000a          1 192.168.1.21
0x0000000b          1 192.168.1.13
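
The numbers in this output can be cross-checked with a small sketch. It assumes the majority rule floor(expected votes / 2) + 1, which agrees with the `Quorum: 6` shown for 10 expected votes, and it looks for gaps in the node ID sequence of the membership list. The function names `required_quorum` and `missing_node_ids` are hypothetical helpers, not part of `pvecm`.

```shell
#!/bin/sh
# Majority quorum for a given number of expected votes: floor(n / 2) + 1.
required_quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Reads membership lines on stdin (first field is a hex node ID such as
# 0x0000000a) and prints every ID from 1 to $1 that does not appear.
missing_node_ids() {
    awk -v max="$1" '
        $1 ~ /^0x[0-9a-fA-F]+$/ { seen[tolower($1)] = 1 }
        END {
            for (i = 1; i <= max; i++) {
                id = sprintf("0x%08x", i)
                if (!(id in seen)) print id
            }
        }'
}
```

Feeding the membership list above to `missing_node_ids 11` prints only `0x00000009`, which matches the `Killing node 9` line from the earlier `pvecm delnode sm3` attempt.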