Ceph - OSD "problem" - the same OSD ID appears on two nodes

Hi to all!

I have a three-node cluster with Ceph storage. It works perfectly now.
There are three servers: two of them have 8 hard disks (1 for Proxmox, 7 for Ceph storage), one has 2 HDDs.
Now my problem. Normally, an OSD ID exists only once in the whole three-node cluster. In the past I replaced some hard drives and formatted them properly, and now there are two OSDs with the same OSD ID, but only the one on pve2 is shown in the Proxmox GUI.
On pve1 and pve2 there is the same OSD ID 9:

Code:
root@pve1:~# df -h
Filesystem                     Size  Used Avail Use% Mounted on
udev                            10M     0   10M   0% /dev
tmpfs                          1.4G  492K  1.4G   1% /run
/dev/mapper/pve-root            34G  1.5G   31G   5% /
tmpfs                          5.0M     0  5.0M   0% /run/lock
tmpfs                          2.8G   59M  2.7G   3% /run/shm
/dev/mapper/pve-data            73G  180M   73G   1% /var/lib/vz
/dev/fuse                       30M   24K   30M   1% /etc/pve
/dev/cciss/c0d6p1              132G   48G   85G  36% /var/lib/ceph/osd/ceph-9
/dev/cciss/c0d2p1              132G   30G  103G  23% /var/lib/ceph/osd/ceph-1
/dev/cciss/c0d4p1              132G   25G  108G  19% /var/lib/ceph/osd/ceph-7
/dev/cciss/c0d5p1              132G   21G  112G  16% /var/lib/ceph/osd/ceph-8
/dev/cciss/c0d7p1              132G   29G  104G  22% /var/lib/ceph/osd/ceph-10
/dev/cciss/c0d3p1              132G   24G  108G  19% /var/lib/ceph/osd/ceph-2
/dev/cciss/c0d1p1              132G   23G  109G  18% /var/lib/ceph/osd/ceph-0


root@pve2:~# df -h
Filesystem                     Size  Used Avail Use% Mounted on
udev                            10M     0   10M   0% /dev
tmpfs                         1000M  492K  999M   1% /run
/dev/mapper/pve-root            34G  2.0G   30G   7% /
tmpfs                          5.0M     0  5.0M   0% /run/lock
tmpfs                          2.0G   59M  1.9G   3% /run/shm
/dev/mapper/pve-data            77G  180M   77G   1% /var/lib/vz
/dev/fuse                       30M   24K   30M   1% /etc/pve
/dev/cciss/c0d5p1              132G   30G  102G  23% /var/lib/ceph/osd/ceph-12
/dev/cciss/c0d3p1              132G   22G  111G  17% /var/lib/ceph/osd/ceph-6
/dev/cciss/c0d6p1              132G   18G  115G  13% /var/lib/ceph/osd/ceph-9
/dev/cciss/c0d7p1              132G   20G  112G  16% /var/lib/ceph/osd/ceph-13
/dev/cciss/c0d2p1              132G   20G  112G  15% /var/lib/ceph/osd/ceph-4
/dev/cciss/c0d4p1              132G   23G  109G  18% /var/lib/ceph/osd/ceph-11
/dev/cciss/c0d1p1              132G   19G  114G  14% /var/lib/ceph/osd/ceph-3


How can I fix this problem so that I can use the hard disk on pve1 (OSD ID 9)?

The command "pveceph destroyosd 9" (on pve1) doesn't work:

Code:
root@pve1:~# pveceph destroyosd 9
osd is in use (in == 1)
root@pve1:~#
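
For reference, before removing anything one can also check which host the cluster itself maps osd.9 to (just a rough sketch; "ceph osd find" only exists on newer Ceph releases):

Code:
# ask the cluster where it thinks osd.9 lives (host and address)
ceph osd find 9

# check whether an osd daemon with id 9 is actually running on this node
ps aux | grep '[c]eph-osd.*-i 9'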

Has anyone had the same "problem" in the past?

Thanks in advance,

roman
 
Hi to all!

I have a three-node cluster with Ceph storage. It works perfectly now.
There are three servers, each with 8 hard disks: 1 for Proxmox, 7 for Ceph storage.
Now my problem. Normally, an OSD ID exists only once in the whole three-node cluster. In the past I replaced some hard drives and formatted them properly, and now there are two OSDs with the same OSD ID, but only the one on pve2 is shown in the Proxmox GUI.
On pve1 and pve2 there is the same OSD ID 9
...
Hi,
and which osd.9 is the running one?

Can you post the output of the following commands?
Code:
ceph osd tree
ceph auth get-key osd.9

# on both nodes
cat /var/lib/ceph/osd/ceph-9/keyring
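
# a rough way to compare them directly (sketch; assumes bash and the default paths above) -
# the node whose local keyring matches the cluster key holds the "real" osd.9
diff <(ceph auth get-key osd.9; echo) <(awk '/key/ {print $3}' /var/lib/ceph/osd/ceph-9/keyring)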
Udo
 
Hi Udo, thanks for your reply!

osd.9 is running on pve2.

Here are the outputs:

Code:
root@pve1:~# ceph osd tree
# id    weight  type name       up/down reweight
-1      2.59    root default
-2      0.7799          host pve1
0       0.13                    osd.0   up      1
1       0.13                    osd.1   up      1
2       0.13                    osd.2   up      1
7       0.13                    osd.7   up      1
8       0.13                    osd.8   up      1
10      0.13                    osd.10  up      1
-3      0.9099          host pve2
3       0.13                    osd.3   up      1
4       0.13                    osd.4   up      1
6       0.13                    osd.6   up      1
11      0.13                    osd.11  up      1
12      0.13                    osd.12  up      1
9       0.13                    osd.9   up      1
13      0.13                    osd.13  up      1
-4      0.9             host pve3
5       0.9                     osd.5   up      1




root@pve2:~# ceph osd tree
# id    weight  type name       up/down reweight
-1      2.59    root default
-2      0.7799          host pve1
0       0.13                    osd.0   up      1
1       0.13                    osd.1   up      1
2       0.13                    osd.2   up      1
7       0.13                    osd.7   up      1
8       0.13                    osd.8   up      1
10      0.13                    osd.10  up      1
-3      0.9099          host pve2
3       0.13                    osd.3   up      1
4       0.13                    osd.4   up      1
6       0.13                    osd.6   up      1
11      0.13                    osd.11  up      1
12      0.13                    osd.12  up      1
9       0.13                    osd.9   up      1
13      0.13                    osd.13  up      1
-4      0.9             host pve3
5       0.9                     osd.5   up      1


root@pve1:~# ceph auth get-key osd.9
AQDUXRFVaL0LGhAAExLD+Q4Fm9kOKt4RdC3F9A==root@pve1:~#

root@pve2:~# ceph auth get-key osd.9
AQDUXRFVaL0LGhAAExLD+Q4Fm9kOKt4RdC3F9A==root@pve2:~#


root@pve1:~# cat /var/lib/ceph/osd/ceph-9/keyring
[osd.9]
        key = AQC3iv1UIE3XMxAAyeH+eFnPKrHR/6h/iyLpyg==
root@pve1:~#


root@pve2:~# cat /var/lib/ceph/osd/ceph-9/keyring
[osd.9]
        key = AQDUXRFVaL0LGhAAExLD+Q4Fm9kOKt4RdC3F9A==
root@pve2:~#





That's correct about pve3, it's only a PC with two HDDs; one for Proxmox, one for Ceph.


Why is the keyring on pve1 different?


Roman
 
Hi,
the key is different because the osd.9 directory on pve1 isn't in use (I guess you used that disk on a first try as osd.9).

Simply wipe that disk and reuse it as a new OSD (it will become osd.14).

like
Code:
# on pve1 (the node where the unused ceph-9 is mounted)
# look which device is the unused ceph-9
mount | grep ceph-9

umount /var/lib/ceph/osd/ceph-9

sgdisk --zap /dev/DEVICE-OF-CEPH-9  # like sgdisk --zap /dev/sdh
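
# afterwards the wiped disk can be re-added as a new OSD, e.g. (sketch - the device name
# below is taken from the pve1 df output above; double-check with the mount command first)
pveceph createosd /dev/cciss/c0d6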
Udo
 
