CEPH inaccurate OSD count displayed

Apr 11, 2019
I have had this issue for a while now, and after upgrading to Proxmox 6 and the new Ceph it is still there.

The problem is that the Ceph display page shows that I have 17 OSDs when I only have 16. It shows the extra one as being down and out. (Side note: I do in fact have one OSD that is down and out due to a failure I have yet to replace; that is OSD 12.)
(screenshot: one.png)

And as you can see here, there is no OSD.15:
(screenshot: two.png)
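
A quick way to double-check that no osd.15 exists anywhere (assuming the default Ceph data directory layout) is to run something like this on each node:

Code:
# Every local OSD has a /var/lib/ceph/osd/ceph-<id> directory by default,
# so a leftover osd.15 would show up here
ls /var/lib/ceph/osd/
# The ceph-osd systemd units known on this node
systemctl list-units 'ceph-osd@*'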

My CRUSH map shows the same thing:
Code:
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54

# devices
device 0 osd.0 class hdd
device 1 osd.1 class hdd
device 2 osd.2 class hdd
device 3 osd.3 class hdd
device 4 osd.4 class hdd
device 5 osd.5 class hdd
device 6 osd.6 class hdd
device 7 osd.7 class hdd
device 8 osd.8 class hdd
device 9 osd.9 class hdd
device 10 osd.10 class hdd
device 11 osd.11 class hdd
device 12 osd.12 class hdd
device 13 osd.13 class hdd
device 14 osd.14 class hdd
device 16 osd.16 class hdd

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host pve2 {
    id -3        # do not change unnecessarily
    id -4 class hdd        # do not change unnecessarily
    # weight 3.636
    alg straw2
    hash 0    # rjenkins1
    item osd.0 weight 0.909
    item osd.1 weight 0.909
    item osd.2 weight 0.909
    item osd.3 weight 0.909
}
host pve3 {
    id -5        # do not change unnecessarily
    id -6 class hdd        # do not change unnecessarily
    # weight 3.638
    alg straw2
    hash 0    # rjenkins1
    item osd.4 weight 0.910
    item osd.5 weight 0.910
    item osd.6 weight 0.910
    item osd.16 weight 0.910
}
host pve4 {
    id -7        # do not change unnecessarily
    id -8 class hdd        # do not change unnecessarily
    # weight 3.638
    alg straw2
    hash 0    # rjenkins1
    item osd.7 weight 0.910
    item osd.8 weight 0.910
    item osd.9 weight 0.910
    item osd.10 weight 0.910
}
host pve1 {
    id -9        # do not change unnecessarily
    id -10 class hdd        # do not change unnecessarily
    # weight 3.636
    alg straw2
    hash 0    # rjenkins1
    item osd.11 weight 0.909
    item osd.12 weight 0.909
    item osd.13 weight 0.909
    item osd.14 weight 0.909
}
root default {
    id -1        # do not change unnecessarily
    id -2 class hdd        # do not change unnecessarily
    # weight 14.549
    alg straw2
    hash 0    # rjenkins1
    item pve2 weight 3.636
    item pve3 weight 3.638
    item pve4 weight 3.638
    item pve1 weight 3.636
}

# rules
rule replicated_rule {
    id 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type host
    step emit
}

# end crush map
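
As far as I understand, the CRUSH map only lists the devices CRUSH knows about, while the GUI count comes from the OSD map, which can still carry a stale entry that is down and out. A way to compare the two (osd.15 being my guess for the stale id, given the gap in the device list above):

Code:
# OSDs as CRUSH sees them; should match the map above
ceph osd tree
# Every id the OSD map still tracks, including down/out leftovers
ceph osd dump | grep '^osd\.'
# Or just the ids
ceph osd ls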

And here is my Ceph config:

Code:
[global]
     auth client required = cephx
     auth cluster required = cephx
     auth service required = cephx
     cluster network = 192.168.1.1/24
     fsid = 8d0cfc48-5345-4d6c-866a-215d03b6e64f
     mon allow pool delete = true
     mon_host = 192.169.1.2 192.168.1.3 192.168.1.4 192.168.1.5
     osd journal size = 5120
     osd pool default min size = 2
     osd pool default size = 3
     public network = 192.168.1.1/24

[client]
         keyring = /etc/pve/priv/$cluster.$name.keyring

[mds.pve1]
     host = pve1
     mds standby for name = pve

[mds.pve3]
     host = pve3
     mds standby for name = pve

[mds.pve4]
     host = pve4
     mds standby for name = pve

[mds.pve2]
     host = pve2
     mds standby for name = pve

How can I correct this to make sure everything is displaying correctly?
 
Post the result of

Code:
ceph osd status
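
If that output (or ceph osd dump) still lists an osd.15 that does not exist on any host, the usual cleanup for a leftover id looks roughly like this; only run it after confirming the id really is an orphan with no disk behind it:

Code:
ceph osd crush remove osd.15   # a no-op here, it is already absent from CRUSH
ceph auth del osd.15           # drop any leftover cephx key for the id
ceph osd rm osd.15             # remove the id from the OSD map
# Nautilus and later can do the same in one step:
# ceph osd purge 15 --yes-i-really-mean-it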
 
