Ceph issues - version and monitor

vivekdelhi

New Member
Sep 13, 2019
Hello all, I am just getting my feet wet with Ceph and Proxmox. I have set up a basic three-node cluster. Each node has two 300 GB HDDs, no RAID. The nodes have hostnames set and mapped to IPs, and NTP has been configured as well.
On /dev/sda, Proxmox has been installed with ZFS.
/dev/sdb was kept free on each node and a Ceph OSD was created on it.
All work was done yesterday and today.
Then I initiated the Ceph installation.
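For context, the equivalent pveceph commands for what was done on each node (whether driven from the GUI or the shell) would be roughly the following; command names are from the current PVE 6 tooling, so treat this as a sketch rather than an exact transcript:

Code:
pveceph install                           # install the Ceph packages on the node
pveceph init --network 192.168.15.0/24    # run once, on the first node, to write ceph.conf
pveceph mon create                        # create a monitor on this node
pveceph osd create /dev/sdb               # turn the spare disk into an OSD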
However, for some strange reason, Ceph version 12.2.11 got installed on node 2, while it is version 14.2.2 on nodes 1 and 3!
Also, there is a mismatch between the monitors shown in the Proxmox GUI and those reported in the shell.
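As I understand it, these are the shell commands for comparing what the cluster itself reports (listed here for reference; they should also work against the 12.2.11 node):

Code:
ceph versions      # Ceph versions of the running daemons, grouped by daemon type
ceph --version     # version of the locally installed packages on this node
ceph mon dump      # monitors as the cluster sees them, to compare with the GUI and mon_host in ceph.conf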
Any suggestions on how to proceed, please?
Thanks
Vivek

Code:
# ceph status 
root@dell0104blade02:~# ceph -s
  cluster:
    id:     09fc106c-d4cf-4edc-867f-db170301f857
    health: HEALTH_OK
 
  services:
    mon: 2 daemons, quorum dell0104blade01,dell0104blade10 (age 2h)
    mgr: dell0104blade01(active, since 4h), standbys: dell0104blade10, dell0104blade02
    osd: 3 osds: 3 up (since 23m), 3 in (since 45m)
 
  data:
    pools:   1 pools, 128 pgs
    objects: 5.63k objects, 22 GiB
    usage:   68 GiB used, 1.0 TiB / 1.1 TiB avail
    pgs:     128 active+clean

[Attachment: Screenshot from 2019-09-20 18-58-31.png]


Code:
# /etc/ceph/ceph.conf and /etc/pve/ceph.conf
[global]
         auth_client_required = cephx
         auth_cluster_required = cephx
         auth_service_required = cephx
         cluster_network = 192.168.15.31/24
         fsid = 09fc106c-d4cf-4edc-867f-db170301f857
         mon_allow_pool_delete = true
         mon_host = 192.168.15.31 192.168.15.204
         osd_pool_default_min_size = 2
         osd_pool_default_size = 3
         public_network = 192.168.15.31/24

[client]
         keyring = /etc/pve/priv/$cluster.$name.keyring

The CRUSH map is as below:

Code:
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54

# devices
device 0 osd.0 class hdd
device 1 osd.1 class hdd
device 2 osd.2 class hdd

# types
type 0 osd
...

# buckets
host dell0104blade01 {
    id -3        # do not change unnecessarily
    id -4 class hdd        # do not change unnecessarily
    # weight 0.272
    alg straw2
    hash 0    # rjenkins1
    item osd.0 weight 0.272
}
host dell0104blade10 {
    id -5        # do not change unnecessarily
    id -6 class hdd        # do not change unnecessarily
    # weight 0.545
    alg straw2
    hash 0    # rjenkins1
    item osd.1 weight 0.545
}
host dell0104blade02 {
    id -7        # do not change unnecessarily
    id -8 class hdd        # do not change unnecessarily
    # weight 0.273
    alg straw2
    hash 0    # rjenkins1
    item osd.2 weight 0.273
}
root default {
    id -1        # do not change unnecessarily
    id -2 class hdd        # do not change unnecessarily
    # weight 1.090
    alg straw2
    hash 0    # rjenkins1
    item dell0104blade01 weight 0.272
    item dell0104blade10 weight 0.545
    item dell0104blade02 weight 0.273
}

# rules
rule replicated_rule {
    id 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type host
    step emit
}

# end crush map
 
I would do the following:

1. make backups (and make sure they work)
2. remove all services/OSDs from the node with 12.2.11 (see the command sketch below)
3. upgrade Ceph to 14.2.2 on that node
4. recreate the services
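Roughly along these lines, assuming the 12.2.11 node is dell0104blade02 (which carries osd.2 and a standby mgr according to your output) and that its Ceph repository already points at Nautilus; command names are from PVE 6, so double-check the flags against your version before running anything:

Code:
# drain and remove the OSD on the affected node
ceph osd out osd.2                      # wait until all PGs are active+clean again
systemctl stop ceph-osd@2.service
pveceph osd destroy 2 --cleanup         # remove the OSD and clean up the disk
pveceph mgr destroy dell0104blade02     # remove the standby manager on this node

# upgrade the Ceph packages (repo is usually /etc/apt/sources.list.d/ceph.list)
pveceph install                         # installs the 14.2.x packages on PVE 6
systemctl restart ceph.target

# recreate the services
pveceph mgr create
pveceph osd create /dev/sdb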
 
Thank you for the suggestions. I did that, and now all versions are consistent.