[SOLVED] Ceph 14.2.3 / 14.2.4

I keep refreshing to see this version popup as well, so here is an upbump! :)
With that said, 14.2.4 was released today fixing issues with ceph-volume in 14.2.3, so I do appreciate the lag between upstream releases and Proxmox releases!
We just uploaded Ceph 14.2.4 to our Ceph test repository for Proxmox VE 6; the mentioned issue should be fixed there.

Depending on the feedback we gather ourselves and see from the community, we plan to move this release to the main repo in roughly 5 to 10 days, as an estimate. Ceph releases are always a bit of work, as it's a big project that still has a relatively high change rate, so testing would definitely be appreciated.
Great news! I just checked, though, and it unfortunately doesn't appear to have replicated to the mirrors yet...

For those affected by this:
One can generally mitigate it by simply restarting the downed Ceph OSD processes. Systemd will have already tried this numerous times and failed, though, so the service is often left in a failed (locked) state.

Either restart the node or simply clear the lock and start the problematic OSD:
systemctl reset-failed;
systemctl restart ceph-osd@40;
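If several OSDs are down, the two commands above can be generalized. A small sketch (assuming the standard ceph-osd@<id>.service unit names) that clears and restarts every OSD unit systemd has given up on:

```shell
# List ceph-osd units currently in the "failed" state and restart each one.
# When output is piped, systemctl omits its legend and decorations, so the
# unit name is the first whitespace-separated field of each line.
failed=$(systemctl list-units 'ceph-osd@*' --state=failed --no-legend 2>/dev/null | awk '{print $1}')
for unit in $failed; do
    systemctl reset-failed "$unit"
    systemctl restart "$unit"
done
```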
Apologies, I fat fingered something.

For others:
pico /etc/apt/sources.list.d/ceph.list
  #deb http://download.proxmox.com/debian/ceph-nautilus buster main
  deb http://download.proxmox.com/debian/ceph-nautilus buster test

apt-get update; apt-get dist-upgrade;
systemctl restart ceph.target;
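Worth noting: restarting the Ceph target briefly takes all of the node's daemons down, so it's common to set the noout flag first so the restarting OSDs aren't marked out and rebalanced (which is why the status output further down shows HEALTH_WARN with noout set). A sketch, guarded so it only acts where a ceph client is present:

```shell
# Set noout before restarting Ceph daemons for the upgrade, and unset it
# again once all nodes are done. Guarded to be a no-op without a ceph client.
if command -v ceph >/dev/null 2>&1; then
    ceph osd set noout
else
    echo "ceph client not found; run this on a cluster node"
fi
```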

Upgraded fine and appears to be working so far:
[admin@kvm7f ~]# ceph versions
{
    "mon": {
        "ceph version 14.2.4 (65249672c6e6d843510e7e01f8a4b976dcac3db1) nautilus (stable)": 3
    },
    "mgr": {
        "ceph version 14.2.4 (65249672c6e6d843510e7e01f8a4b976dcac3db1) nautilus (stable)": 3
    },
    "osd": {
        "ceph version 14.2.4 (65249672c6e6d843510e7e01f8a4b976dcac3db1) nautilus (stable)": 60
    },
    "mds": {
        "ceph version 14.2.4 (65249672c6e6d843510e7e01f8a4b976dcac3db1) nautilus (stable)": 3
    },
    "overall": {
        "ceph version 14.2.4 (65249672c6e6d843510e7e01f8a4b976dcac3db1) nautilus (stable)": 69
    }
}
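To confirm the whole cluster is on a single release, it's enough to count the distinct version strings under "overall" in that JSON. A sketch that leans on python3 for the parsing and falls back to "unknown" when no ceph client is available:

```shell
# `ceph versions` emits JSON; "overall" maps each version string to a daemon
# count, so exactly one key means every daemon runs the same release.
distinct=$(ceph versions 2>/dev/null \
    | python3 -c 'import json, sys; print(len(json.load(sys.stdin).get("overall", {})))' 2>/dev/null)
echo "distinct versions across daemons: ${distinct:-unknown}"
```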

Every 2.0s: ceph -s

  cluster:
    id:     cf07a431-7d5c-45df-ab7d-7c92ea7b5cd1
    health: HEALTH_WARN
            noout flag(s) set

  services:
    mon: 3 daemons, quorum kvm7a,kvm7b,kvm7e (age 6m)
    mgr: kvm7b(active, since 6m), standbys: kvm7e, kvm7a
    mds: cephfs:1 {0=kvm7e=up:active} 2 up:standby
    osd: 60 osds: 60 up (since 3m), 60 in (since 4w)
         flags noout

  data:
    pools:   8 pools, 1345 pgs
    objects: 19.97M objects, 74 TiB
    usage:   127 TiB used, 183 TiB / 310 TiB avail
    pgs:     1345 active+clean

  io:
    client:   703 KiB/s rd, 6.5 MiB/s wr, 5 op/s rd, 44 op/s wr
    cache:    0 op/s promote
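The HEALTH_WARN above is only the noout flag from before the upgrade; once all OSDs are back up on 14.2.4, clearing it returns the cluster to HEALTH_OK. A small sketch that only unsets the flag when it is actually what the warning is about:

```shell
# `ceph health` prints e.g. "HEALTH_WARN noout flag(s) set"; only clear the
# flag when that is the reported cause, otherwise just show the status.
status=$(ceph health 2>/dev/null || echo "HEALTH_UNKNOWN")
case "$status" in
    HEALTH_WARN*noout*) ceph osd unset noout ;;
    *) echo "status: $status" ;;
esac
```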
We also upgraded; so far so good.

For a few months we will not use BlueStore for VMs that have data changes, only for VMs that can be restored from weekly backups without data loss.

FileStore is reliable, although latency is higher.

